04-27-2010, 08:56 PM
Douglas McClendon

semop failed for cookie?

Hi,

I have a user of an installation tool of mine that is hitting this
message, with a very recent pre-fedora-13 kernel.


zyx-liveinstaller-cli: creating temporary rootfs virtual duplicate
device-mapper: resume ioctl failed: Invalid argument
semid 65536: semop failed for cookie 0xd4d423c: incorrect semaphore state
Failed to set a proper state for notification semaphore identified by
cookie value 223167036 (0xd4d423c) to initialize waiting for incoming
notifications.

Command failed
zyx-liveinstaller-cli: error: failed to create temporary rootfs virtual
duplicate


The error is happening at line 742 in this monster bash script (which
I'm pretty sure does work on f11 and f12)


http://filteredperception.org/dawg/projects/zyx-liveinstaller/src/zyx-liveinstaller-latest/rli/zyx-liveinstaller-cli.html

http://filteredperception.org/dawg/projects/zyx-liveinstaller

Now, I've already urged the sugar on a stick project to primarily pursue
the path of the standard fedora livecd/usb installer, for many
additional reasons. But, I guess I would still like my rebootless
installer to actually work. And this may be one of those great corner
case issues that developers love. I.e. a very obscure tickling of some
code path. Or maybe I'm just being too lazy to read the accompanying
dmsetup man page for some clue as to what I should be doing when I see
such a bizarre error message. Or, some other bug in the script is
causing the arguments to become malformed. Still, the crypticness of
the message from devicemapper suggests something you folks might want to
improve. In any event, FWIW, there it is...


-dmc

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
04-27-2010, 10:33 PM
Alasdair G Kergon

On Tue, Apr 27, 2010 at 03:56:57PM -0500, Douglas McClendon wrote:
> I have a user of an installation tool of mine that is hitting this
> message, with a very recent pre-fedora-13 kernel.

udev is now involved in this process.
Check they have up-to-date lvm2 and udev packages and that they've not
tried to customise their udev rules - if they have, you'll need to
check their changes didn't break things.

Big script.

Debug it by adding lines to dump the state immediately before the problem
command, then immediately after it.

Dump state by running 'dmsetup info -c', 'dmsetup table', 'dmsetup status'
and 'dmsetup udevcookies'.

If that still doesn't help, break the 'dmsetup create' command down into
its three constituent commands (dmsetup create --notable, dmsetup load,
dmsetup resume) and dump the state between each of them and confirm
which is failing.
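
The debugging steps above can be sketched as a pair of shell functions. This is a minimal sketch, not taken from the script under discussion: the function names `dump_dm_state` and `create_in_steps` and the `DMSETUP` override variable are hypothetical, while the dmsetup subcommands are the ones named above.

```shell
#!/bin/sh
# Sketch only. DMSETUP can be overridden (e.g. DMSETUP=echo) for a dry run.
DMSETUP=${DMSETUP:-dmsetup}

# Dump device-mapper state using the four commands suggested above.
dump_dm_state() {
    label=$1
    echo "=== dm state: $label ==="
    $DMSETUP info -c
    $DMSETUP table
    $DMSETUP status
    $DMSETUP udevcookies
}

# Same effect as 'dmsetup create NAME --table TABLE', split into its three
# constituent steps. Stops at the first failing step and reports which one.
create_in_steps() {
    name=$1
    table=$2
    dump_dm_state "before create --notable"
    $DMSETUP create "$name" --notable || { echo "FAILED: create --notable" >&2; return 1; }
    dump_dm_state "after create --notable"
    $DMSETUP load "$name" --table "$table" || { echo "FAILED: load" >&2; return 2; }
    dump_dm_state "after load"
    $DMSETUP resume "$name" || { echo "FAILED: resume" >&2; return 3; }
    dump_dm_state "after resume"
}
```

Calling `create_in_steps rootfs-copy "$table"` in place of the single failing `dmsetup create` line would show which of the three steps returns the error, with a state dump on either side of it.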

Alasdair

 
04-28-2010, 03:52 AM
Douglas McClendon

On 04/27/2010 05:33 PM, Alasdair G Kergon wrote:
> On Tue, Apr 27, 2010 at 03:56:57PM -0500, Douglas McClendon wrote:
> > I have a user of an installation tool of mine that is hitting this
> > message, with a very recent pre-fedora-13 kernel.
>
> udev is now involved in this process.
> Check they have up-to-date lvm2 and udev packages and that they've not
> tried to customise their udev rules - if they have, you'll need to
> check their changes didn't break things.


Thanks for the reply and the advice. I'm not so interested in the issue
that I'll necessarily get to it very soon, but what you said will no
doubt help.


One thing though, which may not have been obvious, and almost sounds
dubious, is what I'm actually doing there. I'll try to describe it in
words here, to see if this shouts out as a situation that may no longer
be expected to work (because honestly I was probably pretty pleased and
half-surprised to discover that it did work 3 years ago)-


Basically the livecd mode you should be familiar with. ext3 image on a
loop device. cow file in tmpfs on a loop device. Combined with
dm-snapshot, resulting in what is used as the rootfs device. Simple enough.


So what I do (and this is dusty code I haven't paid attention to in a
long time, so maybe I'm misunderstanding my own code, but probably not)
is this-


1) With the dmsetup create that is now failing, I first create a
duplicate (different name, same table) of the device that is the
rootfs, i.e. another snapshot device with the same components/table.


2) I use a reload --table on the device that is the rootfs, to replace
it with a new table, that is a mirror of the device created in (1) and
the target normal hard disk partition that the script is installing the
OS to.


3) I do a resume on the rootfs device such that the new table with the
mirror activates, and the migration starts to occur


4) when the mirror completes, I do another reload then resume with a new
linear table pointing to the newly installed fs on normal disk
partition. Then I tear down all the unused original devices.


So, if something about this description screams out- the new udev
semantics will prevent (1) from working, let me know.





> Big script.
>
> Debug it by adding lines to dump the state immediately before the problem
> command, then immediately after it.
>
> Dump state by running 'dmsetup info -c', 'dmsetup table', 'dmsetup status'
> and 'dmsetup udevcookies'.
>
> If that still doesn't help, break the 'dmsetup create' command down into
> its three constituent commands (dmsetup create --notable, dmsetup load,
> dmsetup resume) and dump the state between each of them and confirm
> which is failing.


Sounds good. Again, what I'm doing with two devices with the same table
smells like something that might have been inadvertently allowed before
and now not. Or maybe other people do it all the time for other reasons
I'm not considering right now.


-dmc




 
04-28-2010, 09:38 AM
Peter Rajnoha

On 04/27/2010 10:56 PM, Douglas McClendon wrote:
> zyx-liveinstaller-cli: creating temporary rootfs virtual duplicate
> device-mapper: resume ioctl failed: Invalid argument
> semid 65536: semop failed for cookie 0xd4d423c: incorrect semaphore state
> Failed to set a proper state for notification semaphore identified by
> cookie value 223167036 (0xd4d423c) to initialize waiting for incoming
> notifications.

Well, the primary cause is that "resume" ioctl that is failing
(can you trace the exact parameters that are substituted in the
script for that failing dmsetup call?). I think the errors printed
afterwards are just an outcome of this failure.

Anyway, it seems that our internal "_udev_complete" fn is called more
than once on some error path. This call is exactly the same as calling
"dmsetup udevcomplete", but we have to call one internally if any error
occurs while processing a device-mapper task (that generates udev events).
That's because we can't await any notification for failed ioctls since
no udev events will be generated. We need to do that to prevent
infinite waiting for notifications that will never come.

If that internal _udev_complete is called more than necessary, we'll
get into an improper state with the semaphore so that needs to be
fixed!
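
To make the failure mode concrete, here is a toy model in shell. It is only an illustration under stated assumptions: a plain integer stands in for the real SysV semaphore, the initial value of 1 and the "waiter decrements once before waiting for zero" step are assumptions about how the cookie is handled, and all function names are made up for this sketch.

```shell
#!/bin/sh
# Toy model only: an integer stands in for the semaphore behind a udev cookie.
# Names and the starting value are illustrative assumptions.
sem=0

sem_create() { sem=1; }                # assumed initial value
sem_inc()    { sem=$((sem + 1)); }     # one increment per task expecting a udev event
sem_dec() {                            # non-blocking decrement, like a "complete" call
    if [ "$sem" -le 0 ]; then
        echo "semop failed: incorrect semaphore state"
        return 1
    fi
    sem=$((sem - 1))
}

# Simulate one task whose ioctl fails. $1 = number of internal "complete" calls.
simulate_failed_task() {
    sem_create
    sem_inc
    i=0
    while [ "$i" -lt "$1" ]; do
        sem_dec || return 1
        i=$((i + 1))
    done
    # Assumed: before waiting, the waiter decrements once, then expects zero.
    sem_dec || return 1
    [ "$sem" -eq 0 ] && echo "wait finished cleanly"
}
```

In this model `simulate_failed_task 1` ends cleanly, while `simulate_failed_task 2` (the extra internal call) leaves nothing for the waiter to decrement and reports an incorrect semaphore state, which is the shape of the error in the original report.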

What's the exact version of dmsetup/lvm2 used? Also, in addition to
Alasdair's hints in the other post, could you please run the failing
dmsetup with verbose output "dmsetup -vv ...". This way we should
see how the semaphore is handled throughout processing.
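
One way to capture both pieces of information is a small wrapper in the calling script. This is a sketch under assumptions: `trace_dmsetup`, `report_versions` and the `DMSETUP` override variable are hypothetical names, not part of the installer script or of dmsetup itself.

```shell
#!/bin/sh
# Sketch only. DMSETUP can be overridden (e.g. DMSETUP=echo) for a dry run.
DMSETUP=${DMSETUP:-dmsetup}

# Print the tool's own version report.
report_versions() {
    $DMSETUP version
}

# Log the exact arguments after shell substitution to stderr,
# then run the same command with -vv for verbose output.
trace_dmsetup() {
    echo "+ $DMSETUP -vv $*" >&2
    $DMSETUP -vv "$@"
}
```

Replacing the failing `dmsetup ...` call in the script with `trace_dmsetup ...` would show the substituted parameters on stderr next to the verbose output.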

Peter

 
04-28-2010, 11:11 PM
Douglas McClendon

On 04/27/2010 05:33 PM, Alasdair G Kergon wrote:
> On Tue, Apr 27, 2010 at 03:56:57PM -0500, Douglas McClendon wrote:
> > I have a user of an installation tool of mine that is hitting this
> > message, with a very recent pre-fedora-13 kernel.
>
> udev is now involved in this process.
> Check they have up-to-date lvm2 and udev packages and that they've not
> tried to customise their udev rules - if they have, you'll need to
> check their changes didn't break things.
>
> Big script.
>
> Debug it by adding lines to dump the state immediately before the problem
> command, then immediately after it.


Actually, I just grabbed the latest soas nightly livecd build, which for
these purposes should presumably be considered the same as rawhide.


I tried manually to do what I described. I.e. make a duplicate (same
table, different name) snapshot device.


Interestingly, I'm not seeing the semop cookie thing, but now after the
'resume ioctl failed' message, I checked dmesg, and I'm seeing-


device-mapper: snapshots: Unable to perform snapshot handover until
source is suspended.


Also, this is under virtualization, which, as with other fedora dev
builds I've seen, runs bizarrely slowly. I.e. I had a couple text root
logins timeout because it didn't finish whatever it needed to finish in
60 seconds. And while booting I saw dozens of weird udev failure
messages. But I'm thinking that may have nothing to do with the issue,
and hoping the above message elicits an explanation. I.e. is what I'm
doing somehow inadvertently utilizing the new snapshot merging semantics
even though it wasn't before, and for my purposes shouldn't?


-dmc




 
04-29-2010, 12:00 AM
Alasdair G Kergon

On Wed, Apr 28, 2010 at 06:11:46PM -0500, Douglas McClendon wrote:
> device-mapper: snapshots: Unable to perform snapshot handover until
> source is suspended.

It has never been OK to have the same snapshot metadata in use
in two targets simultaneously (because of caching in memory).
It's the responsibility of userspace to adhere to the correct semantics
or live with the potential data corruption if they are violated. It
sounds like your process may fall into that second category.

Part of the process of adding snapshot merging support involved
providing a controlled method for handing over the snapshot metadata
from one target instance to another.

If you are trying to move a snapshot from one target to another, then
you must either deactivate the snapshot first (older kernels) or (newer
kernels) make use of the 'snapshot handover' mechanism as the message
suggests.
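
Going by the kernel message alone, the ordering for the newer-kernel path might look like the sketch below. It is not a tested procedure: the function name `handover_snapshot` and the `RUN` variable are hypothetical, the commands are only printed by default, and what to do with the source device afterwards (remove it or load a different table) is deliberately left out.

```shell
#!/bin/sh
# Dry run by default: RUN=echo only prints the commands.
# Setting RUN to the empty string would execute them for real.
RUN=${RUN-echo}

# handover_snapshot SRC DEST TABLE
# Move a snapshot from existing device SRC to a new device DEST that is
# given the same snapshot table.
handover_snapshot() {
    src=$1
    dest=$2
    table=$3
    $RUN dmsetup create "$dest" --notable        # destination with no table yet
    $RUN dmsetup load "$dest" --table "$table"   # same snapshot table as SRC
    $RUN dmsetup suspend "$src"                  # source must be suspended first
    $RUN dmsetup resume "$dest"                  # handover is expected at this point
}
```

The only ordering constraint this relies on is the one the message states: the source is suspended before the destination is resumed.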

Alasdair

 
04-29-2010, 03:32 AM
Douglas McClendon

On 04/28/2010 07:00 PM, Alasdair G Kergon wrote:
> On Wed, Apr 28, 2010 at 06:11:46PM -0500, Douglas McClendon wrote:
> > device-mapper: snapshots: Unable to perform snapshot handover until
> > source is suspended.
>
> It has never been OK to have the same snapshot metadata in use
> in two targets simultaneously (because of caching in memory).
> It's the responsibility of userspace to adhere to the correct semantics
> or live with the potential data corruption if they are violated. It
> sounds like your process may fall into that second category.


Yeah, I had a hunch I was 'getting away with something'. I.e. I wasn't
acutely aware of the 3 distinct phases of the create. In my
case, literally nothing would happen while both instances were 'live',
except the handoff. And it seemed to work very reliably. I.e. I would do


dmsetup create --table="same as rootfs's table (a snapshot)" rootfs-copy
dmsetup reload --table="mirror, goodside rootfs-copy, badside harddisk" rootfs
dmsetup resume rootfs

I guess I'll have to learn the new snapshot handover stuff. Just let me
know if you suspect some impossibility here. Or if there is a magic
--live-dangerously flag I can use


-dmc




 
04-29-2010, 04:23 PM
Alasdair G Kergon

On Wed, Apr 28, 2010 at 10:32:54PM -0500, Douglas McClendon wrote:
> case, literally nothing would happen while both instances were 'live',

If you had no data written to the snapshot or origin while both
were loaded, you might have got away with it. But I think the new handover
code gives you a safe and supported mechanism now.

Alasdair

 
05-03-2010, 01:36 AM
Douglas McClendon

On 04/29/2010 11:23 AM, Alasdair G Kergon wrote:
> On Wed, Apr 28, 2010 at 10:32:54PM -0500, Douglas McClendon wrote:
> > case, literally nothing would happen while both instances were 'live',
>
> If you had no data written to the snapshot or origin while both
> were loaded, you might have got away with it. But I think the new handover
> code gives you a safe and supported mechanism now.
>
> Alasdair


I've got a "BUG" for you-

So, I tried on a nightly soas livecd iso build, booted under qemu
(should be the same for a rawhide or fedora13 beta i386 livecd iso),
booted to runlevel 1 for simplicity's sake.


I tried to suspend the snapshot device holding the rootfs. The idea was
to then resume the newly created copy of that device (same table, new
name), load into the rootfs device a new table that mirrors the copy
device and the destination partition, and finally resume the rootfs
device.


But the instant I suspend the snapshot device containing the rootfs
(dmsetup suspend live-rw), I get-


BUG: lock held when returning to user space!

dmsetup/865 is leaving the kernel with locks still held!
1 lock held by dmsetup/865:
#0: (&journal->j_barrier){+.+...+}, at: [<c056b84d>] jbd2_journal_lock_updates+0xbd/0xc5

--------------- (manually transcribed) ------------

Now, my first guess as to how to proceed would be to try to get a
statically linked dmsetup copied into a tmpfs. Which, given the
particular target and my lack of enthusiasm for all of this, may take me
some time to try. Any other advice? Note, I could craft some manual
dmsetup commands to reproduce what I'm trying to do, that would apply to
any fedora-13/soas livecd iso. But for the sake of argument, let's
pretend that all I want to do is to run (dmsetup suspend live-rw ;
dmsetup resume live-rw) and have the system not fall over dead.


Also, to reiterate, I got that message in dmesg about a problem with
the snapshot handover while trying to use my previously working, but
never guaranteed, method. But note that the method does not involve
snapshot merging at all, which, from the documentation I found (perhaps
I didn't look in all the right places), is the only context in which
snapshot handover is mentioned.


Note, I'm not complaining, as this is very low priority for me, but
rather just doing my best to explain the issue.


-dmc

