David Zeuthen 03-19-2010 01:59 PM

Send KOBJ_ADD event after dm resume ioctl.
 
Hey,
On Fri, Mar 19, 2010 at 10:34 AM, Alasdair G Kergon <agk@redhat.com> wrote:

> On Fri, Mar 19, 2010 at 09:58:00AM -0400, David Zeuthen wrote:
> > I think the problem is that the fact that 3rd party user space
> > opens the device before it is ready (e.g. just after ADD but before
> > the first CHANGE) makes things fall over.
> >
> > This short-coming is what needs to get fixed, I think - it's very
> > fragile this way and since any random user / package can add
> > rules to open the device on add events, said user / package can
> > make device-mapper fail. Which doesn't exactly strike me
> > as robust behavior.
>
> And we suggested two potential solutions:
>
> 1 - change the kernel so the ADD event doesn't arrive until the device is
> ready for use.  [plus equivalent change for REMOVE]
>
>     Advantage: the dm device handling looks more like a real disk so we have
>     less 'special case' code. /dev then only indexes "dm devices ready to be used"
>     rather than "dm devices registered in the kernel"
>
>     Disadvantage: breaks the currently-simple kobject/sysfs/dev linkage (as
>     per Kay's earlier mail)
>
> 2 - several changes to the way udev rules are handled so we can choose to
> ignore events and make no changes to /dev, so we can override rules other
> packages insert without requiring dm-specific checks adding to them all, and
> probably some of the other things we've discussed on these various threads.

I don't think it's realistic to assume that user space will read and honor
something like a DM_UDEV_DISABLE_DM_RULES_FLAG property - while
we can make the udev package and other "core" OS packages do this, rules
from users, sites and random 3rd party packages will open() the device
on "add" (and, if properly written, also on "change").

It's much better to make that operation somehow gracefully fail in the
window where the device isn't configured. My understanding is that
we are not doing this today - for example, suppose I have this udev
rule

  SUBSYSTEM=="block", ACTION="" IMPORT{program}="foo-check-for-some-signature $tempnode"

where foo-check-for-some-signature is some program to check for
a fs signature (say, for a proprietary fs on a portable music
player device) and, if so, set the FOO_* properties with specifics
about the device (these will be used in the UI to control/manage
the portable music player).

Things like this should work for any block device, no matter what
state it's in. Sure, for device-mapper block devices open() or
read() on the device may fail in the window between "add" and
"change" but that's fine.

What can never happen though, is that this configuration of the
device-mapper device somehow randomly fails because the
program foo-check-for-some-signature tries to open every block
device.

(Sure, user space can be _clever_ and save the fork+exec+open
by checking DM_UDEV_DISABLE_DM_RULES_FLAG from a
udev rule - but that's optimization, not something required for
correctness.)
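
As an illustration of the "fail gracefully" behaviour described above, here is a minimal Python sketch of what a helper like foo-check-for-some-signature might do. The magic bytes, the FOO_FS property name and the function names are made up for illustration and do not come from any real package. The only behaviour taken from the discussion is that a failed open() or read() is treated as "nothing found" rather than as an error.

```python
import os
import sys

# Hypothetical magic bytes and property name, for illustration only.
SIGNATURE = b"FOOFS"
PROPERTY = "FOO_FS"


def probe_signature(device_path, offset=0):
    """Return FOO_* properties for the device, or {} if it cannot be probed."""
    try:
        fd = os.open(device_path, os.O_RDONLY)
    except OSError:
        # open() failed, e.g. the device is not configured yet.
        return {}
    try:
        data = os.pread(fd, len(SIGNATURE), offset)
    except OSError:
        # read() failed; treat it the same way as "no signature".
        return {}
    finally:
        os.close(fd)
    if data != SIGNATURE:
        return {}
    return {PROPERTY: "1"}


def main(argv):
    if len(argv) != 2:
        return 2
    # Print KEY=value lines, the form a rule's IMPORT{program} would consume.
    for key, value in probe_signature(argv[1]).items():
        print(f"{key}={value}")
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

A helper written this way prints nothing and exits 0 when run in the window between "add" and "change", and reports the properties when it runs again on the later "change" uevent.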

I haven't checked if this problem still exists with device-mapper
but I know, in the past, that it has - and IIRC it was the reason
that you introduced DM_UDEV_DISABLE_DM_RULES_FLAG
after the udev ignore_device directive was removed.

I'd like to reiterate that it's actually not a problem that the sequence is

 - "add" uevent
 - device is not usable, access to the device fails gracefully
 - "change" uevent
 - device usable, blkid on the device etc. works

the point really is that you have to accept that there will exist
user space programs that do things on the device between
"add" and the initial "change" uevent.

The other problem, the assumption that "change" uevents only
originate from libdevmapper, is separate from this problem.

     David

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

David Zeuthen 03-19-2010 02:14 PM

On Fri, Mar 19, 2010 at 11:05 AM, Peter Rajnoha <prajnoha@redhat.com> wrote:

> As for the variables, we already carry some important
> information within uevents.
>
> Generally, it carries hints for udev rules to instruct them
> how they should be applied correctly, which parts should be
> run based on the type of the device, its real meaning with all
> relations to other devices taken into account within that
> DM subsystem used (e.g. LVM2's snapshots, mirrors...).
>
> Most of this information is really not suitable to be stored
> as a sysfs attribute since it deals with userspace notions,
> an abstraction layer above device-mapper...

Presumably this information originates from user space when
setting up the device-mapper device, right? Why can't you
simply store it in, say, /var/run/device-mapper?

(Or, better, store it in /dev/.device-mapper/ to avoid hitting
the real disk - /dev is guaranteed to be on tmpfs)
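
The suggestion above could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: one file per device, a KEY=value format, and the helper names store_properties and load_properties are all hypothetical, and the state directory is passed in by the caller rather than hard-coded.

```python
import os
import tempfile


def store_properties(state_dir, dm_name, props):
    """Atomically write KEY=value lines for one device into state_dir."""
    os.makedirs(state_dir, exist_ok=True)
    final_path = os.path.join(state_dir, dm_name)
    # Write to a temporary file in the same directory, then rename it into
    # place so that a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, prefix="." + dm_name + ".")
    with os.fdopen(fd, "w") as f:
        for key in sorted(props):
            f.write(f"{key}={props[key]}\n")
    os.replace(tmp_path, final_path)
    return final_path


def load_properties(state_dir, dm_name):
    """Read the properties back; a missing file means 'no information'."""
    path = os.path.join(state_dir, dm_name)
    try:
        with open(path) as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        return {}
    return dict(line.split("=", 1) for line in lines if "=" in line)
```

The tool that sets up the device would call store_properties() before resuming it, and a rule helper would call load_properties() when handling the uevent, for example to look up DM_UDEV_DISABLE_DM_RULES_FLAG.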

Thanks,
David

Mike Snitzer 03-19-2010 02:55 PM

Hi David,

On Fri, Mar 19 2010 at 9:58am -0400,
David Zeuthen <zeuthen@gmail.com> wrote:

> Hey Kay,
>
> On Fri, Mar 19, 2010 at 9:43 AM, Kay Sievers <kay.sievers@vrfy.org> wrote:
>
> > On Fri, Mar 19, 2010 at 14:24, Peter Rajnoha <prajnoha@redhat.com> wrote:
> > > On 03/19/2010 10:24 AM, Kay Sievers wrote:
> > >> No, that's what "change" is for, and we already have these "change"
> > >> events for dm. Udev does not care if the device is ready or not, it
> > >> synchronizes /sys and /dev, and that works just fine with "change"
> > >> events.
> > >
> > > CHANGE events, not quite... We can't even rely on these.
> > >
> > > Just to mention, there's also a CHANGE event generated when
> > > read-only flag is set for a device (this is not managed by
> > > device-mapper of course). This one is generated even before
> > > the actual CHANGE event that is generated when DM device is
> > > ready to be used.
> >
> > Sure, but as mentioned earlier, these events are just expected to
> > fail, and update the current udev state, if they can't retrieve the
> > needed information or find out that the device is not usable.
> >
>
> I think the problem is that the fact that 3rd party user space
> opens the device before it is ready (e.g. just after ADD but before
> the first CHANGE) makes things fall over.
>
> This short-coming is what needs to get fixed, I think - it's very
> fragile this way and since any random user / package can add
> rules to open the device on add events, said user / package can
> make device-mapper fail. Which doesn't exactly strike me
> as robust behavior.

When I first read this response I thought we had a major break-through,
namely: udev allowing udev rules to race with the tool that is making
the device usable was not "robust behavior".

But your 2nd mail in this thread established that I had wishful thinking
on that so-called break-through.

At least we agree that these uevents are causing DM to race against
arbitrary udev rules; which leads to sporadic failures.

I think I understand udev's utopian intent to have all udev rules be
able to do as they wish with any device: said access should "fail
gracefully" on devices that aren't ready.

Thing is, this isn't scalable at all. Having all these arbitrary rules
issuing IOs to devices that aren't usable is a complete waste of time.
On enterprise systems that have 100s (*shudder* 1000s) of LUNs, this
udev rules' freedom to access such unusable devices is really working
against us (if the goal is to activate devices as quickly and reliably
as possible).

We at least need a way to _reliably_ allow DM to do its work of managing
its devices. What if udev were to offer a per device "udev rules lock"
(exposed via sysfs?) that allows subsystems (e.g. DM) to know they can't
yet proceed with exclusively accessing the device they are tasked with
managing?

This per device "udev rules lock" would at least allow DM to cope with
the racy nature of udev rules. Not ideal as it still allows
inefficient (and unnecessary) access to devices that shouldn't be touched
but it would at least be a means to an end (or so I'd think).
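
The thread does not define how such a lock would work, so the following Python sketch is only one possible reading of the idea. It assumes a lock file per device in a caller-supplied directory (not sysfs), with rules taking a shared flock() while they touch the device and the managing subsystem taking an exclusive one. The directory layout and all function names are hypothetical.

```python
import fcntl
import os


def _open_lock(lock_dir, dev_name):
    """Open (creating if needed) the hypothetical per-device lock file."""
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, dev_name + ".lock")
    return os.open(path, os.O_RDWR | os.O_CREAT, 0o600)


def rule_acquire(lock_dir, dev_name):
    """Called by a rule before it touches the device: take a shared lock."""
    fd = _open_lock(lock_dir, dev_name)
    fcntl.flock(fd, fcntl.LOCK_SH)
    return fd


def try_exclusive(lock_dir, dev_name):
    """Called by the managing subsystem: return an fd holding the exclusive
    lock, or None if some rule still holds the shared lock."""
    fd = _open_lock(lock_dir, dev_name)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

With this arrangement the managing tool can retry try_exclusive() until it succeeds, which gives it something to wait on, but it still depends on every rule cooperating by calling rule_acquire() first.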

Mike

David Zeuthen 03-19-2010 03:01 PM

On Fri, Mar 19, 2010 at 11:24 AM, Alasdair G Kergon <agk@redhat.com> wrote:

> On Fri, Mar 19, 2010 at 10:59:12AM -0400, David Zeuthen wrote:
> > I'd like to reiterate that it's actually not a problem that the sequence is
> >  - "add" uevent
> >  - device is not usable, access to the device fails gracefully
> >  - "change" uevent
> >  - device usable, blkid on the device etc. works
> > the point really is that you have to accept that there will exist
> > user space programs that do things on the device between
> > "add" and the initial "change" uevent.
>
> It is a problem if both:
>     1) something attempts to open the devices in that window
> AND 2) we have no mechanism to wait for it to finish
>
> Also note that if it is permitted to run 'udevadm trigger' at
> any time that also causes a synchronisation problem here: our
> application code doesn't even know there is something conflicting
> that it needs to wait for.  [Even though we want to ignore ADD,
> we may still need to synchronise against it.]
>
> So again, the 'don't issue ADD until device is usable' offers a simple
> way of avoiding this class of problems.

But as Kay said it's a horrible horrible hack.

If you wanted to solve the problem kernel-side I'd expect you to
create a new subsystem for device-mapper (let's call it 'kdm') that
libdevmapper would talk to. Then you'd only present block device
objects exactly when the device-mapper side has been configured.

Then the flow of events could be something like this

 - add device kdm-0 (subsys 'kdm')
 - libdevmapper configures kdm-0
 - add device dm-0 (subsys 'block')

but I don't know if this is better.

> The 'private device
> attribute' we suggested offers another - all would be expected
> to respect a 'private' sysfs attribute.

Sorry, but this is an even worse hack. Mandating that all of user
space (that is: past, present and future) needs to read some
random 'private' attribute in sysfs because of weird life-cycle
issues in the device-mapper implementation... that's not really
workable.

Anyway, I really don't think you can expect user space to behave
sanely so it's not really worth trying.

(I don't think you ever could expect user space to behave sanely,
but I'll note that it's an even bigger problem now that we have
powerful frameworks (such as udev) allowing people to run code
at device discovery time.... I mean, device-mapper has probably
been suffering from these issues from day 1 - it just wasn't visible
earlier on because we didn't have uevents...)

Thanks,
David

David Zeuthen 03-19-2010 03:08 PM

On Fri, Mar 19, 2010 at 11:55 AM, Mike Snitzer <snitzer@redhat.com> wrote:

> Thing is, this isn't scalable at all.  Having all these arbitrary rules
> issuing IOs to devices that aren't usable is a complete waste of time.
> On enterprise systems that have 100s (*shudder* 1000s) of LUNs, this
> udev rules' freedom to access such unusable devices is really working
> against us (if the goal is to activate devices as quickly and reliably
> as possible).

Please, Mike, whether to probe devices or not is a separate discussion
and it's not really helpful to reiterate that discussion.

(Please do see various mailing list archives for that particular discussion;
FWIW, you will notice that what udev is doing is actually desirable. If you
examine the kernel sources, you will find that the kernel itself submits
IO to the block devices - partition table probing for starters. If you search
harder you will find *raw numbers* (instead of speculation like "100s
(*shudder* 1000s)") showing what udev is doing is not a problem...)

     David
