Linux Archive > Redhat > Device-mapper Development
Old 07-11-2011, 10:40 PM
Mike Snitzer
 
block: Check that queue is alive in blk_insert_cloned_request()

[cc'ing dm-devel, vivek and tejun]

On Fri, Jul 8, 2011 at 7:04 PM, Roland Dreier <roland@kernel.org> wrote:
> From: Roland Dreier <roland@purestorage.com>
>
> This fixes crashes such as the below that I see when the storage
> underlying a dm-multipath device is hot-removed. The problem is that
> dm requeues a request to a device whose block queue has already been
> cleaned up, and blk_insert_cloned_request() doesn't check if the queue
> is alive, but rather goes ahead and tries to queue the request. This
> ends up dereferencing the elevator that was already freed in
> blk_cleanup_queue().

Your patch looks fine to me:
Acked-by: Mike Snitzer <snitzer@redhat.com>

And I looked at various code paths to arrive at the references DM takes.

A reference is taken on the underlying devices' block_device via
drivers/md/dm-table.c:open_dev() with blkdev_get_by_dev(). open_dev()
also does bd_link_disk_holder(), resulting in the mpath device
becoming a holder of the underlying devices. e.g.:
/sys/block/sda/holders/dm-4

But at no point does DM-mpath get a reference to the underlying
devices' request_queue that gets assigned to clone->q (in
drivers/md/dm-mpath.c:map_io).

Seems we should, though AFAIK it won't help with the issue you've
pointed out (because the hotplugged device's driver already called
blk_cleanup_queue and nuked the elevator).

So I'm not sure getting the request_queue reference actually fixes
anything (maybe my imagination is lacking?). But getting the
request_queue reference is "the right thing" if we're going to be
setting pointers to it.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 07-12-2011, 12:52 AM
Alan Stern
 
block: Check that queue is alive in blk_insert_cloned_request()

On Mon, 11 Jul 2011, Mike Snitzer wrote:

> [cc'ing dm-devel, vivek and tejun]
>
> On Fri, Jul 8, 2011 at 7:04 PM, Roland Dreier <roland@kernel.org> wrote:
> > From: Roland Dreier <roland@purestorage.com>
> >
> > This fixes crashes such as the below that I see when the storage
> > underlying a dm-multipath device is hot-removed. The problem is that
> > dm requeues a request to a device whose block queue has already been
> > cleaned up, and blk_insert_cloned_request() doesn't check if the queue
> > is alive, but rather goes ahead and tries to queue the request. This
> > ends up dereferencing the elevator that was already freed in
> > blk_cleanup_queue().
>
> Your patch looks fine to me:
> Acked-by: Mike Snitzer <snitzer@redhat.com>

There's still the issue that Stefan Richter pointed out: The test for a
dead queue must be made _after_ acquiring the queue lock, not _before_.

Alan Stern

 
Old 07-12-2011, 01:22 AM
Mike Snitzer
 
block: Check that queue is alive in blk_insert_cloned_request()

On Mon, Jul 11 2011 at 8:52pm -0400,
Alan Stern <stern@rowland.harvard.edu> wrote:

> On Mon, 11 Jul 2011, Mike Snitzer wrote:
>
> > [cc'ing dm-devel, vivek and tejun]
> >
> > On Fri, Jul 8, 2011 at 7:04 PM, Roland Dreier <roland@kernel.org> wrote:
> > > From: Roland Dreier <roland@purestorage.com>
> > >
> > > This fixes crashes such as the below that I see when the storage
> > > underlying a dm-multipath device is hot-removed. The problem is that
> > > dm requeues a request to a device whose block queue has already been
> > > cleaned up, and blk_insert_cloned_request() doesn't check if the queue
> > > is alive, but rather goes ahead and tries to queue the request. This
> > > ends up dereferencing the elevator that was already freed in
> > > blk_cleanup_queue().
> >
> > Your patch looks fine to me:
> > Acked-by: Mike Snitzer <snitzer@redhat.com>
>
> There's still the issue that Stefan Richter pointed out: The test for a
> dead queue must be made _after_ acquiring the queue lock, not _before_.

Yes, quite important.

Jens, can you tweak the patch or should Roland send a v2?

 
Old 07-12-2011, 01:46 AM
Vivek Goyal
 
block: Check that queue is alive in blk_insert_cloned_request()

On Mon, Jul 11, 2011 at 09:22:06PM -0400, Mike Snitzer wrote:
> On Mon, Jul 11 2011 at 8:52pm -0400,
> Alan Stern <stern@rowland.harvard.edu> wrote:
>
> > On Mon, 11 Jul 2011, Mike Snitzer wrote:
> >
> > > [cc'ing dm-devel, vivek and tejun]
> > >
> > > On Fri, Jul 8, 2011 at 7:04 PM, Roland Dreier <roland@kernel.org> wrote:
> > > > From: Roland Dreier <roland@purestorage.com>
> > > >
> > > > This fixes crashes such as the below that I see when the storage
> > > > underlying a dm-multipath device is hot-removed. The problem is that
> > > > dm requeues a request to a device whose block queue has already been
> > > > cleaned up, and blk_insert_cloned_request() doesn't check if the queue
> > > > is alive, but rather goes ahead and tries to queue the request. This
> > > > ends up dereferencing the elevator that was already freed in
> > > > blk_cleanup_queue().
> > >
> > > Your patch looks fine to me:
> > > Acked-by: Mike Snitzer <snitzer@redhat.com>
> >
> > There's still the issue that Stefan Richter pointed out: The test for a
> > dead queue must be made _after_ acquiring the queue lock, not _before_.
>
> Yes, quite important.
>
> Jens, can you tweak the patch or should Roland send a v2?

I do not think that we should do the queue-dead check after taking the
spinlock. The reason is that there are lifetime issues with two objects:

- the validity of the request queue pointer
- the validity of the q->spin_lock pointer

If dm has taken a reference to the request queue at the beginning, then
it can be sure the request queue pointer is valid. But the spin lock
might come from the driver and might live in one of the driver-allocated
structures. So it might happen that the driver has called
blk_cleanup_queue() and freed the structure which contained the spin
lock.

So if the queue is not dead, we know that q->spin_lock is valid. I think
the only race present here is that the whole operation is not atomic:
first we check the queue-not-dead flag and then go on to acquire the
request queue lock, which leaves a small window for a race. I have seen
other code written in this manner (__generic_make_request()), so it is
probably reasonably safe to do here too.

Thanks
Vivek

 
Old 07-12-2011, 03:24 PM
Alan Stern
 
block: Check that queue is alive in blk_insert_cloned_request()

On Mon, 11 Jul 2011, Vivek Goyal wrote:

> > > There's still the issue that Stefan Richter pointed out: The test for a
> > > dead queue must be made _after_ acquiring the queue lock, not _before_.
> >
> > Yes, quite important.
> >
> > Jens, can you tweak the patch or should Roland send a v2?
>
> I do not think that we should do the queue-dead check after taking the
> spinlock. The reason is that there are lifetime issues with two objects:
>
> - the validity of the request queue pointer
> - the validity of the q->spin_lock pointer
>
> If dm has taken a reference to the request queue at the beginning, then
> it can be sure the request queue pointer is valid. But the spin lock
> might come from the driver and might live in one of the driver-allocated
> structures. So it might happen that the driver has called
> blk_cleanup_queue() and freed the structure which contained the spin
> lock.

Surely this is a bug in the design of the block layer?

> So if the queue is not dead, we know that q->spin_lock is valid. I think
> the only race present here is that the whole operation is not atomic:
> first we check the queue-not-dead flag and then go on to acquire the
> request queue lock, which leaves a small window for a race. I have seen
> other code written in this manner (__generic_make_request()), so it is
> probably reasonably safe to do here too.

"Probably reasonably safe" = "unsafe". The fact that it will usually
work out okay means that when it does fail, it will be very difficult
to track down.

It needs to be fixed _now_, when people are aware of the issue. Not
five years from now, when everybody has forgotten about it.

Alan Stern

 
Old 07-12-2011, 05:10 PM
Vivek Goyal
 
block: Check that queue is alive in blk_insert_cloned_request()

On Tue, Jul 12, 2011 at 11:24:54AM -0400, Alan Stern wrote:
> On Mon, 11 Jul 2011, Vivek Goyal wrote:
>
> > > > There's still the issue that Stefan Richter pointed out: The test for a
> > > > dead queue must be made _after_ acquiring the queue lock, not _before_.
> > >
> > > Yes, quite important.
> > >
> > > Jens, can you tweak the patch or should Roland send a v2?
> >
> > I do not think that we should do the queue-dead check after taking the
> > spinlock. The reason is that there are lifetime issues with two objects:
> >
> > - the validity of the request queue pointer
> > - the validity of the q->spin_lock pointer
> >
> > If dm has taken a reference to the request queue at the beginning, then
> > it can be sure the request queue pointer is valid. But the spin lock
> > might come from the driver and might live in one of the driver-allocated
> > structures. So it might happen that the driver has called
> > blk_cleanup_queue() and freed the structure which contained the spin
> > lock.
>
> Surely this is a bug in the design of the block layer?
>
> > So if the queue is not dead, we know that q->spin_lock is valid. I think
> > the only race present here is that the whole operation is not atomic:
> > first we check the queue-not-dead flag and then go on to acquire the
> > request queue lock, which leaves a small window for a race. I have seen
> > other code written in this manner (__generic_make_request()), so it is
> > probably reasonably safe to do here too.
>
> "Probably reasonably safe" = "unsafe". The fact that it will usually
> work out okay means that when it does fail, it will be very difficult
> to track down.
>
> It needs to be fixed _now_, when people are aware of the issue. Not
> five years from now, when everybody has forgotten about it.

I agree that fixing it would be good. Frankly speaking, I don't even
have a full understanding of the problem. I know a little bit from the
request queue side, but I have no idea about the referencing mechanism
at the device level and how that is supposed to work with request queue
referencing.

So once we understand the problem well, we will probably have an answer
for how to go about fixing it.

Thanks
Vivek

 
Old 07-12-2011, 09:02 PM
Alan Stern
 
block: Check that queue is alive in blk_insert_cloned_request()

On Tue, 12 Jul 2011, James Bottomley wrote:

> > I think one problem point is q->queue_lock. If driver drops its reference
> > on queue and cleans up its data structures, then it will free up memory
> > associated with q->queue_lock too. (If driver provided its own queue
> > lock). In that case anything which is dependent on queue lock, needs
> > to be freed up on blk_cleanup_queue().
>
> I don't quite follow. blk_cleanup_queue() doesn't free anything (well,
> except the elevator). Final put will free the queue structure which
> contains the lock, but if it's really a final put, you have no other
> possible references, so no-one is using the lock ... well, assuming
> there isn't a programming error, of course ...
>
> > If we can make sure that request queue reference will keep the spin lock
> > alive, then i guess all cleanup part might be able to go in release
> > queue function.
>
> As I said: cleanup doesn't free the structure containing the lock,
> release does, so that piece wouldn't be altered by putting
> blk_cleanup_queue() elsewhere.

It's worth taking a closer look at what happened here. Originally
request_queue->queuedata was set to NULL and blk_cleanup_queue() was
called from scsi_device_dev_release_usercontext() (by way of
scsi_free_queue()). This caused a problem because there was a window
in which the last reference to the scsi_device had been dropped but the
release routine hadn't yet run, so the queue was still marked as active
(i.e., queuedata was still non-NULL).

Commit 86cbfb5607d4b81b1a993ff689bbd2addd5d3a9b ([SCSI] put stricter
guards on queue dead checks) was written to fix this problem. However
it moved _both_ lines from the release routine into
__scsi_remove_device(). There was no need for this; all that was
needed was to make sure queuedata was NULL before the device reference
was dropped.

And in fact, moving the call to scsi_free_queue() has opened up another
window for bugs. Now it's possible for requests to get on the queue
(submitted by drivers from another layer that still hold a reference to
the scsi_device) even after the queue's elevator has been freed. Yes,
the block layer is supposed to prevent this from happening, but there
are multiple paths and they don't all have the requisite checks.

And as Vivek points out, it's questionable whether they _can_ make the
proper checks. Checking requires the queue's lock, but the client may
have deallocated the lock at the time the queue was cleaned up. Maybe
SCSI doesn't do this, but other clients of the block layer might. The
block layer can't make any assumptions. The unavoidable conclusion is
that the clients must be responsible for ensuring that no requests are
added to a queue after its lock has been released.

To help accomplish this in SCSI, I'm suggesting that the call to
scsi_free_queue() be moved back to where it used to be, in
scsi_device_dev_release_usercontext(). Then it won't be possible for
any wayward requests to be added to the queue after it is cleaned up,
because there won't be any outstanding references to the scsi_device.

Alan Stern

 
