Linux Archive

Linux Archive (http://www.linux-archive.org/)
-   Device-mapper Development (http://www.linux-archive.org/device-mapper-development/)
-   -   rqdm: bad usage of dm_get/dm_put (Was: dm mpath: fix stall when requeueing io) (http://www.linux-archive.org/device-mapper-development/331048-rqdm-bad-usage-dm_get-dm_put-dm-mpath-fix-stall-when-requeueing-io.html)

Kiyoshi Ueda 02-24-2010 07:16 AM

rqdm: bad usage of dm_get/dm_put (Was: dm mpath: fix stall when requeueing io)
 
Hi Mikulas, Alasdair,

Thank you for spotting this.

On 02/24/2010 04:52 AM +0900, Mikulas Patocka wrote:
> Another problem:
> dm_request_fn can be called in an interrupt context, I scanned it for
> calling process-context functions and found:
>
> It may call rq_completed (either directly, via
> dm_request_fn->map_request->dm_kill_unmapped_request->dm_complete_request
> ->dm_done->dm_end_request->dm_put) or indirectly, when the request is
> completed from host controller interrupt. And dm_put is a process_context
> function.
>
> I believe it doesn't cause a real crash, because dm_put is called in
> dm_blk_close, thus there is always at least one reference. When the device
> is closed with dm_blk_close, there should be no requests on it.
>
> But it is simply a logic error to call a process-context function from
> an interrupt context. I'd remove those dm_get/dm_put from
> request-based-dm --- they are not needed anyway, as long as there are
> requests, the "mapped_device" structure can't disappear.

Indeed, we shouldn't use the current dm_put() in any interrupt-context.
But the "mapped_device" can disappear in request-based dm while there
is a request after all bios complete, so I used dm_get()/dm_put() there.
I'll consider another way to prevent the problem without dm_get()/dm_put().
E.g. wait for request completion in dm_put() instead.


> You can apply this (in 2.6.34-rc1) to catch all the errorneous users of
> dm_put.
<snip>
> @@ -2188,6 +2188,7 @@ void dm_put(struct mapped_device *md)
> struct dm_table *map;
>
> BUG_ON(test_bit(DMF_FREEING, &md->flags));
> + might_sleep();
>
> if (atomic_dec_and_lock(&md->holders, &_minor_lock)) {
> map = dm_get_live_table(md);

The current request-based dm usually calls dm_put() from softirq context
and is warned a lot, so don't apply this patch until I fix the problem
above with another way.

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Mikulas Patocka 02-24-2010 09:33 PM

rqdm: bad usage of dm_get/dm_put (Was: dm mpath: fix stall when requeueing io)
 
> Indeed, we shouldn't use the current dm_put() in any interrupt-context.
> But the "mapped_device" can disappear in request-based dm while there
> is a request after all bios complete, so I used dm_get()/dm_put() there.
> I'll consider another way to prevent the problem without dm_get()/dm_put().
> E.g. wait for request completion in dm_put() instead.

How can a request-in-progress exists when all the bios complete and the
device is closed?

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Kiyoshi Ueda 02-25-2010 06:04 AM

rqdm: bad usage of dm_get/dm_put (Was: dm mpath: fix stall when requeueing io)
 
Hi Mikulas,

On 02/25/2010 07:33 AM +0900, Mikulas Patocka wrote:
>> Indeed, we shouldn't use the current dm_put() in any interrupt-context.
>> But the "mapped_device" can disappear in request-based dm while there
>> is a request after all bios complete, so I used dm_get()/dm_put() there.
>> I'll consider another way to prevent the problem without dm_get()/dm_put().
>> E.g. wait for request completion in dm_put() instead.
>
> How can a request-in-progress exists when all the bios complete and the
> device is closed?

In the current request-based dm, the device opener can remove
the mapped_device while the last request is still completing,
because bios in the last request complete first and then the device
opener can remove the mapped_device before the last request completes:
CPU0 CPU1
================================================== ====================
<<INTERRUPT>>
blk_end_request_all(clone_rq)
blk_update_request(clone_rq)
bio_endio(clone_bio) == end_clone_bio
blk_update_request(orig_rq)
bio_endio(orig_bio)
<<I/O completed>>
dm_blk_close()
dev_remove()
dm_put(md)
<<Free md>>
blk_finish_request(clone_rq)
....
dm_end_request(clone_rq)
free_rq_clone(clone_rq)
blk_end_request_all(orig_rq)
rq_completed(md)

So we need a mechanism to defer the md deletion until the last request
completes.

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


All times are GMT. The time now is 06:42 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.