FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Device-mapper Development

 
 
LinkBack Thread Tools
 
Old 03-19-2012, 08:45 PM
Mikulas Patocka
 
Default dm-table: delayed cleanup for dm_table_destroy()

Hi Hannes

I wouldn't do this. This behavior actually existed in kernels <= 2.6.28.
And it caused trouble.

The problems are these:

Some code may hold a spinlock and call dm_table_put. Currently,
dm_table_put just decrements the counter and the cleanup is done
synchronously in dm_table_destroy. With your patch, cleanup would be done
in dm_table_put --- you call a target destructor (which may sleep) in a
non-sleeping context.

Some code may hold a mutex and call dm_table_put. If you call a target
destructor from dm_table_put and it takes the same mutex, it deadlocks.

Currently, when dm_table_destroy exits, it is guaranteed that the table is
destroyed and all target destructors have been called. With your patch,
dm_table_destroy may exit and the table could be still alive because of
some other reference. This may cause leaked references to some other files
or devices. For example, suppose that the user has a device mapper device
"dm" that redirects requests to "/dev/sda". The user assumes that if he
runs "dmsetup remove dm" and this command returns, "/dev/sda" will not be
open. Your patch breaks this assumption.

Look at the commit "d58168763f74d1edbc296d7038c60efe6493fdd4" in 2.6.29
--- I was actually removing the old-style reference counting (that is
functionally equivalent to what your patch does) because of these three
problems. The old code really caused a crash although it happened very
rarely.

Mikulas


On Mon, 19 Mar 2012, Hannes Reinecke wrote:

> We should be using a kref instead of the existing holders flag.
> This enables us to use delayed cleanup and we'll get rid of the
> msleep in dm_table_destroy().
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/md/dm-table.c | 26 +++++++++++++-------------
> 1 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index a3d1e18..97c01f7 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -41,7 +41,7 @@
>
> struct dm_table {
> struct mapped_device *md;
> - atomic_t holders;
> + struct kref kref;
> unsigned type;
>
> /* btree table */
> @@ -208,7 +208,7 @@ int dm_table_create(struct dm_table **result, fmode_t mode,
>
> INIT_LIST_HEAD(&t->devices);
> INIT_LIST_HEAD(&t->target_callbacks);
> - atomic_set(&t->holders, 0);
> + kref_init(&t->kref);
>
> if (!num_targets)
> num_targets = KEYS_PER_NODE;
> @@ -240,17 +240,11 @@ static void free_devices(struct list_head *devices)
> }
> }
>
> -void dm_table_destroy(struct dm_table *t)
> +void __table_destroy(struct kref *kref)
> {
> + struct dm_table *t = container_of(kref, struct dm_table, kref);
> unsigned int i;
>
> - if (!t)
> - return;
> -
> - while (atomic_read(&t->holders))
> - msleep(1);
> - smp_mb();
> -
> /* free the indexes */
> if (t->depth >= 2)
> vfree(t->index[t->depth - 2]);
> @@ -277,7 +271,8 @@ void dm_table_destroy(struct dm_table *t)
>
> void dm_table_get(struct dm_table *t)
> {
> - atomic_inc(&t->holders);
> + if (t)
> + kref_get(&t->kref);
> }
> EXPORT_SYMBOL(dm_table_get);
>
> @@ -286,11 +281,16 @@ void dm_table_put(struct dm_table *t)
> if (!t)
> return;
>
> - smp_mb__before_atomic_dec();
> - atomic_dec(&t->holders);
> + kref_put(&t->kref, __table_destroy);
> }
> EXPORT_SYMBOL(dm_table_put);
>
> +void dm_table_destroy(struct dm_table *t)
> +{
> + smp_mb__before_atomic_dec();
> + dm_table_put(t);
> +}
> +
> /*
> * Checks to see if we need to extend highs or targets.
> */
> --
> 1.6.0.2
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
>

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-20-2012, 02:35 PM
Hannes Reinecke
 
Default dm-table: delayed cleanup for dm_table_destroy()

Hi Mikulas,

On 03/19/2012 10:45 PM, Mikulas Patocka wrote:
> Hi Hannes
>
> I wouldn't do this. This behavior actually existed in kernels <= 2.6.28.
> And it caused trouble.
>
> The problems are these:
>
> Some code may hold a spinlock and call dm_table_put. Currently,
> dm_table_put just decrements the counter and the cleanup is done
> synchronously in dm_table_destroy. With your patch, cleanup would be done
> in dm_table_put --- you call a target destructor (which may sleep) in a
> non-sleeping context.
>
> Some code may hold a mutex and call dm_table_put. If you call a target
> destructor from dm_table_put and it takes the same mutex, it deadlocks.
>
> Currently, when dm_table_destroy exits, it is guaranteed that the table is
> destroyed and all target destructors have been called. With your patch,
> dm_table_destroy may exit and the table could be still alive because of
> some other reference. This may cause leaked references to some other files
> or devices. For example, suppose that the user has a device mapper device
> "dm" that redirects requests to "/dev/sda". The user assumes that if he
> runs "dmsetup remove dm" and this command returns, "/dev/sda" will not be
> open. Your patch breaks this assumption.
>
Hmm. Yes, true.

> Look at the commit "d58168763f74d1edbc296d7038c60efe6493fdd4" in 2.6.29
> --- I was actually removing the old-style reference counting (that is
> functionally equivalent to what your patch does) because of these three
> problems. The old code really caused a crash although it happened very
> rarely.
>
Ah. That'll explain it.
The actual problem I'm trying to track down is that I'm seeing an
excessive duration for the 'resume' dm ioctl after a table update.
I've got reports where the ioctl can take up to several seconds.
Which (as this is multipath) causes an extremely erratic behaviour.

And the 'msleep' here is one of the obvious culprits.

But I'll continue debugging, maybe I'll find something else.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-21-2012, 12:24 AM
Mikulas Patocka
 
Default dm-table: delayed cleanup for dm_table_destroy()

On Tue, 20 Mar 2012, Hannes Reinecke wrote:

> Ah. That'll explain it.
> The actual problem I'm trying to track down is that I'm seeing an
> excessive duration for the 'resume' dm ioctl after a table update.
> I've got reports where the ioctl can take up to several seconds.
> Which (as this is multipath) causes an extremely erratic behaviour.
>
> And the 'msleep' here is one of the obvious culprits.
>
> But I'll continue debugging, maybe I'll find something else.
>
> Cheers,
>
> Hannes

If it takes several second, it is not obviously problem with msleep(1) ---
if we converted msleep(1) to a wait queue, it would improve by at most
1ms, but it couldn't improve by a few seconds.

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-21-2012, 04:34 AM
"Jun'ichi Nomura"
 
Default dm-table: delayed cleanup for dm_table_destroy()

Hi Hannes,

On 03/21/12 00:35, Hannes Reinecke wrote:
> The actual problem I'm trying to track down is that I'm seeing an
> excessive duration for the 'resume' dm ioctl after a table update.
> I've got reports where the ioctl can take up to several seconds.
> Which (as this is multipath) causes an extremely erratic behaviour.
>
> And the 'msleep' here is one of the obvious culprits.
>
> But I'll continue debugging, maybe I'll find something else.

Did you track down which part of the resume ioctl took long?

If table is updated and the device is not yet suspended,
resume ioctl itself suspends the device.
And dm_suspend() could take long depending on lower devices
as it waits for already-submitted I/Os to return to dm.

Thanks,
--
Jun'ichi Nomura, NEC Corporation

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 

Thread Tools




All times are GMT. The time now is 10:20 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright ©2007 - 2008, www.linux-archive.org