05-16-2012, 10:19 PM
Mike Snitzer

dm thin: commit pool's metadata on last close of thin device

Reinstate dm_flush_all and dm_table_flush_all. dm_blk_close will
now trigger the .flush method of all targets within a table on the last
close of a DM device.

In the case of the thin target, the thin_flush method will commit the
backing pool's metadata.

Doing so avoids a deadlock that has been observed with the following
sequence (as can be triggered via "dmsetup remove_all"):
- IO is issued to a thin device, thin device is closed
- pool's metadata device is suspended before the pool is
- because the pool still has outstanding IO we deadlock because the
pool's metadata device is suspended

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
---
drivers/md/dm-table.c | 9 +++++++++
drivers/md/dm-thin.c | 19 +++++++++++++++++++
drivers/md/dm.c | 20 +++++++++++++++++++-
drivers/md/dm.h | 1 +
4 files changed, 48 insertions(+), 1 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 2e227fb..077fff8 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1487,6 +1487,15 @@ int dm_table_resume_targets(struct dm_table *t)
return 0;
}

+void dm_table_flush_all(struct dm_table *t)
+{
+ unsigned i;
+
+ for (i = 0; i < t->num_targets; i++)
+ if (t->targets[i].type->flush)
+ t->targets[i].type->flush(&t->targets[i]);
+}
+
void dm_table_add_target_callbacks(struct dm_table *t, struct dm_target_callbacks *cb)
{
list_add(&cb->list, &t->target_callbacks);
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index c514078..f64c7e6 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2429,6 +2429,24 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
set_discard_limits(pool, limits);
}

+static void thin_flush(struct dm_target *ti)
+{
+ int r;
+ struct thin_c *tc = ti->private;
+ struct pool *pool = tc->pool;
+
+ /*
+ * A bit heavy-handed but the only existing way to batch
+ * metadata commits is to issue() a FLUSH bio -- but DM
+ * doesn't allocate bios outside the DM core.
+ */
+ r = dm_pool_commit_metadata(pool->pmd);
+ if (r < 0) {
+ DMERR("%s: dm_pool_commit_metadata() failed, error = %d",
+ __func__, r);
+ }
+}
+
static struct target_type thin_target = {
.name = "thin",
.version = {1, 1, 0},
@@ -2441,6 +2459,7 @@ static struct target_type thin_target = {
.status = thin_status,
.iterate_devices = thin_iterate_devices,
.io_hints = thin_io_hints,
+ .flush = thin_flush,
};

/*----------------------------------------------------------------*/
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 23a1a84..715ee57 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -487,6 +487,16 @@ out:
return md ? 0 : -ENXIO;
}

+static void dm_flush_all(struct mapped_device *md)
+{
+ struct dm_table *t = dm_get_live_table(md);
+
+ if (t) {
+ dm_table_flush_all(t);
+ dm_table_put(t);
+ }
+}
+
static int dm_blk_close(struct gendisk *disk, fmode_t mode)
{
struct mapped_device *md = disk->private_data;
@@ -494,10 +504,17 @@ static int dm_blk_close(struct gendisk *disk, fmode_t mode)
spin_lock(&_minor_lock);

atomic_dec(&md->open_count);
- dm_put(md);

spin_unlock(&_minor_lock);

+ /*
+ * Flush all targets on last close
+ */
+ if (!dm_open_count(md))
+ dm_flush_all(md);
+
+ dm_put(md);
+
return 0;
}

@@ -2468,6 +2485,7 @@ void dm_destroy_immediate(struct mapped_device *md)
void dm_put(struct mapped_device *md)
{
atomic_dec(&md->holders);
+ smp_mb__after_atomic_dec();
}
EXPORT_SYMBOL_GPL(dm_put);

diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index b7dacd5..82199a1 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -66,6 +66,7 @@ bool dm_table_supports_discards(struct dm_table *t);
int dm_table_alloc_md_mempools(struct dm_table *t);
void dm_table_free_md_mempools(struct dm_table *t);
struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);
+void dm_table_flush_all(struct dm_table *t);

int dm_queue_merge_is_compulsory(struct request_queue *q);

--
1.7.1

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
05-17-2012, 04:00 AM
Mike Snitzer

dm thin: commit pool's metadata on last close of thin device

On Wed, May 16 2012 at 8:43pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

>
>
> On Wed, 16 May 2012, Mike Snitzer wrote:
>
> > Reinstate dm_flush_all and dm_table_flush_all. dm_blk_close will
> > now trigger the .flush method of all targets within a table on the last
> > close of a DM device.
> >
> > In the case of the thin target, the thin_flush method will commit the
> > backing pool's metadata.
> >
> > Doing so avoids a deadlock that has been observed with the following
> > sequence (as can be triggered via "dmsetup remove_all"):
> > - IO is issued to a thin device, thin device is closed
> > - pool's metadata device is suspended before the pool is
> > - because the pool still has outstanding IO we deadlock because the
> > pool's metadata device is suspended
> >
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > Cc: stable@vger.kernel.org
>
> I'd say --- don't do this sequence.
>
> Device mapper generally expects that devices are suspended top-down ---
> i.e. you should first suspend the thin device and then suspend its
> underlying data and metadata device. If you violate this sequence and
> suspend bottom-up, you get deadlocks.
>
> For example, if dm-mirror is resynchronizing and you suspend the
> underlying leg or log volume and then suspend dm-mirror, you get a
> deadlock.
>
> If dm-snapshot is merging and you suspend the underlying snapshot or
> origin volume and then suspend the snapshot-merge target, you get a deadlock.
>
> These are not bugs in dm-mirror or dm-snapshot, this is expected behavior.
> Userspace shouldn't do any bottom-up suspend sequence.
>
> In the same sense, if you suspend the underlying data or metadata pool and
> then suspend dm-thin, you get a deadlock too. Fix userspace so that it
> doesn't do it.

Yeah, I agree. I told Zdenek it'd be best to just not do it.

'dmsetup remove_all' is a dumb command that invites these deadlocks when
more sophisticated stacking is being used.

But all said, in general I don't have a problem with triggering a target
specific "flush" on device close.

(Though the implementation of thin_flush could be made more
intelligent... as is, we can see a pretty good storm of redundant
metadata commits if 100s of thin devices are closed simultaneously --
creating pmd->root_lock write lock contention.)

 
05-17-2012, 07:44 AM
Zdenek Kabelac

dm thin: commit pool's metadata on last close of thin device

On 17.5.2012 06:00, Mike Snitzer wrote:

> Yeah, I agree. I told Zdenek it'd be best to just not do it.
>
> 'dmsetup remove_all' is a dumb command that invites these deadlocks when
> more sophisticated stacking is being used.
>
> But all said, in general I don't have a problem with triggering a target
> specific "flush" on device close.
>
> (Though the implementation of thin_flush could be made more
> intelligent... as is we can see a pretty good storm of redundant
> metadata commits if 100s of thin devices are closed simultaneously --
> creating pmd->root_lock write lock contention).

I'd put it differently - the question here is whether we really want to
support forced removal of devices, or whether we keep them stuck in the system.

The system must expect that a device can become unavailable at any time
(hw fault), and if the device is not mounted and unused, it should not be a
problem to remove it (even while suspended).

However, the current code basically removes the thin-pool target (which has
open count 0) but keeps the data and metadata devices (the problem is worse
if the metadata device has been replaced with an error target at that moment).

Also note - there is no problem in the userspace code as such (except the
case of discard-less devices, where the change is simply quite unexpected
new behavior of the device table, so older tools cannot work properly with it).

The error is technically in the 'teardown' code for the test case, which
used to assume that anything with open count 0 could easily be removed, and
that unremovable targets could be replaced with error targets (so the
underlying devs could be detached as well). With the current thinp target,
however, we are in some trouble, and I'd like to see thin targets behaving
like all other dm targets.

Zdenek

 
