04-15-2010, 05:27 PM
Heinz Mauelshagen

A dm-raid45 target implemented using md raid5.

Hi Neil,

had a first go reading through your patch series w/o finding any major
issues. The only important feature for an initial release which needs
adding (as you mentioned) is (persistent) dirty log support.

Because you're using a persistent bitmap in the MD RAID personalities,
this looks like a bit more surgery to factor it out to potentially
enhance dm-log.c. For an initial solution we could just as well go with
MD's existing bitmap while keeping the dm-raid456 ctr support for
explicit dirty logging in order to avoid compatibility issues (there's
obviously no parameter to support bitmap chunk sizes so far).

Reshaping could be triggered either preferably via the constructor
involving MD metadata reads to be able to recognize the size change
requested or the message interface. Both ctr/message support could be
implemented sharing the same functions. Enhancements in the status
interface and dm_table_event() throwing on error/finish are mandatory if
we support reshaping.

A shortcoming of this MD wrapping solution vs. dm-raid45 is that there
is no obvious way to leverage it to be a clustered RAID456 mapping
target. dm-raid45 has been designed with that future enhancement
possibility in mind.

Will try testing your code tomorrow.

Regards,
Heinz

On Thu, 2010-04-15 at 16:43 +1000, NeilBrown wrote:
> Greetings Heinz, Alasdair, and all,
> (Alasdair and Heinz cc:ed on this intro, but the patches are
> only going to the lists).
>
> Some months ago I posted a proof-of-concept patch which attempted to
> provide RAID4/5/6 functionality to 'dm' using md/raid5.c.
> While it did at least partly work it contained lots of hacks and was
> very ugly.
>
> I finally made time to do the job "properly".
>
> The following series, when applied on top of a bunch of patches I
> just submitted for linux-next, provides a 'dm-raid45' target which is
> largely compatible with the one that Heinz has written (and several
> distros are shipping), but which uses md/raid5.c for the core IO
> processing.
>
> I have tried to split the patch up into easy-to-handle pieces. You
> will note that some changes to core-dm are required, in particular to
> pass back 'congestion' information and to handle plugging (which
> raid5 uses to improve throughput). I hope the approach I have taken
> is suitable, but it can obviously be changed if necessary.
>
> The create/status/message interface differs from the one in Heinz's
> patch, but should be close enough to work with current 'dmraid'.
>
> If you want to try the patches (rather than just read them) you
> should probably "git pull" (please don't clone) from
> git://neil.brown.name/md md-dm-raid45
>
> so as to get all the prior refactoring patches in md.
>
> Some advantages of this over Heinz's patch (at least as it was
> when I last looked at it) are:
> - raid6 support
> - support for XOR-offload hardware where present
> - less code duplication
> - a single dm device can include multiple dm-raid45 targets.
>   (Heinz's code accesses dm_disk(md)->queue directly, which
>   is a layering violation and assumes that there is no
>   other target in the mapped_device).
>
> There is a lot more that could be done to this such as getting it to
> work with a disk based dirty-log and making the reshape options
> available. But this patch set should provide all basic RAID5
> functionality.
>
> Would the dm community be interested in including this work upstream
> (after suitable review and testing)?
>
> Thanks,
> NeilBrown
>
>
> ---
>
> NeilBrown (12):
>       md: reduce dependence on sysfs.
>       md/raid5: factor out code for changing size of stripe cache.
>       md/dm: create dm-raid456 module using md/raid5
>       dm-raid456: add support for raising events to userspace.
>       raid5: Don't set read-ahead when there is no queue
>       dm-raid456: add congestion checking.
>       md/raid5: add simple plugging infrastructure.
>       md/plug: optionally use plugger to unplug an array during resync/recovery.
>       dm-raid456: support unplug
>       dm-raid456: add support for setting IO hints.
>       dm-raid456: add suspend/resume method
>       dm-raid456: add message handler.
>
>
>  drivers/md/Kconfig            |    8 +
>  drivers/md/Makefile           |    1
>  drivers/md/dm-raid456.c       |  540 +++++++++++++++++++++++++++++++++++++++++
>  drivers/md/dm-table.c         |   19 +
>  drivers/md/md.c               |  211 ++++++++++------
>  drivers/md/md.h               |   43 +++
>  drivers/md/raid5.c            |  155 +++++++-----
>  drivers/md/raid5.h            |    6
>  include/linux/device-mapper.h |   13 +
>  9 files changed, 859 insertions(+), 137 deletions(-)
>  create mode 100644 drivers/md/dm-raid456.c
>


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
04-15-2010, 10:14 PM
Neil Brown

A dm-raid45 target implemented using md raid5.

On Thu, 15 Apr 2010 19:27:15 +0200
Heinz Mauelshagen <heinzm@redhat.com> wrote:

>
> Hi Neil,
>
> had a first go reading through your patch series w/o finding any major
> issues. The only important feature for an initial release which needs
> adding (as you mentioned) is (persistent) dirty log support.
>
> Because you're using a persistent bitmap in the MD RAID personalities,
> this looks like a bit more surgery to factor it out to potentially
> enhance dm-log.c. For an initial solution we could just as well go with
> MD's existing bitmap while keeping the dm-raid456 ctr support for
> explicit dirty logging in order to avoid compatibility issues (there's
> obviously no parameter to support bitmap chunk sizes so far).

I don't think we can use md's existing bitmap support as there is no easy way
to store it on an arbitrary target: it either lives near the metadata or in
a file (not a device).
There are just a few calls in the interface to md/bitmap.c - it shouldn't be
too hard to make those selectively call into a dm_dirty_log instead.
I want to do something like that anyway as I want to optionally be able to use
a dirty log which is a list of dirty sector addresses rather than a bitmap.
I'll have a look next week.

And the "bitmap chunk size" is exactly the same as the dm "region size".
(which would probably have been a better name to choose for md too).

>
> Reshaping could be triggered either preferably via the constructor
> involving MD metadata reads to be able to recognize the size change
> requested or the message interface. Both ctr/message support could be
> implemented sharing the same functions. Enhancements in the status
> interface and dm_table_event() throwing on error/finish are mandatory if
> we support reshaping.

I imagine enhancing the constructor to take before/after values for
type, disks, chunksize, and a sector which marks where "after" starts.
You also need to know which direction the reshape is going (low addresses to
high addresses, or the reverse) though that might be implicit in the other
values.

>
> A shortcoming of this MD wrapping solution vs. dm-raid45 is that there
> is no obvious way to leverage it to be a clustered RAID456 mapping
> target. dm-raid45 has been designed with that future enhancement
> possibility in mind.
>

I haven't given cluster locking a lot of thought...
I would probably do the locking on a per-"stripe_head" basis as everything
revolves around that structure.
Get a shared lock when servicing a read (which would only happen on a
degraded array - normally reads bypass the stripe cache), or a write lock
when servicing a write or a resync.
It should all interface with DLM quite well - when DLM tries to reclaim a lock
we first mark all the stripe as not up-to-date...

Does DM simply use DLM for locking or something else?


> Will try testing your code tomorrow.

Thanks,

NeilBrown

 
04-16-2010, 09:27 AM
Heinz Mauelshagen

A dm-raid45 target implemented using md raid5.

On Fri, 2010-04-16 at 08:14 +1000, Neil Brown wrote:
> On Thu, 15 Apr 2010 19:27:15 +0200
> Heinz Mauelshagen <heinzm@redhat.com> wrote:
>
> >
> > Hi Neil,
> >
> > had a first go reading through your patch series w/o finding any major
> > issues. The only important feature for an initial release which needs
> > adding (as you mentioned) is (persistent) dirty log support.
> >
> > Because you're using a persistent bitmap in the MD RAID personalities,
> > this looks like a bit more surgery to factor it out to potentially
> > enhance dm-log.c. For an initial solution we could just as well go with
> > MD's existing bitmap while keeping the dm-raid456 ctr support for
> > explicit dirty logging in order to avoid compatibility issues (there's
> > obviously no parameter to support bitmap chunk sizes so far).
>
> I don't think we can use md's existing bitmap support as there is no easy way
> to store it on an arbitrary target: it either lives near the metadata or in
> a file (not a device).
> There are just a few calls in the interface to md/bitmap.c - it shouldn't be
> too hard to make those selectively call into a dm_dirty_log instead.

Good. My thinking was that, if we use the dm-dirty-log interface, there
are some valuable pieces of the MD bitmap code we could factor out
(bitmap flushing enhancements?).

>
> I want to do something like that anyway as I want to optionally be able to use
> a dirty log which is a list of dirty sector addresses rather than a bitmap.
> I'll have a look next week.

Ok.

>
> And the "bitmap chunk size" is exactly the same as the dm "region size".
> (which would probably have been a better name to choose for md too).

Fair enough.
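
To make the equivalence concrete, here is a minimal user-space sketch of
the arithmetic that both md's "bitmap chunk size" and dm's "region size"
imply: one dirty-log entry covers a fixed number of sectors. The names
sector_to_region() and region_count() are hypothetical and exist only
for illustration; they are not taken from md/bitmap.c or dm-log.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space illustration; not kernel code. */
typedef uint64_t sector_t;

/*
 * Index of the region (md: bitmap chunk) that covers 'sector'.
 * 'region_size' is given in sectors.
 */
uint64_t sector_to_region(sector_t sector, uint32_t region_size)
{
    assert(region_size > 0);
    return sector / region_size;
}

/*
 * Number of regions needed to cover a device of 'dev_sectors'
 * sectors, rounding up so a partial last region is still tracked.
 */
uint64_t region_count(sector_t dev_sectors, uint32_t region_size)
{
    assert(region_size > 0);
    return (dev_sectors + region_size - 1) / region_size;
}
```

With a region size of 8 sectors, sectors 0 through 7 fall into region 0
and sector 8 starts region 1, whichever of the two names is used.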

>
> >
> > Reshaping could be triggered either preferably via the constructor
> > involving MD metadata reads to be able to recognize the size change
> > requested or the message interface. Both ctr/message support could be
> > implemented sharing the same functions. Enhancements in the status
> > interface and dm_table_event() throwing on error/finish are mandatory if
> > we support reshaping.
>
> I imagine enhancing the constructor to take before/after values for
> type, disks, chunksize, and a sector which marks where "after" starts.
> You also need to know which direction the reshape is going (low addresses to
> high addresses, or the reverse) though that might be implicit in the other
> values.

Yes, those can be additional variable ctr parameters, which allows for
a compatible enhancement.

One possibility would be to use the variable parameters from the first
free one, #8, onwards:

o to_raid_type            # may be the existing one; e.g. raid6_zr
o to_chunk_size           # new requested chunk size in sectors
o old_size                # actual size of the array
o low_to_high/high_to_low # low->high or high->low addresses

ti->len defines the new intended size while old_size provides the actual
size of the array.
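
As a rough illustration of how the four parameters listed above might be
parsed, here is a user-space sketch. It assumes the arguments arrive as
four strings in the order given; struct reshape_args, parse_u64() and
parse_reshape_args() are hypothetical names, and nothing here reflects
an existing dm-raid456 constructor.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the proposed reshape parameters. */
struct reshape_args {
    const char *to_raid_type; /* e.g. "raid6_zr" */
    uint32_t to_chunk_size;   /* new chunk size in sectors */
    uint64_t old_size;        /* actual array size in sectors */
    int low_to_high;          /* 1: low->high, 0: high->low */
};

/* Parse an unsigned decimal string; 0 on success, -EINVAL otherwise. */
static int parse_u64(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    if (!s || *s < '0' || *s > '9')
        return -EINVAL;
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno || *end != '\0')
        return -EINVAL;
    *out = v;
    return 0;
}

/* Expects exactly: to_raid_type to_chunk_size old_size direction. */
int parse_reshape_args(int argc, char **argv, struct reshape_args *ra)
{
    uint64_t chunk;

    assert(ra != NULL);
    if (argc != 4)
        return -EINVAL;

    ra->to_raid_type = argv[0];

    if (parse_u64(argv[1], &chunk) || chunk == 0 || chunk > UINT32_MAX)
        return -EINVAL;
    ra->to_chunk_size = (uint32_t)chunk;

    if (parse_u64(argv[2], &ra->old_size))
        return -EINVAL;

    if (strcmp(argv[3], "low_to_high") == 0)
        ra->low_to_high = 1;
    else if (strcmp(argv[3], "high_to_low") == 0)
        ra->low_to_high = 0;
    else
        return -EINVAL;

    return 0;
}
```

The new intended size is deliberately absent from the struct, since it
would come from ti->len as described above.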

>
> >
> > A shortcoming of this MD wrapping solution vs. dm-raid45 is that there
> > is no obvious way to leverage it to be a clustered RAID456 mapping
> > target. dm-raid45 has been designed with that future enhancement
> > possibility in mind.
> >
>
> I haven't given cluster locking a lot of thought...
> I would probably do the locking on a per-"stripe_head" basis as everything
> revolves around that structure.

Makes sense. I was also thinking about tying stripe invalidation to lock
state changes.

> Get a shared lock when servicing a read (which would only happen on a
> degraded array - normally reads bypass the stripe cache), or a write lock
> when servicing a write or a resync.

Yes, an exclusive DLM lock.

> It should all interface with DLM quite well - when DLM tries to reclaim a lock
> we first mark all the stripe as not up-to-date...

When a dm-raid45(6) instance reacquires either lock *after* having had
to drop it, it has to invalidate the respective stripe data.

>
> Does DM simply use DLM for locking or something else?

We don't use the DLM from DM yet, but essentially: yes, you'd call
dlm_new_lockspace(), dlm_lock(..., DLM_LOCK_{CR|EX}, ...), ...

Of course such locking has to be abstracted in dm-raid456 so that
NULL or clustered locking modules can be plugged in.
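
One way such an abstraction might look, as a user-space sketch only: a
small table of lock operations, a helper that picks shared mode for
reads and exclusive mode for writes and resync as discussed above, and a
NULL module for the non-clustered case. All names (stripe_lock_ops,
null_lock_ops, stripe_do_io() and so on) are hypothetical. A clustered
module would presumably wrap the DLM calls mentioned above, which are
not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock modes and I/O kinds, for illustration only. */
enum stripe_lock_mode { STRIPE_LOCK_SHARED, STRIPE_LOCK_EXCLUSIVE };
enum stripe_io { STRIPE_IO_READ, STRIPE_IO_WRITE, STRIPE_IO_RESYNC };

/* Pluggable locking module: NULL (local) or clustered. */
struct stripe_lock_ops {
    const char *name;
    int (*lock)(void *ctx, uint64_t stripe, enum stripe_lock_mode mode);
    void (*unlock)(void *ctx, uint64_t stripe);
};

/* Reads take a shared lock; writes and resync take an exclusive one. */
enum stripe_lock_mode stripe_lock_mode_for(enum stripe_io io)
{
    return io == STRIPE_IO_READ ? STRIPE_LOCK_SHARED
                                : STRIPE_LOCK_EXCLUSIVE;
}

/* NULL module: a single node needs no cluster lock, so always succeed. */
static int null_lock(void *ctx, uint64_t stripe, enum stripe_lock_mode mode)
{
    (void)ctx; (void)stripe; (void)mode;
    return 0;
}

static void null_unlock(void *ctx, uint64_t stripe)
{
    (void)ctx; (void)stripe;
}

const struct stripe_lock_ops null_lock_ops = {
    .name = "null",
    .lock = null_lock,
    .unlock = null_unlock,
};

/* Bracket the processing of one stripe with the module's lock calls. */
int stripe_do_io(const struct stripe_lock_ops *ops, void *ctx,
                 uint64_t stripe, enum stripe_io io)
{
    int r;

    assert(ops && ops->lock && ops->unlock);
    r = ops->lock(ctx, stripe, stripe_lock_mode_for(io));
    if (r)
        return r;
    /* ... stripe processing would happen here ... */
    ops->unlock(ctx, stripe);
    return 0;
}
```

The target would then only ever call through the ops table, so swapping
null_lock_ops for a clustered implementation would not touch the I/O
path.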

Cheers,
Heinz

>
>
> > Will try testing your code tomorrow.
>
> Thanks,
>
> NeilBrown
 
