Old 03-14-2012, 06:04 PM
chetan loke
 
Default Bcache

On Wed, Mar 14, 2012 at 2:41 PM, Kent Overstreet <koverstreet@google.com> wrote:
> If you want me to implement bcache differently, shouldn't you explain why

Relax. I explained it already, but you are defensive about your code.
flash-cache works, period. And I may be wrong, but it is GPL'd. If perf
is an issue then at least let everyone know how it can be improved
rather than saying my way or the highway. Aren't you saying in your
patches that you support thin provisioning etc.? But if dm provides it,
then why are you duplicating code?

> why? I'm not sure why I _have_ to justify my decisions to you.

Others might want to contribute to it and not just consume it. This
ain't your local sandbox, so it's quite common to get such questions
when you are trying to add new functionality. Maybe I missed some of
your emails; if so, point me to them.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-14-2012, 06:22 PM
chetan loke
 
Default Bcache

On Wed, Mar 14, 2012 at 2:54 PM, Ted Ts'o <tytso@mit.edu> wrote:
> On Wed, Mar 14, 2012 at 02:33:25PM -0400, chetan loke wrote:
>> But you are not explaining why dm is not the right stack. Just because
>> it crashed when you tried doesn't mean it's not the right place.
>> flash-cache works, doesn't it? flash-cache's limitation is because
>> it's a dm-target or because it is using hashing or something else?
>> There are start-ups who are doing quite great with SSD-cache+dm. So
>> please stop kidding yourself.
>
> SATA-attached flash is not the only kind of flash out there you know.
> There is also PCIe-attached flash which is a wee bit faster (where wee
> is defined as multiple orders of magnitude --- SATA-attached SSD's
> typically have thousands of IOPS; Fusion I/O is shipping product today
> with hundreds of thousands of IOPS, and has demonstrated a billion
> IOPS early this year). And Fusion I/O isn't the only company shipping
> PCIe-attached flash products.
>

We've designed Linux targets with a million IOPS even before PCIe flash
came into the picture, you know. So I think we do know a thing or two
about a million IOPS and performance. When I said 'cache' I used it
loosely. The backing store can be anything - an SSD, PCIe flash, or an
adjacent blade over IB.


> Startups may be doing great on SSD's; you may want to accept the fact
> that there is stuff which is way, way, way better out there than
> SSD's which are available on the market *today*.
>
> And it's not like bcache is a new project. It's working code,
> just like flash cache is today. So it's not like it needs to justify
> its existence.
>

We are talking about approaches, not existence.

> Best regards,
>
>                                        - Ted

BR,
Chetan

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-14-2012, 09:01 PM
Mike Snitzer
 
Default Bcache

On Wed, Mar 14 2012 at 1:24pm -0400,
Kent Overstreet <koverstreet@google.com> wrote:

> On Wed, Mar 14, 2012 at 11:53 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, Mar 14, 2012 at 09:32:28AM -0400, Kent Overstreet wrote:
> >> I'm already registered to attend, but would it be too late in the
> >> process to give a talk? I'd like to give a short talk about bcache, what
> >> it does and where it's going (more than just caching).
> >
> > [CCing dm-devel list]
> >
> > I am curious if you considered writing a device mapper driver for this? If
> > yes, why that is not a good choice. It seems to be stacked device and device
> > mapper should be good at that. All the configuration through sysfs seems
> > little odd to me.
>
> Everyone asks this. Yeah, I considered it, I tried to make it work for
> a couple weeks but it was far more trouble than it was worth. I'm not
> opposed to someone else working on it but I'm not going to spend any
> more time on it myself.

I really wish you'd worked with dm-devel more persistently; you did
post twice to dm-devel (at an awkward time of year, but whatever):
http://www.redhat.com/archives/dm-devel/2010-December/msg00204.html
http://www.redhat.com/archives/dm-devel/2010-December/msg00232.html

But somewhere along the way you privately gave up on DM... and have
since repeatedly talked critically of DM. Yet you have _never_
substantiated _why_ DM is "far more trouble than it was worth", etc.

Reading between the lines of previous LKML bcache threads where the
question of "why not use DM or MD?" came up:
https://lkml.org/lkml/2011/9/11/117
https://lkml.org/lkml/2011/9/15/376

It seemed your primary focus was on getting into the details of the SSD
caching ASAP -- because that is what interested you. Both DM and MD
have a learning curve; maybe it was too frustrating and/or
distracting to tackle.

Anyway, I don't fault you for initially doing your own thing for a
virtual device framework -- it allowed you to get to the stuff you
really cared about sooner.

That said, it is frustrating that you are content to continue doing your
own thing because I'm now tasked with implementing a DM target for
caching/HSM, as I touched on here:
http://www.redhat.com/archives/linux-lvm/2012-March/msg00007.html

I have little upfront incentive to make use of bcache because it doesn't
use DM. Not to mention DM already has its own b-tree implementation
(granted bcache is much more than its b+tree). I obviously won't
ignore bcache (or flashcache) but I'm setting out to build on DM
infrastructure as effectively as possible.

My initial take on how to factor things is to split them into two DM targets:
"hsm-cache" and "hsm". These targets reuse the infrastructure that was
recently introduced for dm-thinp: drivers/md/persistent-data/ and
dm-bufio.

Like the "thin-pool" target, the "hsm-cache" target provides a central
resource (cache) that "hsm" target device(s) will attach to. The
"hsm-cache" target, like thin-pool, will have a data and metadata
device, constructor:
hsm-cache <metadata dev> <data dev> <data block size (sectors)>

The "hsm" target will pair an hsm-cache device with a backing device,
constructor:
hsm <dev_id> <cache_dev> <backing_dev>

The same hsm-cache device may be used by multiple hsm devices, so this
is the same high-level architecture as bcache (a shared SSD cache).
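
For reference, the constructor lines above map onto the standard device-mapper
target skeleton. The following is only a hypothetical sketch against the
~3.2-era DM target API -- not Mike's actual code, which did not exist yet; the
names, version numbers, and elided logic are illustrative:

/*
 * Hypothetical skeleton for the proposed "hsm" target. Only the DM glue is
 * shown; the real cache lookup/remap logic is elided.
 */
#include <linux/module.h>
#include <linux/device-mapper.h>

static int hsm_ctr(struct dm_target *ti, unsigned argc, char **argv)
{
	/* table line: hsm <dev_id> <cache_dev> <backing_dev> */
	if (argc != 3) {
		ti->error = "hsm: requires <dev_id> <cache_dev> <backing_dev>";
		return -EINVAL;
	}
	/* a real target would dm_get_device() the cache and backing
	 * devices here and stash them in ti->private */
	return 0;
}

static int hsm_map(struct dm_target *ti, struct bio *bio,
		   union map_info *map_context)
{
	/* a real target would consult the shared hsm-cache metadata and
	 * redirect bio->bi_bdev / bio->bi_sector to the cache or the
	 * backing device before returning */
	return DM_MAPIO_REMAPPED;
}

static struct target_type hsm_target = {
	.name    = "hsm",
	.version = {0, 0, 1},
	.module  = THIS_MODULE,
	.ctr     = hsm_ctr,
	.map     = hsm_map,
};

static int __init dm_hsm_init(void)
{
	return dm_register_target(&hsm_target);
}

static void __exit dm_hsm_exit(void)
{
	dm_unregister_target(&hsm_target);
}

module_init(dm_hsm_init);
module_exit(dm_hsm_exit);
MODULE_LICENSE("GPL");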

Where things get interesting is the mechanics of the caching and the
metadata. I'm coming to terms with the metadata now (based on desired
features and cache replacement policies); once it is nailed down I
expect things to fall into place pretty quickly.

I'm very early in the design but hope to have an initial functional
version of the code together in time for LSF -- ~2 weeks may be too
ambitious but it's my goal (could be more doable if I confine the
initial code to writethrough with LRU).

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-14-2012, 09:09 PM
"Williams, Dan J"
 
Default Bcache

On Wed, Mar 14, 2012 at 3:01 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> I'm very early in the design but hope to have an initial functional
> version of the code together in time for LSF -- ~2 weeks may be too
> ambitious but it's my goal (could be more doable if I confine the
> initial code to writethrough with LRU).

I'm hoping caching ends up being as successful as the raid456
unification, where we can have a dm or md interface in front of some
common infrastructure. The inertia for md is to keep it close to all
the recent software raid advancements; the inertia for dm is also
clear; the inertia for something brand new... not very clear.

--
Dan

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-15-2012, 04:01 PM
Kent Overstreet
 
Default Bcache

On Wed, Mar 14, 2012 at 03:04:52PM -0400, chetan loke wrote:
> On Wed, Mar 14, 2012 at 2:41 PM, Kent Overstreet <koverstreet@google.com> wrote:
> > If you want me to implement bcache differently, shouldn't you explain why
>
> Relax. I explained it already, but you are defensive about your code.
> flash-cache works, period. And I may be wrong, but it is GPL'd. If perf
> is an issue then at least let everyone know how it can be improved
> rather than saying my way or the highway. Aren't you saying in your
> patches that you support thin provisioning etc.? But if dm provides it,
> then why are you duplicating code?

I'm not defensive about my code; you asked why someone would be
interested in bcache vs. flash cache, and performance is the most
obvious reason. Seems kind of ridiculous to then accuse me of being
defensive.

If you want to know how the performance of flash cache can be improved,
bcache's design is documented and the code is available. I'm not
interested in flash cache and improving it isn't my job; furthermore
bcache's performance comes from fundamental design decisions so I don't
think flash cache is ever going to approach bcache's performance.

> > why? I'm not sure why I _have_ to justify my decisions to you.
>
> Others might want to contribute to it and not just consume it. This
> ain't your local sandbox, so it's quite common to get such questions
> when you are trying to add new functionality. Maybe I missed some of
> your emails; if so, point me to them.

Helping others get involved is rather different - I'm perfectly happy
to help anyone who's interested, and I've spent quite a lot of time
documenting and explaining the code, and helping users out.

But I'm just not interested in justifying bcache's existence vs.
flashcache. If you like flashcache better, it's no skin off my back.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-15-2012, 04:02 PM
Kent Overstreet
 
Default Bcache

On Wed, Mar 14, 2012 at 02:54:56PM -0400, Ted Ts'o wrote:
> On Wed, Mar 14, 2012 at 02:33:25PM -0400, chetan loke wrote:
> > But you are not explaining why dm is not the right stack. Just because
> > it crashed when you tried doesn't mean it's not the right place.
> > flash-cache works, doesn't it? flash-cache's limitation is because
> > it's a dm-target or because it is using hashing or something else?
> > There are start-ups who are doing quite great with SSD-cache+dm. So
> > please stop kidding yourself.
>
> SATA-attached flash is not the only kind of flash out there you know.
> There is also PCIe-attached flash which is a wee bit faster (where wee
> is defined as multiple orders of magnitude --- SATA-attached SSD's
> typically have thousands of IOPS; Fusion I/O is shipping product today
> with hundreds of thousands of IOPS, and has demonstrated a billion
> IOPS early this year). And Fusion I/O isn't the only company shipping
> PCIe-attached flash products.
>
> Startups may be doing great on SSD's; you may want to accept the fact
> that there is stuff which is way, way, way better out there than
> SSD's which are available on the market *today*.
>
> And it's not like bcache is a new project. It's working code,
> just like flash cache is today. So it's not like it needs to justify
> its existence.
>
> Best regards,
>
> - Ted

Thanks Ted, as usual you word things rather less abrasively than I do.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-15-2012, 04:27 PM
Kent Overstreet
 
Default Bcache

On Wed, Mar 14, 2012 at 06:01:50PM -0400, Mike Snitzer wrote:
> On Wed, Mar 14 2012 at 1:24pm -0400,
> Kent Overstreet <koverstreet@google.com> wrote:
>
> > On Wed, Mar 14, 2012 at 11:53 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > On Wed, Mar 14, 2012 at 09:32:28AM -0400, Kent Overstreet wrote:
> > >> I'm already registered to attend, but would it be too late in the
> > >> process to give a talk? I'd like to give a short talk about bcache, what
> > >> it does and where it's going (more than just caching).
> > >
> > > [CCing dm-devel list]
> > >
> > > I am curious if you considered writing a device mapper driver for this? If
> > > yes, why that is not a good choice. It seems to be stacked device and device
> > > mapper should be good at that. All the configuration through sysfs seems
> > > little odd to me.
> >
> > Everyone asks this. Yeah, I considered it, I tried to make it work for
> > a couple weeks but it was far more trouble than it was worth. I'm not
> > opposed to someone else working on it but I'm not going to spend any
> > more time on it myself.
>
> I really wish you'd have worked with dm-devel more persistently, you did
> post twice to dm-devel (at an awkward time of year but whatever):
> http://www.redhat.com/archives/dm-devel/2010-December/msg00204.html
> http://www.redhat.com/archives/dm-devel/2010-December/msg00232.html

I spent quite a bit of time talking to Heinz Mauelshagen and someone
else whose name escapes me; I also spent around two weeks working on
bcache-dm code before I decided it was unworkable.

And bcache is two years old now; if the dm guys wanted bcache to use dm
there's been ample opportunity, and nobody's been interested enough to
do anything about it. I'm still not against a bcache-dm interface, if
someone else can make it work - I just really have no interest or reason
to write the code myself. It works fine as it is.

> But somewhere along the way you privately gave up on DM... and have
> since repeatedly talked critically of DM. Yet you have _never_
> substantiated _why_ DM is "far more trouble than it was worth", etc.

I have; can't blame you for missing it, but honestly this comes up
constantly: people asking me (often accusingly) why bcache doesn't use
dm, and it gets really old. I've got better things to do.

Frankly, my biggest complaint with the DM is that the code is _terrible_
and very poorly documented. It's an inflexible framework that tries to
combine a bunch of things that should be orthogonal. My other complaints
all stem from that; it became very clear that it wasn't designed for
creating a block device from the kernel, which is kind of necessary (at
least the only sane way of doing it, IMO) when metadata is managed by
the kernel (and the kernel has to manage most metadata for bcache).
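
To make the "creating a block device from the kernel" point concrete: a driver
like bcache registers its own gendisk from kernel code once its metadata has
been read, with no userspace-driven table load in the middle. This is only a
hedged sketch of the ~3.2-era block-layer calls involved, not actual bcache
code; all names are illustrative and error handling is trimmed:

#include <linux/module.h>
#include <linux/fs.h>
#include <linux/blkdev.h>
#include <linux/genhd.h>

static const struct block_device_operations example_ops = {
	.owner = THIS_MODULE,
};

/* a caching driver would remap/split the bio against its metadata here
 * and resubmit it to the cache or backing device */
static void example_make_request(struct request_queue *q, struct bio *bio)
{
	bio_endio(bio, 0);	/* placeholder: complete immediately */
}

static int example_register_bdev(void)
{
	struct request_queue *q;
	struct gendisk *d;
	int major;

	major = register_blkdev(0, "example");
	if (major < 0)
		return major;

	q = blk_alloc_queue(GFP_KERNEL);
	d = alloc_disk(1);
	if (!q || !d)
		return -ENOMEM;	/* error handling trimmed for brevity */

	blk_queue_make_request(q, example_make_request);
	d->queue = q;
	d->major = major;
	d->first_minor = 0;
	d->fops = &example_ops;
	sprintf(d->disk_name, "example0");
	set_capacity(d, 1 << 21);	/* 1 GiB in 512-byte sectors */

	/* no dm table load, no userspace step: the device appears as soon
	 * as the driver decides its metadata is valid */
	add_disk(d);
	return 0;
}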

> Reading between the lines on a previous LKML bcache threads where the
> questions of "why not use DM or MD?" came up:
> https://lkml.org/lkml/2011/9/11/117
> https://lkml.org/lkml/2011/9/15/376
>
> It seemed your primary focus was on getting into the details of the SSD
> caching ASAP -- because that is what interested you. Both DM and MD
> have a learning curve, maybe it was too frustrating and/or
> distracting to tackle.
>
> Anyway, I don't fault you for initially doing your own thing for a
> virtual device framework -- it allowed you to get to the stuff you
> really cared about sooner.
>
> That said, it is frustrating that you are content to continue doing your
> own thing because I'm now tasked with implementing a DM target for
> caching/HSM, as I touched on here:
> http://www.redhat.com/archives/linux-lvm/2012-March/msg00007.html

Kind of presumptuous, don't you think?

I've nothing at all against collaborating, or you or other dm devs
adapting bcache code - I'd help out with that!

But I'm just not going to write my code a certain way just to suit you.

> I have little upfront incentive to make use of bcache because it doesn't
> use DM. Not to mention DM already has its own b-tree implementation
> (granted bcache is much more than it's b+tree). I obviously won't
> ignore bcache (or flashcache) but I'm setting out to build on DM
> infrastructure as effectively as possible.

Oh, darn.

> My initial take on how to factor things is to split into 2 DM targets:
> "hsm-cache" and "hsm". These targets reuse the infrastructure that was
> recently introduced for dm-thinp: drivers/md/persistent-data/ and
> dm-bufio.
>
> Like the "thin-pool" target, the "hsm-cache" target provides a central
> resource (cache) that "hsm" target device(s) will attach to. The
> "hsm-cache" target, like thin-pool, will have a data and metadata
> device, constructor:
> hsm-cache <metadata dev> <data dev> <data block size (sectors)>
>
> The "hsm" target will pair an hsm-cache device with a backing device,
> constructor:
> hsm <dev_id> <cache_dev> <backing_dev>
>
> The same hsm-cache device may be used by multiple hsm devices. So I
> mean this is the same high-level architecture as bcache (shared SSD
> cache).
>
> Where things get interesting is the mechanics of the caching and the
> metadata. I'm coming to terms with the metadata now (based on desired
> features and cache replacement policies), once it is nailed down I
> expect things to fall into place pretty quickly.
>
> I'm very early in the design but hope to have an initial functional
> version of the code together in time for LSF -- ~2 weeks may be too
> ambitious but it's my goal (could be more doable if I confine the
> initial code to writethrough with LRU).

Look forward to seeing the benchmarks.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-15-2012, 06:43 PM
Vivek Goyal
 
Default Bcache

On Wed, Mar 14, 2012 at 01:24:08PM -0400, Kent Overstreet wrote:

[..]
>
> Can you post the full log? There was a bug where if it encountered an
> error during registration, it wouldn't wait for a uuid read or write
> before tearing everything down - that's what your backtrace looks like
> to me.
>
> You could try the bcache-3.2-dev branch, too. I have a newer branch
> with a ton of bugfixes but I'm waiting until it's seen more testing
> before I post it.

Faced the same issue on bcache-3.2-dev branch too.

login: [ 167.532932] bio: create slab <bio-1> at 1
[ 167.539071] bcache: invalidating existing data
[ 167.547604] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 167.548573] CPU 2
[ 167.548573] Modules linked in: floppy [last unloaded: scsi_wait_scan]
[ 167.548573]
[ 167.548573] Pid: 0, comm: swapper/2 Not tainted 3.2.0-bcache+ #4 Hewlett-Packard HP xw6600 Workstation/0A9Ch
[ 167.548573] RIP: 0010:[<ffffffff8144d6fe>] [<ffffffff8144d6fe>] closure_put+0xe/0x20
[ 167.548573] RSP: 0018:ffff88013fc83c60 EFLAGS: 00010246
[ 167.548573] RAX: 0000000000000000 RBX: ffff8801385b04a0 RCX: 0000000000000000
[ 167.548573] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 6b6b6b6b6b6b6b6b
[ 167.548573] RBP: ffff88013fc83c60 R08: 0000000000000000 R09: 0000000000000001
[ 167.548573] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 167.548573] R13: ffff880137719580 R14: 0000000000080000 R15: 0000000000000000
[ 167.548573] FS: 0000000000000000(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
[ 167.548573] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 167.548573] CR2: 00007f6e84f70240 CR3: 000000013707d000 CR4: 00000000000006e0
[ 167.548573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 167.548573] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 167.548573] Process swapper/2 (pid: 0, threadinfo ffff88013a454000, task ffff88013a458000)
[ 167.548573] Stack:
[ 167.548573] ffff88013fc83c80 ffffffff814448c6 ffffffff00000000 ffff8801385b04a0
[ 167.548573] ffff88013fc83c90 ffffffff8117ae8d ffff88013fc83cc0 ffffffff812e2273
[ 167.548573] ffff88013a454000 0000000000000000 ffff8801385b04a0 0000000000080000
[ 167.548573] Call Trace:
[ 167.548573] <IRQ>
[ 167.548573] [<ffffffff814448c6>] uuid_endio+0x36/0x40
[ 167.548573] [<ffffffff8117ae8d>] bio_endio+0x1d/0x40
[ 167.548573] [<ffffffff812e2273>] req_bio_endio+0x83/0xc0
[ 167.548573] [<ffffffff812e53e1>] blk_update_request+0x101/0x5c0
[ 167.548573] [<ffffffff812e5612>] ? blk_update_request+0x332/0x5c0
[ 167.548573] [<ffffffff812e58d1>] blk_update_bidi_request+0x31/0x90
[ 167.548573] [<ffffffff812e595c>] blk_end_bidi_request+0x2c/0x80
[ 167.548573] [<ffffffff812e59f0>] blk_end_request+0x10/0x20
[ 167.548573] [<ffffffff81458fdc>] scsi_io_completion+0x9c/0x5f0
[ 167.548573] [<ffffffff8144fcd0>] scsi_finish_command+0xb0/0xe0
[ 167.548573] [<ffffffff81458dc5>] scsi_softirq_done+0xa5/0x140
[ 167.548573] [<ffffffff812ec55b>] blk_done_softirq+0x7b/0x90
[ 167.548573] [<ffffffff810512ae>] __do_softirq+0xce/0x3c0
[ 167.548573] [<ffffffff817e84ac>] call_softirq+0x1c/0x30
[ 167.548573] [<ffffffff8100417d>] do_softirq+0x8d/0xc0
[ 167.548573] [<ffffffff810518de>] irq_exit+0xae/0xe0
[ 167.548573] [<ffffffff817e8bb3>] do_IRQ+0x63/0xe0
[ 167.548573] [<ffffffff817de1f0>] common_interrupt+0x70/0x70
[ 167.548573] <EOI>
[ 167.548573] [<ffffffff8100a5f6>] ? mwait_idle+0xb6/0x490
[ 167.548573] [<ffffffff8100a5ed>] ? mwait_idle+0xad/0x490
[ 167.548573] [<ffffffff810011e6>] cpu_idle+0x96/0xe0
[ 167.548573] [<ffffffff817cb475>] start_secondary+0x1be/0x1c2
[ 167.548573] Code: ee 01 00 00 10 e8 03 ff ff ff 48 85 db 75 de 5b 41 5c 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 66 66 66 66 90 be ff ff ff ff <f0> 0f c1 77 48 83 ee 01 e8 d5 fe ff ff 5d c3 0f 1f 00 55 48 89
[ 167.548573] RIP [<ffffffff8144d6fe>] closure_put+0xe/0x20
[ 167.548573] RSP <ffff88013fc83c60>
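
For context on the trace: RDI holds 0x6b6b6b6b6b6b6b6b, which is slab poison,
so closure_put() ran against already-freed memory -- consistent with the
registration-error race Kent describes above. A hedged illustration of that
pattern (not actual bcache code, names made up):

/* Illustration only. The uuid endio hook fires after an error path has
 * already torn the registration state down, so the closure it references
 * has been freed (hence the 0x6b slab poison in RDI above). */
static void uuid_endio_example(struct bio *bio, int error)
{
	struct closure *cl = bio->bi_private;

	closure_put(cl);	/* use-after-free if teardown already freed cl */
}

/* buggy teardown pattern: free the registration state without waiting for
 * in-flight uuid IO; the fix is to hold a closure reference across the IO
 * and wait for it to complete before freeing anything */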

Thanks
Vivek

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-15-2012, 07:17 PM
Mike Snitzer
 
Default Bcache

On Thu, Mar 15 2012 at 1:27pm -0400,
Kent Overstreet <koverstreet@google.com> wrote:

> On Wed, Mar 14, 2012 at 06:01:50PM -0400, Mike Snitzer wrote:
> > I really wish you'd have worked with dm-devel more persistently, you did
> > post twice to dm-devel (at an awkward time of year but whatever):
> > http://www.redhat.com/archives/dm-devel/2010-December/msg00204.html
> > http://www.redhat.com/archives/dm-devel/2010-December/msg00232.html
>
> I spent quite a bit of time talking to Heinz Mauelshagen and someone
> else who's name escapes me; I also spent around two weeks working on
> bcache-dm code before I decided it was unworkable.
>
> And bcache is two years old now, if the dm guys wanted bcache to use dm
> there's been ample opportunity; nobody's been interested enough to do
> anything about it. I'm still not against a bcache-dm interface, if
> someone else can make it work - I just really have no interest or reason
> to write the code myself. It works fine as it is.

Your interest should be in getting the hard work you've put into bcache
upstream. That's unlikely to happen until you soften your reluctance
to embrace existing, appropriate kernel interfaces.

> Frankly, my biggest complaint with the DM is that the code is _terrible_
> and very poorly documented. It's an inflexible framework that tries to
> combine a bunch of things that should be orthogonal. My other complaints
> all stem from that; it became very clear that it wasn't designed for
> creating a block device from the kernel, which is kind of necessary (at
> least the only sane way of doing it, IMO) when metadata is managed by
> the kernel (and the kernel has to manage most metadata for bcache).

Baseless and unspecific assertions don't help your cause -- dm-thinp
disproves your unconvincing position (it manages its metadata in kernel,
etc).

Seems pretty clear you couldn't care less about _really_ working together
-- maybe it is just this DM/kernel interface thing that gets you down.

Regardless, the burden is on me (and all developers who have a desire to
see a caching/HSM driver get upstream) to evaluate bcache. That process
has started -- hopefully it'll be as simple as:

1) put a DM target wrapper in place of your sysfs interface.
2) switch/port bcache's btree over to drivers/md/persistent-data/
3) dm-bcache FTW

One could dream.

From the little I've looked at bcache, it already seems unrealistic; for
starters, you have the btree wired directly to bio submission.
drivers/md/persistent-data/ offers a layered approach:
dm-block-manager.c brokers the IO submission (via dm-bufio) so the
management of the btree(s) doesn't need to be concerned with actual IO.

bcache is _very_ tightly coupled with your btree implementation.
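
The layering being referred to: with drivers/md/persistent-data/, btree and
space-map code asks the block manager for metadata blocks, and the block
manager does the actual IO through dm-bufio, so nothing above it ever touches
a bio. A rough sketch of what a metadata read looks like at that layer, based
on the ~3.2-era headers (treat the exact signatures as approximate; this is
not code from dm-thinp):

#include "dm-block-manager.h"	/* drivers/md/persistent-data/ */

static int read_metadata_block(struct dm_block_manager *bm, dm_block_t nr)
{
	struct dm_block *b;
	int r;

	/* the block manager reads the block via dm-bufio and hands back a
	 * locked, optionally validated buffer; no bios at this level */
	r = dm_bm_read_lock(bm, nr, NULL, &b);
	if (r)
		return r;

	/* parse metadata fields out of dm_block_data(b) here */

	dm_bm_unlock(b);
	return 0;
}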

> > Reading between the lines on a previous LKML bcache threads where the
> > questions of "why not use DM or MD?" came up:
> > https://lkml.org/lkml/2011/9/11/117
> > https://lkml.org/lkml/2011/9/15/376
> >
> > It seemed your primary focus was on getting into the details of the SSD
> > caching ASAP -- because that is what interested you. Both DM and MD
> > have a learning curve, maybe it was too frustrating and/or
> > distracting to tackle.
> >
> > Anyway, I don't fault you for initially doing your own thing for a
> > virtual device framework -- it allowed you to get to the stuff you
> > really cared about sooner.
> >
> > That said, it is frustrating that you are content to continue doing your
> > own thing because I'm now tasked with implementing a DM target for
> > caching/HSM, as I touched on here:
> > http://www.redhat.com/archives/linux-lvm/2012-March/msg00007.html
>
> Kind of presumptuous, don't you think?

Not really, considering what I'm responding to at the moment

> I've nothing at all against collaborating, or you or other dm devs
> adapting bcache code - I'd help out with that!

OK.

> But I'm just not going to write my code a certain way just to suit you.

upstream kumbaya: more cooperative eyes on the problem, working to hook
into established interfaces, will produce a solution that is worthy of
upstream inclusion.

> Look forward to seeing the benchmarks.

Speaking of which, weren't you saying you'd show bcache benchmarks in a
previous LKML thread?

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 03-15-2012, 09:59 PM
Kent Overstreet
 
Default Bcache

On Thu, Mar 15, 2012 at 04:17:32PM -0400, Mike Snitzer wrote:
> Your interest should be in getting the hard work you've put into bcache
> upstream. That's unlikely to happen until you soften on your reluctance
> to embrace existing appropriate kernel interfaces.

I don't really care what you think my priorities should be. I write code
first and foremost for myself, and the one thing I care about is good
code.

I'd love to have bcache in mainline, seeing more use and getting more
improvements - but if that's contingent on making it work through dm,
sorry, not interested.

If you want to convince me that dm is the right way to go you'll have
much better luck with technical arguments.

Besides which, I'm planning on (and very soon going to be working on)
growing bcache down into an FTL and up into the bottom half of a
filesystem. As far as I can tell integrating with dm would only get in
the way of that.

It's actually not as crazy as it sounds - the basic idea is to make the
index the central abstraction, and allocation policies sit conceptually
underneath and are abstracted out - and sitting on top, some filesystem
code (and possibly other things) uses the existing code as if it were
some kind of object-storage-like thing; the existing bcache code maps
inode number:offset -> lba instead of cached device:offset.

I'll explain more at LSF, but eventually it ought to look vaguely like
btrfs/zfs but with better abstraction and better performance.
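
A purely illustrative way to picture the key-space change being described:
today the index maps a (cached device, offset) pair to a location in the
cache; reused as the bottom half of a filesystem, the same b+tree would map
(inode number, offset) instead. These structs are made up for illustration --
bcache's real keys are packed bkeys, not this:

#include <linux/types.h>

struct cache_index_key {	/* bcache today: block cache */
	u64 cached_dev;		/* which backing device */
	u64 dev_offset;		/* sector offset within that device */
};

struct fs_index_key {		/* the proposed reuse: filesystem extents */
	u64 inode;		/* inode number takes the device's place */
	u64 file_offset;	/* offset within the file */
};

/* in both cases the value returned by the index is an lba (plus length)
 * in the cache/allocation layer underneath */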

> > Frankly, my biggest complaint with the DM is that the code is _terrible_
> > and very poorly documented. It's an inflexible framework that tries to
> > combine a bunch of things that should be orthogonal. My other complaints
> > all stem from that; it became very clear that it wasn't designed for
> > creating a block device from the kernel, which is kind of necessary (at
> > least the only sane way of doing it, IMO) when metadata is managed by
> > the kernel (and the kernel has to manage most metadata for bcache).
>
> Baseless and unspecific assertions don't help your cause -- dm-thinp
> disproves your unconvincing position (manages it's metadata in kernel,
> etc).

I'm not the only one who's read the dm code and found it lacking - and
anyway, I'm not really out to convince anyone.

> Seems pretty clear you could care less about _really_ working together
> -- maybe it is just this DM/kernel interface thing gets you down.

Dude, I reached out to dm developers ages ago. Maybe if you guys had
shown some interest we wouldn't be having this conversation now.

This finger pointing is ridiculous and getting us nowhere.

> Regardless, the burden is on me (and all developers who have a desire to
> see a caching/HSM driver get upstream) to evaluate bcache. That process
> has started -- hopefully it'll be as simple as:
>
> 1) put a DM target wrapper in place of your sysfs interface.
> 2) switch/port bcache's btree over to drivers/md/persistent-data/
> 3) dm-bcache FTW

Replacing bcache's persistent metadata code? Hah. That's the central
part of the design!

Is this the way new filesystems are evaluated? No, it's not. What makes
you more special than ext4?

> One could dream.
>
> The little bit I've looked at bcache it already seems unrealistic; for
> starters you have the btree wired directly to bio submission.
> drivers/md/persistent-data/ offers a layered approach,
> dm-block-manager.c brokers the IO submission (via dm-bufio) so the
> management of the btree(s) doesn't need to be concerned with actual IO.
>
> bcache is _very_ tightly coupled with your btree implementation.

Yes, it is! It really has to be: efficiently allocating buckets and
invalidating cached data relies on specific details of the btree
implementation.

The btree is _central_ to bcache; ignoring it, the rest of the code
isn't all that interesting.

> > > That said, it is frustrating that you are content to continue doing your
> > > own thing because I'm now tasked with implementing a DM target for
> > > caching/HSM, as I touched on here:
> > > http://www.redhat.com/archives/linux-lvm/2012-March/msg00007.html
> >
> > Kind of presumptuous, don't you think?
>
> Not really, considering what I'm responding to at the moment

Maybe you should consider how you word things...

> > I've nothing at all against collaborating, or you or other dm devs
> > adapting bcache code - I'd help out with that!
>
> OK.
>
> > But I'm just not going to write my code a certain way just to suit you.
>
> upstream kumbaya: more cooperative eyes on the problem, working to hook
> into established interfaces, will produce a solution that is worthy of
> upstream inclusion.

Let me be clear: All I care about is the best solution. I'm more than
happy to work with other people to achieve that, but I don't give a damn
about anything else.

> > Look forward to seeing the benchmarks.
>
> Speaking of which, weren't you saying you'd show bcache benchmarks in a
> previous LKML thread?

Yeah I did, but as usual I got distracted. I'm travelling for the next
three weeks, but maybe I can get someone else to get some numbers that
we can publish...

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
