08-31-2012, 01:43 AM
Kent Overstreet

block: Avoid deadlocks with bio allocation by stacking drivers

On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
>
> [..]
> > > Performance aside, punting submission to a per-device worker in case of deep
> > > stack usage sounds like a cleaner solution to me.
> >
> > Agreed, but performance tends to matter in the real world. And either
> > way the tricky bits are going to be confined to a few functions, so I
> > don't think it matters that much.
> >
> > If someone wants to code up the workqueue version and test it, they're
> > more than welcome...
>
> Here is one quick and dirty proof-of-concept patch. It checks the stack
> depth and, if the remaining space is less than 20% of the stack size,
> defers the bio submission to a per-queue worker.
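
Vivek's patch itself is not reproduced in this thread; the following is only a
rough sketch of the approach described above. The stack-usage check is purely
illustrative, and blk_queue_defer_bio() is a hypothetical stand-in for the
per-queue worker hand-off.

/*
 * Illustrative sketch only, not the actual patch: defer bio submission
 * to a per-queue worker once less than 20% of the stack remains free.
 */
#include <linux/sched.h>
#include <linux/blkdev.h>
#include <linux/bio.h>

/* hypothetical helper: queue the bio and kick a per-queue work item */
void blk_queue_defer_bio(struct request_queue *q, struct bio *bio);

static bool stack_nearly_exhausted(void)
{
	/* rough estimate: distance from a stack local down to the lowest
	 * usable stack address (kernel stacks grow downward here) */
	unsigned long free = (unsigned long)&free -
			     (unsigned long)end_of_stack(current);

	return free < THREAD_SIZE / 5;
}

static void submit_bio_checked(struct request_queue *q, struct bio *bio)
{
	if (stack_nearly_exhausted())
		blk_queue_defer_bio(q, bio);	/* punt to per-queue worker */
	else
		generic_make_request(bio);
}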

I can't think of any correctness issues. I see some stuff that could be
simplified (blk_drain_deferred_bios() is redundant, just make it a
wrapper around blk_deffered_bio_work()).

Still skeptical about the performance impact, though - frankly, on some
of the hardware I've been running bcache on this would be a visible
performance regression - probably double-digit percentages, but I'd have
to benchmark it. That kind of hardware/usage is not normal today,
but I've put a lot of work into performance and I don't want to make
things worse without good reason.

Have you tested/benchmarked it?

There's scheduling behaviour, too. We really want the workqueue thread's
cpu time to be charged to the process that submitted the bio. (We could
use a mechanism like that in other places, too... not like this is a new
issue).

This is going to be a real issue for users that need strong isolation -
for any driver that uses non-negligible cpu (e.g. dm-crypt), we're
breaking that (not that it wasn't broken already, but this makes it
worse).

I could be convinced, but right now I prefer my solution.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
08-31-2012, 01:55 AM
Kent Overstreet

block: Avoid deadlocks with bio allocation by stacking drivers

On Thu, Aug 30, 2012 at 6:43 PM, Kent Overstreet <koverstreet@google.com> wrote:
> On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
>> On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
>>
>> [..]
>> > > Performance aside, punting submission to a per-device worker in case of deep
>> > > stack usage sounds like a cleaner solution to me.
>> >
>> > Agreed, but performance tends to matter in the real world. And either
>> > way the tricky bits are going to be confined to a few functions, so I
>> > don't think it matters that much.
>> >
>> > If someone wants to code up the workqueue version and test it, they're
>> > more than welcome...
>>
>> Here is one quick and dirty proof-of-concept patch. It checks the stack
>> depth and, if the remaining space is less than 20% of the stack size,
>> defers the bio submission to a per-queue worker.
>
> I can't think of any correctness issues. I see some stuff that could be
> simplified (blk_drain_deferred_bios() is redundant, just make it a
> wrapper around blk_deffered_bio_work()).
>
> Still skeptical about the performance impact, though - frankly, on some
> of the hardware I've been running bcache on this would be a visible
> performance regression - probably double-digit percentages, but I'd have
> to benchmark it. That kind of hardware/usage is not normal today,
> but I've put a lot of work into performance and I don't want to make
> things worse without good reason.

Here's another crazy idea - we don't really need another thread, just
more stack space.

We could check if we're running out of stack space, and if we are,
just allocate another two pages and memcpy the struct thread_info
over.

I think the main obstacle is that we'd need some per-arch code for
mucking with the stack pointer. And it'd break backtraces, but that's
fixable.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
08-31-2012, 03:01 PM
Vivek Goyal

block: Avoid deadlocks with bio allocation by stacking drivers

On Thu, Aug 30, 2012 at 06:43:59PM -0700, Kent Overstreet wrote:
> On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> > On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
> >
> > [..]
> > > > Performance aside, punting submission to a per-device worker in case of deep
> > > > stack usage sounds like a cleaner solution to me.
> > >
> > > Agreed, but performance tends to matter in the real world. And either
> > > way the tricky bits are going to be confined to a few functions, so I
> > > don't think it matters that much.
> > >
> > > If someone wants to code up the workqueue version and test it, they're
> > > more than welcome...
> >
> > Here is one quick and dirty proof-of-concept patch. It checks the stack
> > depth and, if the remaining space is less than 20% of the stack size,
> > defers the bio submission to a per-queue worker.
>
> I can't think of any correctness issues. I see some stuff that could be
> simplified (blk_drain_deferred_bios() is redundant, just make it a
> wrapper around blk_deffered_bio_work()).
>
> Still skeptical about the performance impact, though - frankly, on some
> of the hardware I've been running bcache on this would be a visible
> performance regression - probably double-digit percentages, but I'd have
> to benchmark it. That kind of hardware/usage is not normal today,
> but I've put a lot of work into performance and I don't want to make
> things worse without good reason.

Would you like to give this patch a quick try with bcache on your
hardware and see how much performance impact you observe?

Given that submission through the worker happens only when stack usage
is high, that should reduce the impact of the patch, and common use
cases should remain unaffected.

>
> Have you tested/benchmarked it?

No, I have not. I will run some simple workloads on SSD.

>
> There's scheduling behaviour, too. We really want the workqueue thread's
> cpu time to be charged to the process that submitted the bio. (We could
> use a mechanism like that in other places, too... not like this is a new
> issue).
>
> This is going to be a real issue for users that need strong isolation -
> for any driver that uses non-negligible cpu (e.g. dm-crypt), we're
> breaking that (not that it wasn't broken already, but this makes it
> worse).

There are so many places in the kernel where worker threads do work on
behalf of processes. I think this is really a minor concern and I would
not be too worried about it.

What really is concerning, though, is the greater stack usage due to
the recursive nature of make_request() and the performance impact of
deferral to a worker thread.

Thanks
Vivek

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-01-2012, 02:13 AM
Tejun Heo

block: Avoid deadlocks with bio allocation by stacking drivers

Hello, Vivek.

On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> Here is one quick and dirty proof-of-concept patch. It checks the stack
> depth and, if the remaining space is less than 20% of the stack size,
> defers the bio submission to a per-queue worker.

So, it removes breadth-first walking of bio construction, ensuring
stack overflow never happens by bouncing to a workqueue if stack usage
seems too high.

I do like the removal of breadth-first walking. It makes failure
scenarios a lot less mind-bending. That said, Kent is right that this
can incur significant overhead for certain configurations, and looking
at stack usage in the block layer is rather nasty both in design and
implementation.

If we're gonna need a rescuer anyway, and we can get it right and the
mechanism can be contained in block proper relatively well, I think it
would be better to make breadth-first walking safe. Both are nasty in
their own ways, after all.

Thanks.

--
tejun

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-03-2012, 12:49 AM
Dave Chinner

block: Avoid deadlocks with bio allocation by stacking drivers

On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
>
> [..]
> > > Performance aside, punting submission to a per-device worker in case of deep
> > > stack usage sounds like a cleaner solution to me.
> >
> > Agreed, but performance tends to matter in the real world. And either
> > way the tricky bits are going to be confined to a few functions, so I
> > don't think it matters that much.
> >
> > If someone wants to code up the workqueue version and test it, they're
> > more than welcome...
>
> Here is one quick and dirty proof-of-concept patch. It checks the stack
> depth and, if the remaining space is less than 20% of the stack size,
> defers the bio submission to a per-queue worker.

Given that we are working around stack depth issues in the
filesystems already in several places, and now it seems like there's
a reason to work around it in the block layers as well, shouldn't we
simply increase the default stack size rather than introduce
complexity and performance regressions to try and work around not
having enough stack?

I mean, we can deal with it like the ia32 4k stack issue was dealt
with (i.e. ignore those stupid XFS people, that's an XFS bug), or
we can face the reality that storage stacks have become so complex
that 8k is no longer a big enough stack for a modern system....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-03-2012, 01:17 AM
Kent Overstreet

block: Avoid deadlocks with bio allocation by stacking drivers

On Mon, Sep 03, 2012 at 10:49:27AM +1000, Dave Chinner wrote:
> Given that we are working around stack depth issues in the
> filesystems already in several places, and now it seems like there's
> a reason to work around it in the block layers as well, shouldn't we
> simply increase the default stack size rather than introduce
> complexity and performance regressions to try and work around not
> having enough stack?
>
> I mean, we can deal with it like the ia32 4k stack issue was dealt
> with (i.e. ignore those stupid XFS people, that's an XFS bug), or
> we can face the reality that storage stacks have become so complex
> that 8k is no longer a big enough stack for a modern system....

I'm not arguing against increasing the default stack size (I really
don't have an opinion there) - but it's not a solution for the block
layer, as stacking block devices can require an unbounded amount of
stack without the recursion-to-iteration conversion that
generic_make_request() does.
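
For context, the recursion-to-iteration conversion referred to here is the
current->bio_list trick inside generic_make_request(); a simplified sketch of
its shape (not the exact upstream code) looks like:

/*
 * Simplified sketch of generic_make_request()'s recursion-to-iteration
 * conversion.  When a stacking driver's make_request_fn resubmits bios,
 * the nested call just parks them on current->bio_list; the outermost
 * invocation drains the list iteratively instead of recursing.
 */
#include <linux/sched.h>
#include <linux/blkdev.h>
#include <linux/bio.h>

void generic_make_request(struct bio *bio)
{
	struct bio_list bio_list_on_stack;

	if (current->bio_list) {
		/* nested call from a stacking driver: queue and return */
		bio_list_add(current->bio_list, bio);
		return;
	}

	bio_list_init(&bio_list_on_stack);
	current->bio_list = &bio_list_on_stack;
	do {
		struct request_queue *q = bdev_get_queue(bio->bi_bdev);

		q->make_request_fn(q, bio);	/* may park more bios */
		bio = bio_list_pop(current->bio_list);
	} while (bio);
	current->bio_list = NULL;
}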

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-03-2012, 01:26 AM
Kent Overstreet

block: Avoid deadlocks with bio allocation by stacking drivers

On Fri, Aug 31, 2012 at 11:01:59AM -0400, Vivek Goyal wrote:
> On Thu, Aug 30, 2012 at 06:43:59PM -0700, Kent Overstreet wrote:
> > On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> > > On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
> > >
> > > [..]
> > > > > Performance aside, punting submission to a per-device worker in case of deep
> > > > > stack usage sounds like a cleaner solution to me.
> > > >
> > > > Agreed, but performance tends to matter in the real world. And either
> > > > way the tricky bits are going to be confined to a few functions, so I
> > > > don't think it matters that much.
> > > >
> > > > If someone wants to code up the workqueue version and test it, they're
> > > > more than welcome...
> > >
> > > Here is one quick and dirty proof-of-concept patch. It checks the stack
> > > depth and, if the remaining space is less than 20% of the stack size,
> > > defers the bio submission to a per-queue worker.
> >
> > I can't think of any correctness issues. I see some stuff that could be
> > simplified (blk_drain_deferred_bios() is redundant, just make it a
> > wrapper around blk_deffered_bio_work()).
> >
> > Still skeptical about the performance impact, though - frankly, on some
> > of the hardware I've been running bcache on this would be a visible
> > performance regression - probably double-digit percentages, but I'd have
> > to benchmark it. That kind of hardware/usage is not normal today,
> > but I've put a lot of work into performance and I don't want to make
> > things worse without good reason.
>
> Would you like to give this patch a quick try with bcache on your
> hardware and see how much performance impact you observe?

If I can get a test system set up with a modern kernel that I can
publish numbers on, I will. It'll take a bit, though.

> Given that submission through the worker happens only when stack usage
> is high, that should reduce the impact of the patch, and common use
> cases should remain unaffected.

Except depending on how users have their systems configured, it'll
either never happen or it'll happen for almost every bio. That makes the
performance overhead unpredictable, too.

> >
> > Have you tested/benchmarked it?
>
> No, I have not. I will run some simple workloads on SSD.

Normal SATA SSDs are not going to show the overhead - AHCI is a pig
and it'll be lost in the noise.

> There are so many places in the kernel where worker threads do work on
> behalf of processes. I think this is really a minor concern and I would
> not be too worried about it.

Yeah, but this is somewhat unprecedented in the amount of cpu time
you're potentially moving to worker threads.

It is a concern.

> What really is concerning, though, is the greater stack usage due to
> the recursive nature of make_request() and the performance impact of
> deferral to a worker thread.

Your patch shouldn't increase stack usage (at least if your threshold is
safe - it's too high as is).

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-03-2012, 01:34 AM
Kent Overstreet

block: Avoid deadlocks with bio allocation by stacking drivers

On Fri, Aug 31, 2012 at 07:13:48PM -0700, Tejun Heo wrote:
> Hello, Vivek.
>
> On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> > Here is one quick and dirty proof-of-concept patch. It checks the stack
> > depth and, if the remaining space is less than 20% of the stack size,
> > defers the bio submission to a per-queue worker.
>
> So, it removes breadth-first walking of bio construction, ensuring
> stack overflow never happens by bouncing to a workqueue if stack usage
> seems too high.
>
> I do like the removal of breadth-first walking. It makes failure
> scenarios a lot less mind-bending. That said, Kent is right that this
> can incur significant overhead for certain configurations, and looking
> at stack usage in the block layer is rather nasty both in design and
> implementation.
>
> If we're gonna need a rescuer anyway, and we can get it right and the
> mechanism can be contained in block proper relatively well, I think it
> would be better to make breadth-first walking safe. Both are nasty in
> their own ways, after all.

I added that filtering I was talking about, and I like this version much
better.

To me at least, it's much clearer what it's actually doing; when we go
to sleep in an allocation, we first unblock only the bios that were
allocated from this bio_set - i.e. only the bios that caused the
original deadlock.

It's still trickier than Vivek's approach, but the performance impact
is certainly lower, since we're only using the workqueue thread on
allocation failure.

commit c61f9c16dc8c7ae833a73b857936106c71daab3f
Author: Kent Overstreet <koverstreet@google.com>
Date: Fri Aug 31 20:52:41 2012 -0700

block: Avoid deadlocks with bio allocation by stacking drivers

Previously, if we ever try to allocate more than once from the same bio
set while running under generic_make_request() (i.e. a stacking block
driver), we risk deadlock.

This is because of the code in generic_make_request() that converts
recursion to iteration; any bios we submit won't actually be submitted
(so they can complete and eventually be freed) until after we return -
this means if we allocate a second bio, we're blocking the first one
from ever being freed.

Thus if enough threads call into a stacking block driver at the same
time with bios that need multiple splits, and the bio_set's reserve gets
used up, we deadlock.

This can be worked around in the driver code - we could check if we're
running under generic_make_request(), then mask out __GFP_WAIT when we
go to allocate a bio, and if the allocation fails punt to workqueue and
retry the allocation.

But this is tricky and not a generic solution. This patch solves it for
all users by inverting the previously described technique. We allocate a
rescuer workqueue for each bio_set, and then in the allocation code if
there are bios on current->bio_list we would be blocking, we punt them
to the rescuer workqueue to be submitted.

Tested it by forcing the rescue codepath to be taken (by disabling the
first GFP_NOWAIT attempt), and then ran it with bcache (which does a lot
of arbitrary bio splitting) and verified that the rescuer was being
invoked.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>

diff --git a/fs/bio.c b/fs/bio.c
index 22d654f..076751f 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -286,6 +286,43 @@ void bio_reset(struct bio *bio)
}
EXPORT_SYMBOL(bio_reset);

+static void bio_alloc_rescue(struct work_struct *work)
+{
+ struct bio_set *bs = container_of(work, struct bio_set, rescue_work);
+ struct bio *bio;
+
+ while (1) {
+ spin_lock(&bs->rescue_lock);
+ bio = bio_list_pop(&bs->rescue_list);
+ spin_unlock(&bs->rescue_lock);
+
+ if (!bio)
+ break;
+
+ generic_make_request(bio);
+ }
+}
+
+static void punt_bios_to_rescuer(struct bio_set *bs)
+{
+ struct bio_list punt, nopunt;
+ struct bio *bio;
+
+ bio_list_init(&punt);
+ bio_list_init(&nopunt);
+
+ while ((bio = bio_list_pop(current->bio_list)))
+ bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
+
+ *current->bio_list = nopunt;
+
+ spin_lock(&bs->rescue_lock);
+ bio_list_merge(&bs->rescue_list, &punt);
+ spin_unlock(&bs->rescue_lock);
+
+ queue_work(bs->rescue_workqueue, &bs->rescue_work);
+}
+
/**
* bio_alloc_bioset - allocate a bio for I/O
* @gfp_mask: the GFP_ mask given to the slab allocator
@@ -308,6 +345,7 @@ EXPORT_SYMBOL(bio_reset);
*/
struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
{
+ gfp_t saved_gfp = gfp_mask;
unsigned front_pad;
unsigned inline_vecs;
unsigned long idx = BIO_POOL_NONE;
@@ -325,13 +363,37 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
front_pad = 0;
inline_vecs = nr_iovecs;
} else {
+ /*
+ * generic_make_request() converts recursion to iteration; this
+ * means if we're running beneath it, any bios we allocate and
+ * submit will not be submitted (and thus freed) until after we
+ * return.
+ *
+ * This exposes us to a potential deadlock if we allocate
+ * multiple bios from the same bio_set() while running
+ * underneath generic_make_request(). If we were to allocate
+ * multiple bios (say a stacking block driver that was splitting
+ * bios), we would deadlock if we exhausted the mempool's
+ * reserve.
+ *
+ * We solve this, and guarantee forward progress, with a rescuer
+ * workqueue per bio_set. If we go to allocate and there are
+ * bios on current->bio_list, we first try the allocation
+ * without __GFP_WAIT; if that fails, we punt those bios we
+ * would be blocking to the rescuer workqueue before we retry
+ * with the original gfp_flags.
+ */
+
+ if (current->bio_list && !bio_list_empty(current->bio_list))
+ gfp_mask &= ~__GFP_WAIT;
+retry:
p = mempool_alloc(bs->bio_pool, gfp_mask);
front_pad = bs->front_pad;
inline_vecs = BIO_INLINE_VECS;
}

if (unlikely(!p))
- return NULL;
+ goto err;

bio = p + front_pad;
bio_init(bio);
@@ -352,6 +414,13 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)

err_free:
mempool_free(p, bs->bio_pool);
+err:
+ if (gfp_mask != saved_gfp) {
+ punt_bios_to_rescuer(bs);
+ gfp_mask = saved_gfp;
+ goto retry;
+ }
+
return NULL;
}
EXPORT_SYMBOL(bio_alloc_bioset);
@@ -1561,6 +1630,9 @@ static void biovec_free_pools(struct bio_set *bs)

void bioset_free(struct bio_set *bs)
{
+ if (bs->rescue_workqueue)
+ destroy_workqueue(bs->rescue_workqueue);
+
if (bs->bio_pool)
mempool_destroy(bs->bio_pool);

@@ -1596,6 +1668,10 @@ struct bio_set *bioset_create(unsigned int pool_size, unsigned int front_pad)

bs->front_pad = front_pad;

+ spin_lock_init(&bs->rescue_lock);
+ bio_list_init(&bs->rescue_list);
+ INIT_WORK(&bs->rescue_work, bio_alloc_rescue);
+
bs->bio_slab = bio_find_or_create_slab(front_pad + back_pad);
if (!bs->bio_slab) {
kfree(bs);
@@ -1606,9 +1682,14 @@ struct bio_set *bioset_create(unsigned int pool_size, unsigned int front_pad)
if (!bs->bio_pool)
goto bad;

- if (!biovec_create_pools(bs, pool_size))
- return bs;
+ if (biovec_create_pools(bs, pool_size))
+ goto bad;
+
+ bs->rescue_workqueue = alloc_workqueue("bioset", WQ_MEM_RECLAIM, 0);
+ if (!bs->rescue_workqueue)
+ goto bad;

+ return bs;
bad:
bioset_free(bs);
return NULL;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index a7561b9..f329102 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -491,6 +491,15 @@ struct bio_set {
mempool_t *bio_integrity_pool;
#endif
mempool_t *bvec_pool;
+
+ /*
+ * Deadlock avoidance for stacking block drivers: see comments in
+ * bio_alloc_bioset() for details
+ */
+ spinlock_t rescue_lock;
+ struct bio_list rescue_list;
+ struct work_struct rescue_work;
+ struct workqueue_struct *rescue_workqueue;
};

struct biovec_slab {

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-03-2012, 08:41 PM
Mikulas Patocka

block: Avoid deadlocks with bio allocation by stacking drivers

On Thu, 30 Aug 2012, Kent Overstreet wrote:

> On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> > On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
> >
> > [..]
> > > > Performance aside, punting submission to a per-device worker in case of deep
> > > > stack usage sounds like a cleaner solution to me.
> > >
> > > Agreed, but performance tends to matter in the real world. And either
> > > way the tricky bits are going to be confined to a few functions, so I
> > > don't think it matters that much.
> > >
> > > If someone wants to code up the workqueue version and test it, they're
> > > more than welcome...
> >
> > Here is one quick and dirty proof-of-concept patch. It checks the stack
> > depth and, if the remaining space is less than 20% of the stack size,
> > defers the bio submission to a per-queue worker.
>
> I can't think of any correctness issues. I see some stuff that could be
> simplified (blk_drain_deferred_bios() is redundant, just make it a
> wrapper around blk_deffered_bio_work()).
>
> Still skeptical about the performance impact, though - frankly, on some
> of the hardware I've been running bcache on this would be a visible
> performance regression - probably double-digit percentages, but I'd have
> to benchmark it. That kind of hardware/usage is not normal today,
> but I've put a lot of work into performance and I don't want to make
> things worse without good reason.
>
> Have you tested/benchmarked it?
>
> There's scheduling behaviour, too. We really want the workqueue thread's
> cpu time to be charged to the process that submitted the bio. (We could
> use a mechanism like that in other places, too... not like this is a new
> issue).
>
> This is going to be a real issue for users that need strong isolation -
> for any driver that uses non-negligible cpu (e.g. dm-crypt), we're
> breaking that (not that it wasn't broken already, but this makes it
> worse).

... or another possibility - start a timer when something is put to
current->bio_list and use that timer to pop entries off current->bio_list
and submit them to a workqueue. The timer can be cpu-local so only
interrupt masking is required to synchronize against the timer.

This would normally run just like the current kernel and in case of
deadlock, the timer would kick in and resolve the deadlock.
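
A rough sketch of that idea, purely for illustration (the helper names and the
100ms deadline are hypothetical; this is not an actual patch):

/*
 * Illustration of the timer idea above.  The timer would be armed when a
 * bio is first parked on current->bio_list; if it fires before the list
 * drains, the parked bios are handed to a workqueue so a task sleeping
 * in a mempool allocation can make progress.
 */
#include <linux/timer.h>
#include <linux/jiffies.h>
#include <linux/bio.h>

/* hypothetical hand-off to a workqueue that resubmits the bio */
void queue_deferred_bio(struct bio *bio);

static void bio_list_timer_fn(unsigned long data)
{
	struct bio_list *list = (struct bio_list *)data;
	struct bio *bio;

	/* the timer is cpu-local, so the submitter only needs to mask
	 * interrupts while it touches the list */
	while ((bio = bio_list_pop(list)))
		queue_deferred_bio(bio);
}

static void arm_bio_list_timer(struct timer_list *timer,
			       struct bio_list *list)
{
	setup_timer(timer, bio_list_timer_fn, (unsigned long)list);
	mod_timer(timer, jiffies + HZ / 10);	/* arbitrary 100ms deadline */
}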

> I could be convinced, but right now I prefer my solution.

It fixes the bio allocation problem, but not other similar mempool
problems in dm and md.

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
09-04-2012, 03:41 AM
Kent Overstreet

block: Avoid deadlocks with bio allocation by stacking drivers

On Mon, Sep 03, 2012 at 04:41:37PM -0400, Mikulas Patocka wrote:
> ... or another possibility - start a timer when something is put to
> current->bio_list and use that timer to pop entries off current->bio_list
> and submit them to a workqueue. The timer can be cpu-local so only
> interrupt masking is required to synchronize against the timer.
>
> This would normally run just like the current kernel and in case of
> deadlock, the timer would kick in and resolve the deadlock.

Ugh. That's a _terrible_ idea.

Remember the old plugging code? You ever have to debug performance
issues caused by it?

>
> > I could be convinced, but right now I prefer my solution.
>
> It fixes the bio allocation problem, but not other similar mempool
> problems in dm and md.

I looked a bit more, and actually I think the rest of the problem is
pretty limited in scope - most of those mempool allocations are per
request, not per split.

I'm willing to put some time into converting dm/md over to bioset's
front_pad. I'm having to learn the code for the immutable biovec work,
anyways.
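
For context on front_pad: bio_alloc_bioset() returns a bio that sits front_pad
bytes into the mempool allocation ("bio = p + front_pad" in the patch above),
so a driver can keep its per-bio state in that padding instead of in a second
mempool. A rough sketch of the pattern, with struct my_per_bio_data and the
pool size made up for illustration:

/*
 * Illustrative use of a bio_set's front_pad, not actual dm/md code: the
 * driver's per-bio state lives in the padding immediately before the
 * bio, carved out of the same mempool allocation, so splitting/cloning
 * needs only one allocation with one forward-progress guarantee.
 */
#include <linux/bio.h>
#include <linux/gfp.h>
#include <linux/errno.h>

struct my_per_bio_data {		/* hypothetical driver state */
	struct bio *orig_bio;
	sector_t orig_sector;
};

static struct bio_set *my_bio_set;

static int my_driver_init(void)
{
	/* reserve room for my_per_bio_data in front of every bio */
	my_bio_set = bioset_create(64, sizeof(struct my_per_bio_data));
	return my_bio_set ? 0 : -ENOMEM;
}

static struct bio *my_alloc_bio(struct bio *orig, unsigned nr_vecs)
{
	struct bio *bio = bio_alloc_bioset(GFP_NOIO, nr_vecs, my_bio_set);
	struct my_per_bio_data *pb;

	if (!bio)
		return NULL;

	/* the front_pad bytes sit immediately before the bio */
	pb = (void *)bio - sizeof(struct my_per_bio_data);
	pb->orig_bio = orig;
	pb->orig_sector = orig->bi_sector;
	return bio;
}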

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
