backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
Hi Jens
Please would you consider taking this into the block tree? It seems to
speed up device deletion enormously.
Mikulas
---
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
synchronize_rcu() sleeps for several timer ticks; synchronize_rcu_expedited()
is much faster.
With a 100Hz timer frequency, removing 10000 block devices with the
"dmsetup remove_all" command takes 27 minutes. With this patch,
removing 10000 block devices takes only 15 seconds.
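As a rough sanity check of the figures above, here is a small C sketch of the per-device arithmetic. The helper names per_device_us() and ticks_per_device() are made up for illustration; only the 27 minutes, 15 seconds, 10000 devices and 100Hz come from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average wall-clock cost per device, in microseconds. */
static uint64_t per_device_us(uint64_t total_seconds, uint64_t devices)
{
	assert(devices != 0);
	return total_seconds * 1000000ULL / devices;
}

/* Hypothetical helper: whole timer ticks covered by cost_us at a given HZ. */
static uint64_t ticks_per_device(uint64_t cost_us, uint64_t hz)
{
	uint64_t tick_us;

	assert(hz != 0);
	tick_us = 1000000ULL / hz;	/* 10000 us per tick at 100Hz */
	return cost_us / tick_us;
}
```

With the reported numbers, per_device_us(27 * 60, 10000) gives 162000 us, which is about 16 ticks at 100Hz and fits the claim that each removal sleeps for several ticks. per_device_us(15, 10000) gives 1500 us, well under one tick.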
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
07-21-2011, 07:27 AM
Jens Axboe
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
On 2011-07-21 02:29, Mikulas Patocka wrote:
> Hi Jens
>
> Please would you consider taking this into the block tree? It seems to
> speed up device deletion enormously.
Sure, looks like a good fix. Reminds me of a similar problem we had in
block core.
--
Jens Axboe
01-31-2012, 07:34 PM
Peter Zijlstra
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
On Wed, 2011-07-20 at 20:29 -0400, Mikulas Patocka wrote:
> Hi Jens
>
> Please would you consider taking this into the block tree? It seems to
> speed up device deletion enormously.
>
> Mikulas
>
> ---
>
> backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
>
> synchronize_rcu() sleeps for several timer ticks; synchronize_rcu_expedited()
> is much faster.
>
> With a 100Hz timer frequency, removing 10000 block devices with the
> "dmsetup remove_all" command takes 27 minutes. With this patch,
> removing 10000 block devices takes only 15 seconds.
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
>
> ---
> mm/backing-dev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-3.0-rc7-fast/mm/backing-dev.c
> ===================================================================
> --- linux-3.0-rc7-fast.orig/mm/backing-dev.c 2011-07-19 18:01:00.000000000 +0200
> +++ linux-3.0-rc7-fast/mm/backing-dev.c 2011-07-19 18:01:07.000000000 +0200
> @@ -505,7 +505,7 @@ static void bdi_remove_from_list(struct
>  	list_del_rcu(&bdi->bdi_list);
>  	spin_unlock_bh(&bdi_lock);
>
> -	synchronize_rcu();
> +	synchronize_rcu_expedited();
>  }
>
Urgh, I just noticed this crap in my tree.. You realize that you're
effectively hammering a global sync primitive this way? Depending on
which RCU flavour you have, any SMP variant will at least do a
machine-wide IPI broadcast for every synchronize_rcu_expedited() call,
and some do significantly more.
The much better solution would've been to batch your block-dev removals
and use a single synchronize_rcu() as a barrier.
This is not cool.
01-31-2012, 08:04 PM
"Paul E. McKenney"
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
On Tue, Jan 31, 2012 at 09:34:23PM +0100, Peter Zijlstra wrote:
> Urgh, I just noticed this crap in my tree.. You realize that you're
> effectively hammering a global sync primitive this way? Depending on
> which RCU flavour you have, any SMP variant will at least do a
> machine-wide IPI broadcast for every synchronize_rcu_expedited() call,
> and some do significantly more.
>
> The much better solution would've been to batch your block-dev removals
> and use a single synchronize_rcu() as a barrier.
>
> This is not cool.
Indeed, synchronize_rcu_expedited() is quite heavyweight, so as Peter
suggests, if you can use batching you will get even better performance
with much less load on the rest of the system.
Thanx, Paul
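To make the batching suggestion concrete, here is a minimal userspace model in C. It is not kernel code and not a proposed patch: struct model_bdi, model_synchronize_rcu() and the other model_* names are hypothetical stand-ins, and the "grace period" is only a counter. It illustrates the shape of the idea under those assumptions: unlink the whole batch first, wait once, then free.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a bdi list entry; not the kernel's struct. */
struct model_bdi {
	struct model_bdi *next;
};

/* Number of grace periods the model has waited for so far. */
static unsigned long grace_periods;

/* Hypothetical stand-in for synchronize_rcu(): it only counts calls. */
static void model_synchronize_rcu(void)
{
	grace_periods++;
}

/* Build a singly linked list of n entries. */
static struct model_bdi *model_make_list(unsigned int n)
{
	struct model_bdi *head = NULL;

	while (n--) {
		struct model_bdi *bdi = malloc(sizeof(*bdi));

		assert(bdi != NULL);
		bdi->next = head;
		head = bdi;
	}
	return head;
}

/* Per-device removal: one grace period for every entry. */
static unsigned long model_remove_each(struct model_bdi *head)
{
	unsigned long start = grace_periods;

	while (head) {
		struct model_bdi *victim = head;

		head = head->next;		/* unlink from the list */
		model_synchronize_rcu();	/* wait before freeing */
		free(victim);
	}
	return grace_periods - start;
}

/* Batched removal: unlink everything, wait once, then free everything. */
static unsigned long model_remove_batched(struct model_bdi *head)
{
	unsigned long start = grace_periods;
	struct model_bdi *doomed = head;	/* whole batch is now unlinked */

	model_synchronize_rcu();		/* single barrier for the batch */
	while (doomed) {
		struct model_bdi *next = doomed->next;

		free(doomed);
		doomed = next;
	}
	return grace_periods - start;
}
```

In this model, removing 10000 entries one at a time waits for 10000 grace periods, while the batched variant waits for one. How the real device-mapper removal path could collect devices into such a batch is left open here.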
02-02-2012, 07:43 PM
Mikulas Patocka
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
On Tue, 31 Jan 2012, Peter Zijlstra wrote:
> Urgh, I just noticed this crap in my tree.. You realize that you're
> effectively hammering a global sync primitive this way? Depending on
> which RCU flavour you have, any SMP variant will at least do a
> machine-wide IPI broadcast for every synchronize_rcu_expedited() call,
> and some do significantly more.
>
> The much better solution would've been to batch your block-dev removals
> and use a single synchronize_rcu() as a barrier.
>
> This is not cool.
Do you have some measurable use case where the user is removing block
devices so heavily that this causes a problem?
Mikulas
02-02-2012, 08:59 PM
Peter Zijlstra
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
On Thu, 2012-02-02 at 15:43 -0500, Mikulas Patocka wrote:
> Do you have some measurable use case where the user is removing block
> devices so heavily that this causes a problem?
Even one can be a problem. People are spending lots of time and
effort to reduce machine-wide jitter and interference. Adding it with
such disregard isn't cool.
There's no reason a management CPU adding or removing block devices
should perturb the high-frequency trading or industrial laser control
running on the other side of the machine.
02-02-2012, 11:29 PM
"Paul E. McKenney"
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu
On Thu, Feb 02, 2012 at 10:59:04PM +0100, Peter Zijlstra wrote:
> On Thu, 2012-02-02 at 15:43 -0500, Mikulas Patocka wrote:
> > Do you have some measurable use case where the user is removing block
> > devices so heavily that this causes a problem?
>
> Even one can be a problem. People are spending lots of time and
> effort to reduce machine-wide jitter and interference. Adding it with
> such disregard isn't cool.
>
> There's no reason a management CPU adding or removing block devices
> should perturb the high-frequency trading or industrial laser control
> running on the other side of the machine.
Very true for real-time applications!
For the heavy trading apps, given Frederic's upcoming user-mode-idle work,
I can keep this stuff from perturbing the apps. Still, batching would
be preferable.
Thanx, Paul