02-20-2012, 01:33 PM
Bob Peterson

Minor speedup to ordered writes

Hi,

I don't know if you're going to like this idea or not, but
I'll submit it anyway:

This patch speeds up ordered writes a little bit by avoiding
some journal (log) locking and list moving.

If you compare the patched gfs2 module to the stock upstream one,
you can see slightly better IO throughput rates using a slow
local hard drive. I performed the test on the patched code first
and the stock gfs2.ko second in order to rule out effects due to
caching. In other words, if anything, the stock version should benefit
more from caching than the patched:

# insmod gfs2.ko
# mount -tgfs2 /dev/localhd/test /mnt/gfs2
# cd /mnt/gfs2
# /home/bob/simzone 1G | tail -6
filesize 1048576 KB recsize 32 KB IOs2do 32768 iwrite 57582 rewrite 61129 read 3220719 reread 3238394
filesize 1048576 KB recsize 64 KB IOs2do 16384 iwrite 63272 rewrite 59225 read 3354455 reread 3344153
filesize 1048576 KB recsize 128 KB IOs2do 8192 iwrite 63613 rewrite 59477 read 3370694 reread 3420704
filesize 1048576 KB recsize 256 KB IOs2do 4096 iwrite 58581 rewrite 64086 read 3255779 reread 3270381
filesize 1048576 KB recsize 512 KB IOs2do 2048 iwrite 62128 rewrite 58347 read 1659874 reread 1667349
filesize 1048576 KB recsize 1024 KB IOs2do 1024 iwrite 61503 rewrite 58816 read 1284272 reread 1284750
# cd -
/home/bob/gfs2-3.0.git/fs/gfs2.ordered
# umount /mnt/gfs2
# mkfs.gfs2 -O -p lock_dlm -t bob_cluster:lacie -j1 /dev/localhd/test &> /dev/null
# rmmod gfs2
# insmod ../gfs2.stock/gfs2.ko
# mount -tgfs2 /dev/localhd/test /mnt/gfs2
# cd /mnt/gfs2
# /home/bob/simzone | tail -6
filesize 1048576 KB recsize 32 KB IOs2do 32768 iwrite 58919 rewrite 45759 read 3231279 reread 3244637
filesize 1048576 KB recsize 64 KB IOs2do 16384 iwrite 62512 rewrite 56109 read 3323874 reread 3331139
filesize 1048576 KB recsize 128 KB IOs2do 8192 iwrite 56083 rewrite 49060 read 3401618 reread 3409803
filesize 1048576 KB recsize 256 KB IOs2do 4096 iwrite 62669 rewrite 56804 read 3290692 reread 3300096
filesize 1048576 KB recsize 512 KB IOs2do 2048 iwrite 51005 rewrite 59375 read 1677764 reread 1682165
filesize 1048576 KB recsize 1024 KB IOs2do 1024 iwrite 57374 rewrite 52122 read 1280357 reread 1278381
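
To put a number on "slightly better", the two result sets above can be summed and compared. The following is only a rough sketch: the arrays are copied from the iwrite and rewrite columns of the two simzone runs, while the helper names (sum, gain_percent, wins) are made up for illustration and are not part of simzone or gfs2.

```c
#include <assert.h>
#include <stddef.h>

/* Figures copied from the simzone output above, one per record size
 * (32 KB through 1024 KB). Units are whatever simzone reports. */
static const long patched_iwrite[]  = {57582, 63272, 63613, 58581, 62128, 61503};
static const long stock_iwrite[]    = {58919, 62512, 56083, 62669, 51005, 57374};
static const long patched_rewrite[] = {61129, 59225, 59477, 64086, 58347, 58816};
static const long stock_rewrite[]   = {45759, 56109, 49060, 56804, 59375, 52122};

#define NRUNS (sizeof(patched_iwrite) / sizeof(patched_iwrite[0]))

/* Total of n samples. */
static long sum(const long *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Percentage by which the patched total exceeds the stock total. */
static double gain_percent(const long *patched, const long *stock, size_t n)
{
    long base = sum(stock, n);
    assert(n > 0 && base > 0);
    return 100.0 * (double)(sum(patched, n) - base) / (double)base;
}

/* Number of record sizes at which the patched module was faster. */
static size_t wins(const long *patched, const long *stock, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (patched[i] > stock[i])
            count++;
    return count;
}
```

Calling gain_percent(patched_iwrite, stock_iwrite, NRUNS) gives a little over 5%, and the same call on the rewrite arrays gives about 13%. wins() shows the patched module ahead at 4 of the 6 record sizes for iwrite and 5 of 6 for rewrite. Each configuration was run only once, so these figures are indicative rather than conclusive.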

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
--
fs/gfs2/log.c | 15 +++++++--------
1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index b8fe7b7..34c93ad 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -585,18 +585,17 @@ static void gfs2_ordered_write(struct gfs2_sbd *sdp)
 {
 	struct gfs2_bufdata *bd;
 	struct buffer_head *bh;
-	LIST_HEAD(written);
+	LIST_HEAD(towrite);
 
 	gfs2_log_lock(sdp);
-	list_sort(NULL, &sdp->sd_log_le_ordered, &bd_cmp);
-	while (!list_empty(&sdp->sd_log_le_ordered)) {
-		bd = list_entry(sdp->sd_log_le_ordered.next, struct gfs2_bufdata, bd_le.le_list);
-		list_move(&bd->bd_le.le_list, &written);
+	list_move(&sdp->sd_log_le_ordered, &towrite);
+	gfs2_log_unlock(sdp);
+	list_sort(NULL, &towrite, &bd_cmp);
+	list_for_each_entry(bd, &towrite, bd_le.le_list) {
 		bh = bd->bd_bh;
 		if (!buffer_dirty(bh))
 			continue;
 		get_bh(bh);
-		gfs2_log_unlock(sdp);
 		lock_buffer(bh);
 		if (buffer_mapped(bh) && test_clear_buffer_dirty(bh)) {
 			bh->b_end_io = end_buffer_write_sync;
@@ -605,9 +604,9 @@ static void gfs2_ordered_write(struct gfs2_sbd *sdp)
 			unlock_buffer(bh);
 			brelse(bh);
 		}
-		gfs2_log_lock(sdp);
 	}
-	list_splice(&written, &sdp->sd_log_le_ordered);
+	gfs2_log_lock(sdp);
+	list_splice(&towrite, &sdp->sd_log_le_ordered);
 	gfs2_log_unlock(sdp);
 }
 
02-20-2012, 01:40 PM
Steven Whitehouse

Minor speedup to ordered writes

Hi,

On Mon, 2012-02-20 at 09:33 -0500, Bob Peterson wrote:
> Hi,
>
> I don't know if you're going to like this idea or not, but
> I'll submit it anyway:
>
> This patch speeds up ordered writes a little bit by avoiding
> some journal (log) locking and list moving.
>
> If you compare the patched gfs2 module to the stock upstream one,
> you can see slightly better IO throughput rates using a slow
> local hard drive. I performed the test on the patched code first
> and the stock gfs2.ko second in order to rule out effects due to
> caching. In other words, if anything, the stock version should benefit
> more from caching than the patched:
>
> # insmod gfs2.ko
> # mount -tgfs2 /dev/localhd/test /mnt/gfs2
> # cd /mnt/gfs2
> # /home/bob/simzone 1G | tail -6
> filesize 1048576 KB recsize 32 KB IOs2do 32768 iwrite 57582 rewrite 61129 read 3220719 reread 3238394
> filesize 1048576 KB recsize 64 KB IOs2do 16384 iwrite 63272 rewrite 59225 read 3354455 reread 3344153
> filesize 1048576 KB recsize 128 KB IOs2do 8192 iwrite 63613 rewrite 59477 read 3370694 reread 3420704
> filesize 1048576 KB recsize 256 KB IOs2do 4096 iwrite 58581 rewrite 64086 read 3255779 reread 3270381
> filesize 1048576 KB recsize 512 KB IOs2do 2048 iwrite 62128 rewrite 58347 read 1659874 reread 1667349
> filesize 1048576 KB recsize 1024 KB IOs2do 1024 iwrite 61503 rewrite 58816 read 1284272 reread 1284750
> # cd -
> /home/bob/gfs2-3.0.git/fs/gfs2.ordered
> # umount /mnt/gfs2
> # mkfs.gfs2 -O -p lock_dlm -t bob_cluster:lacie -j1 /dev/localhd/test &> /dev/null
> # rmmod gfs2
> # insmod ../gfs2.stock/gfs2.ko
> # mount -tgfs2 /dev/localhd/test /mnt/gfs2
> # cd /mnt/gfs2
> # /home/bob/simzone | tail -6
> filesize 1048576 KB recsize 32 KB IOs2do 32768 iwrite 58919 rewrite 45759 read 3231279 reread 3244637
> filesize 1048576 KB recsize 64 KB IOs2do 16384 iwrite 62512 rewrite 56109 read 3323874 reread 3331139
> filesize 1048576 KB recsize 128 KB IOs2do 8192 iwrite 56083 rewrite 49060 read 3401618 reread 3409803
> filesize 1048576 KB recsize 256 KB IOs2do 4096 iwrite 62669 rewrite 56804 read 3290692 reread 3300096
> filesize 1048576 KB recsize 512 KB IOs2do 2048 iwrite 51005 rewrite 59375 read 1677764 reread 1682165
> filesize 1048576 KB recsize 1024 KB IOs2do 1024 iwrite 57374 rewrite 52122 read 1280357 reread 1278381
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> --

I like the idea of not having to hold the lock for so long, but what
protects the list against the buffers being reclaimed? So far as I can
tell, that could happen at any time after the I/O has completed and the
buffer is clean.

Steve.


 
02-20-2012, 02:11 PM
Bob Peterson

Minor speedup to ordered writes

----- Original Message -----
| Hi,
|
| I like the idea of not having to hold the lock for so long, but what
| protects the list against the buffers being reclaimed? So far as I can
| tell, that could happen at any time after the I/O has completed and the
| buffer is clean.
|
| Steve.

Hi,

The sd_log_le_ordered list is still protected by the log_lock.

The first thing we do is isolate the sd_log_le_ordered list to a
private list_head to protect it.

The working bd list, towrite, is protected by virtue of the fact that
the list head is a private variable. The reclaims might reclaim some of
the bh, but the bd list shouldn't be manipulated by anyone but the
function it's declared in.
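
The detach-then-process pattern described above can be sketched in userspace C. This is not the GFS2 code: the list helpers, struct shared, struct item and flush_ordered() are all hypothetical names, and a pthread mutex stands in for the log lock. The detach step moves every entry onto the private head and leaves the shared head empty, which is the behaviour the kernel's list_splice_init() provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
    struct node *prev, *next;
};

static void list_init(struct node *head) { head->prev = head->next = head; }
static int list_is_empty(const struct node *head) { return head->next == head; }

static void list_add_tail_node(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Move every entry of src to the tail of dst and leave src empty. */
static void list_splice_all(struct node *src, struct node *dst)
{
    assert(src != dst);
    if (list_is_empty(src))
        return;
    src->next->prev = dst->prev;
    dst->prev->next = src->next;
    src->prev->next = dst;
    dst->prev = src->prev;
    list_init(src);
}

struct item {
    struct node link;   /* first member, so a node pointer can be cast back */
    int dirty;
};

struct shared {
    pthread_mutex_t lock;   /* stand-in for the log lock */
    struct node ordered;    /* stand-in for sd_log_le_ordered */
};

/* Detach the shared list under the lock, process it unlocked, put it back.
 * Returns the number of dirty items that were "written". */
static int flush_ordered(struct shared *s)
{
    struct node towrite;
    int written = 0;

    list_init(&towrite);

    pthread_mutex_lock(&s->lock);
    list_splice_all(&s->ordered, &towrite);
    pthread_mutex_unlock(&s->lock);

    /* Unlocked walk: only safe if no other code path can unlink these
     * entries while they sit on the private list. */
    for (struct node *n = towrite.next; n != &towrite; n = n->next) {
        struct item *it = (struct item *)n;
        if (it->dirty) {
            it->dirty = 0;   /* pretend the write was submitted */
            written++;
        }
    }

    pthread_mutex_lock(&s->lock);
    list_splice_all(&towrite, &s->ordered);
    pthread_mutex_unlock(&s->lock);
    return written;
}
```

The lock is taken twice per flush instead of once per buffer, which is where any saving would come from. Whether the unlocked walk is safe depends entirely on the assumption stated in the comment.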

Unless I'm totally missing what you mean.

Regards,

Bob Peterson
Red Hat File Systems
 
02-20-2012, 02:21 PM
Steven Whitehouse

Minor speedup to ordered writes

Hi,

On Mon, 2012-02-20 at 10:11 -0500, Bob Peterson wrote:
> ----- Original Message -----
> | Hi,
> |
> | I like the idea of not having to hold the lock for so long, but what
> | protects the list against the buffers being reclaimed? So far as I can
> | tell, that could happen at any time after the I/O has completed and the
> | buffer is clean.
> |
> | Steve.
>
> Hi,
>
> The sd_log_le_ordered list is still protected by the log_lock.
>
> The first thing we do is isolate the sd_log_le_ordered list to a
> private list_head to protect it.
>
> The working bd list, towrite, is protected by virtue of the fact that
> the list head is a private variable. The reclaims might reclaim some of
> the bh, but the bd list shouldn't be manipulated by anyone but the
> function it's declared in.
>
> Unless I'm totally missing what you mean.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems

At page reclaim time, the log lock is used to protect the removal of the
bd from the list. It doesn't matter which list the bd is on; it will be
removed from it just the same. See gfs2_releasepage().
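
The point about removal not caring which list the entry is on can be shown with a small userspace sketch. The names here (struct node, list_unlink, reclaim_entry, private_list_after_reclaim) are hypothetical and this is not gfs2_releasepage() itself. It assumes only that the reclaim path unlinks an entry through the entry's own prev/next pointers, as the kernel's list_del_init() does.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *prev, *next;
};

static void list_init(struct node *head) { head->prev = head->next = head; }

static void list_add_tail_node(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink one entry. Only the entry is needed: the neighbours' pointers are
 * rewritten no matter which head the entry currently hangs off. */
static void list_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

static size_t list_count(const struct node *head)
{
    size_t count = 0;
    for (const struct node *n = head->next; n != head; n = n->next)
        count++;
    return count;
}

/* Hypothetical stand-in for a reclaim path: it is handed an entry and
 * never sees (or needs) the list head. */
static void reclaim_entry(struct node *entry)
{
    list_unlink(entry);
}

/* Put three entries on a function-local head, let "reclaim" remove the
 * middle one, and report how many entries the local head still sees. */
static size_t private_list_after_reclaim(void)
{
    struct node towrite, a, b, c;

    list_init(&towrite);
    list_add_tail_node(&a, &towrite);
    list_add_tail_node(&b, &towrite);
    list_add_tail_node(&c, &towrite);
    assert(list_count(&towrite) == 3);

    reclaim_entry(&b);   /* modifies the "private" list from outside */
    return list_count(&towrite);
}
```

private_list_after_reclaim() returns 2: making the head a local variable does not stop another path from changing the list through one of its entries. If such a removal ran concurrently with an unlocked walk of the list, the walker could follow pointers that had just been rewritten, so the walk and the removal have to be serialized by the same lock.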

Steve.
 
02-20-2012, 02:43 PM
Bob Peterson

Minor speedup to ordered writes

----- Original Message -----
| Hi,
|
| At page reclaim time, the log lock is used to protect the removal of
| the bd from the list. It doesn't matter which list the bd is on, it
| will be removed from it just the same, see gfs2_releasepage()
|
| Steve.

Ah, I understand what you mean now. I don't see a "good" way to do it
then, so I guess I withdraw my patch.

Regards,

Bob Peterson
Red Hat File Systems
 
