05-24-2008, 09:49 PM

Speed up `du'

Is there any way to speed up the du command? I mean short of having
cron run it on target directories and store results. (not really
speeding up but at least not having to wait for a result)

I've seen various mention of du being slow but don't recall any
mentions of how to speed it up.

I use Reiserfs with default sizes. In some situations, like a large
cache of nntp messages of several GB, I might wait 5-10 minutes or more
for du to get the size of the directory.

Are there other file systems that can return a result of `du' faster?

I'm curious how `df' computes sizes so much quicker. Even after
rm'ing a large amount of data... `df' sees the difference right away.

Or maybe there is some other tool or technique that can quickly tell
me the size of a directory or set of directories.
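
As a rough illustration of the cron idea mentioned above, here is a minimal Python sketch. It is not something posted in this thread: the function names `refresh` and `lookup` and the cache location are hypothetical, and it assumes a `du` that accepts the POSIX `-s` and `-k` options. A cron job would call `refresh`, and an interactive query would call `lookup`, which only reads the small cache file.

```python
import json
import subprocess
import time

CACHE_FILE = "/tmp/du-cache.json"  # hypothetical location, pick your own


def refresh(paths, cache_file=CACHE_FILE):
    """Run the slow `du -sk` once per path and store the results (cron side)."""
    cache = {}
    for path in paths:
        out = subprocess.run(
            ["du", "-sk", path], capture_output=True, text=True, check=True
        ).stdout
        # du prints "<size-in-KiB>\t<path>"; keep the number and a timestamp.
        cache[path] = {"kib": int(out.split()[0]), "measured_at": time.time()}
    with open(cache_file, "w") as fh:
        json.dump(cache, fh)


def lookup(path, cache_file=CACHE_FILE):
    """Return (size_in_KiB, age_in_seconds) without walking the directory."""
    with open(cache_file) as fh:
        entry = json.load(fh)[path]
    return entry["kib"], time.time() - entry["measured_at"]
```

The answer is only as fresh as the last cron run, which is why `lookup` also returns the age of the measurement.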

--
gentoo-user@lists.gentoo.org mailing list
 
05-24-2008, 10:22 PM
Allan Gottlieb

At Sat, 24 May 2008 16:49:09 -0500 reader@newsguy.com wrote:

> Is there any way to speed up the du command? I mean short of having
> cron run it on target directories and store results. (not really
> speeding up but at least not having to wait for a result)
>
> I've seen various mention of du being slow but don't recall any
> mentions of how to speed it up.
>
> I use Reiserfs with default sizes. In some situations like a large
> cache of nntp messages of several GB. I might wait 5-10 minutes or more
> for du to get the size of the directory.
>
> Are there other file systems that can return a result of `du' faster?
>
> I'm curious how `df' computes sizes so much quicker. Even after
> rm'ing a large amount of data... `df' sees the difference right away.

I can't help with speeding up du, but can explain df's speed.
This information is kept in the superblock. Each operation that
changes size updates the superblock and df just reads the result.
(In a sense it is like your cron solution above for du :-) .)
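
To make the point about df concrete, here is a small Python sketch (an illustration only, not how df itself is implemented). It asks the kernel for the filesystem-wide counters with a single `os.statvfs` call, so the cost does not depend on how many files exist. The function name `df_like` is hypothetical.

```python
import os


def df_like(path="/"):
    """Return (total, used, available) in bytes for the filesystem holding path."""
    st = os.statvfs(path)                # one system call, no directory walk
    total = st.f_blocks * st.f_frsize    # size of the whole filesystem
    free = st.f_bfree * st.f_frsize      # free space, including root-reserved blocks
    avail = st.f_bavail * st.f_frsize    # free space usable by unprivileged users
    used = total - free
    return total, used, avail


if __name__ == "__main__":
    total, used, avail = df_like("/")
    print(f"total={total} used={used} avail={avail}")
```

Because the filesystem keeps these counters up to date as blocks are allocated and freed, a large rm shows up in the result immediately.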

> Or maybe there is some other tool or technique that can quickly tell
> me the size of a directory or set of directories.

allan
 
05-24-2008, 11:24 PM
Willie Wong

On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover reader@newsguy.com squawked:
> Is there any way to speed up the du command? I mean short of having
> cron run it on target directories and store results. (not really
> speeding up but at least not having to wait for a result)
>
> I've seen various mention of du being slow but don't recall any
> mentions of how to speed it up.
>
> I use Reiserfs with default sizes. In some situations like a large
> cache of nntp messages of several GB. I might wait 5-10 minutes or more
> for du to get the size of the directory.
>
> Are there other file systems that can return a result of `du' faster?
>
> I'm curious how `df' computes sizes so much quicker. Even after
> rm'ing a large amount of data... `df' sees the difference right away.
>
> Or maybe there is some other tool or technique that can quickly tell
> me the size of a directory or set of directories.

I am pretty sure the problem with du is that it actually looks,
recursively, at every single file and computes the size that way. So
the time you have to wait is mostly due to disk IO (and caching would
also explain why if you run du twice in a row the answer returns much
more quickly). So, if you know what the bottle-neck directory is (for
example, the directory of nntp messages), the tricks in

http://gentoo-wiki.com/TIP_Speeding_up_portage

should probably work just as well.
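
For illustration, the recursive walk described above looks roughly like this in Python. This is a sketch of the general technique, not the actual du source; the name `du_like` is hypothetical, and it assumes Linux, where `st_blocks` counts 512-byte units. Every entry costs one `lstat`, i.e. one metadata read, which is where the disk IO goes when the cache is cold.

```python
import os


def du_like(root):
    """Sum allocated bytes under root by calling lstat() on every entry."""
    seen = set()  # (device, inode) pairs, so hard-linked files count once
    total = os.lstat(root).st_blocks * 512
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))  # one metadata read
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_blocks * 512  # assumption: 512-byte units (Linux)
    return total
```

With several GB of small nntp messages that is a very large number of `lstat` calls, so the run time scales with the number of files rather than with their total size.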

HTH,

W
--
"You're very sure of your facts, " he said at last, "I
couldn't trust the thinking of a man who takes the Universe
- if there is one - for granted. "
Sortir en Pantoufles: up 533 days, 21:55
 
05-25-2008, 01:54 AM
Stroller

On 25 May 2008, at 00:24, Willie Wong wrote:
> On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover
> reader@newsguy.com squawked:
>> ...
>> I use Reiserfs with default sizes. In some situations like a large
>> cache of nntp messages of several GB. I might wait 5-10 minutes
>> or more for du to get the size of the directory.
>
> I am pretty sure the problem with du is that it actually looks,
> recursively, at every single file and computes the size that way.


What he said.


> Or maybe there is some other tool or technique that can quickly tell
> me the size of a directory or set of directories.


Keep all the files in a honkin' big tarball.
:P
If you need to read these files on the fly then I'm afraid you'll
have to write a kernel filesystem extension (or find one?) that will
read them out of the tar file, slowing all read & write actions down.
But, hey, `du` on the tarball will complete in no time at all!!


In seriousness, another thing to do is keep these files on a separate
partition, if you can. Basically a user's ~ which includes
both .maildir and "My HiDef Videos" is non-optimal.



>> Are there other file systems that can return a result of `du' faster?



All filesystems have their advantages & disadvantages.

<http://www.debian-administration.org/articles/388>
Reading the above I _think_ the test most similar in function to
running `du` on many small files is the "Directory listing and file
search into the previous file tree" test, at which ReiserFS is fastest.


I need to look into this myself soon, to try & get best speed at a
3gig corpus of email. I was expecting EXT3 to be best - when you
create the filesystem you can specify the blocksize. It's possible
that the author of the filesystems comparison could have chosen
options when formatting his EXT3 disk that affected the speed of the
results - a journal would make writes slower, for instance (not sure
about reads).


Stroller.
 
05-25-2008, 02:56 AM
"Hemmann, Volker Armin"

On Sunday, 25 May 2008, Stroller wrote:
> On 25 May 2008, at 00:24, Willie Wong wrote:
> > On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover
> >
> > reader@newsguy.com squawked:
> >> ...
> >> I use Reiserfs with default sizes. In some situations like a large
> >> cache of nntp messages of several GB. I might wait 5-10 minutes
> >> or more
> >> for du to get the size of the directory.
> >
> > I am pretty sure the problem with du is that it actually looks,
> > recursively, at every single file and computes the size that way.
>
> What he said.
>
> > Or maybe there is some other tool or technique that can quickly tell
> > me the size of a directory or set of directories.
>
> Keep all the files in a honkin' big tarball.
>
> :P
>
> If you need to read these files on the fly then I'm afraid you'll
> have to write a kernel filesystem extension (or find one?) that will
> read them out of the tar file, slowing all read & write actions down.
> But, hey, `du` on the tarball will complete in no time at all!!
>
> In seriousness, another thing to do is keep these files on a separate
> partition, if you can. Basically a user's ~ which includes
> both .maildir and "My HiDef Videos" is non-optimal.
>
> >> Are there other file systems that can return a result of `du' faster?
>
> All filesystems have their advantages & disadvantages.
>
> <http://www.debian-administration.org/articles/388>

one thing the article does not mention:

reiserfs and xfs your barriers by default.

ext3 does not. And if you turn on barriers (as mount option) you lose 30% of
its speed.

Of course, if you care about data integrity, LVM is ruled out too - for the
same reason.

So if you care about data integrity and speed at the same time, ext3 is ruled
out. XFS is broken on a monthly basis (just search the lkml archives for
xfs. It is sickening). Leaves reiserfs as only sane choice.
 
05-25-2008, 01:04 PM
Stroller

On 25 May 2008, at 03:56, Hemmann, Volker Armin wrote:

> ...
> reiserfs and xfs your barriers by default.


This sentence no parse.

Stroller.