Old 02-13-2012, 09:49 AM
Helmut Jarausch
 
Default RFC : fast copying of a whole directory tree

Hi,

when copying a whole directory tree with standard tools, e.g.
tar cf - . | ( cd $DEST && tar xf - )
or cpio -p ...

the source disk is busy seeking. That's noisy and particularly slow.

I've written a small Python program which outputs the file names in
i-node order. If this is fed into tar or cpio nearly no seeks are
required during copying.
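
A minimal sketch of what such a lister might look like, for readers who want to experiment. This is not Helmut's actual program (which was not posted); the function names `paths_in_inode_order` and `write_nul_separated` are made up for illustration. It assumes everything lives on one filesystem and that the tree does not change while it is being scanned.

```python
import os


def paths_in_inode_order(root):
    """Return every entry below root (files, directories, symlinks),
    sorted by inode number. The root directory itself is not listed."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are listed themselves, not followed
            entries.append((os.lstat(path).st_ino, path))
    entries.sort()
    return [path for _, path in entries]


def write_nul_separated(paths, stream):
    """Write the paths to a binary stream, NUL-terminated, so that
    names containing newlines or odd bytes survive the pipe."""
    for path in paths:
        stream.write(os.fsencode(path) + b"\0")
```

Called as `write_nul_separated(paths_in_inode_order("."), sys.stdout.buffer)` from inside the source directory, the output could probably be piped into GNU cpio with `cpio -0 -p -d -m $DEST`, or into GNU tar with `tar --null --no-recursion -T - -cf -`. Since directories arrive out of order, their permissions and timestamps deserve a separate check after the copy.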

I've tested it by comparing the resulting copied tree to one created by
tar | tar.
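
For anyone wanting to repeat that kind of check, here is a rough sketch of a tree comparison using Python's standard `filecmp` module. The function name `compare_trees` is an assumption for illustration, not part of the original program. It compares names and file contents byte by byte; it does not look at ownership, permissions, timestamps or symlink targets, so it is only part of a real backup verification.

```python
import filecmp
import os


def compare_trees(left, right, rel=""):
    """Return a list of human-readable differences between two trees."""
    problems = []
    # ignore=[] so that nothing (e.g. RCS or CVS directories) is skipped
    cmp = filecmp.dircmp(os.path.join(left, rel),
                         os.path.join(right, rel), ignore=[])
    for name in cmp.left_only:
        problems.append("only in left: " + os.path.join(rel, name))
    for name in cmp.right_only:
        problems.append("only in right: " + os.path.join(rel, name))
    for name in cmp.common_funny:
        problems.append("type mismatch: " + os.path.join(rel, name))
    # shallow=False forces a comparison of the file contents
    _, mismatch, errors = filecmp.cmpfiles(
        cmp.left, cmp.right, cmp.common_files, shallow=False)
    for name in mismatch + errors:
        problems.append("differs: " + os.path.join(rel, name))
    for name in cmp.common_dirs:
        problems.extend(compare_trees(left, right, os.path.join(rel, name)))
    return problems
```

An empty list means the two trees hold the same names with the same contents.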

But its correctness for backing up data is critical.
Therefore I'd like to ask for comments.

Thanks for any comments,
Helmut.
 
Old 02-13-2012, 02:17 PM
Michael Orlitzky
 

On 02/13/12 05:49, Helmut Jarausch wrote:
>
> I've written a small Python program which outputs the file names in
> i-node order. If this is fed into tar or cpio nearly no seeks are
> required during copying.

What makes you think the inodes are sequential on-disk?


> But its correctness for backing up data is critical.
> Therefore I'd like to ask for comments.

You're nuts =)

Seriously though, use cp, tar, or rsync. They've seen years of use by
millions of people. All of the remaining bugs are sufficiently insidious
that you'll never hit them. The same probably isn't true for your script!
 
Old 02-13-2012, 02:31 PM
Grant Edwards
 

On 2012-02-13, Michael Orlitzky <michael@orlitzky.com> wrote:
> On 02/13/12 05:49, Helmut Jarausch wrote:
>>
>> I've written a small Python program which outputs the file names in
>> i-node order. If this is fed into tar or cpio nearly no seeks are
>> required during copying.
>
> What makes you think the inodes are sequential on-disk?

Even if the i-nodes are sequential on-disk, there's no reason to think
that the data blocks associated with the inodes are in any particular
order with respect to the i-nodes themselves.
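
One way to test this on a given filesystem is to ask the kernel where each file's first data block actually lives and compare that with the inode order. The sketch below uses the Linux FIBMAP ioctl. It assumes Linux, a filesystem that supports FIBMAP, and root privileges (the call normally fails otherwise); the function names are invented for illustration. When the ioctl is refused the helper simply returns None.

```python
import fcntl
import os
import stat
import struct

FIBMAP = 1  # from <linux/fs.h>: _IO(0x00, 1)


def first_physical_block(path):
    """Return the physical block number of the file's first block,
    or None if it cannot be determined (no permission, no support,
    empty file or hole)."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        # In: logical block 0. Out: the physical block number.
        result = fcntl.ioctl(fd, FIBMAP, struct.pack("i", 0))
        block = struct.unpack("i", result)[0]
        return block or None
    except OSError:
        return None
    finally:
        os.close(fd)


def inode_vs_block_order(paths):
    """Return (inode, first_block, path) tuples sorted by inode, for
    regular files whose first block could be looked up."""
    rows = []
    for path in paths:
        st = os.lstat(path)
        if not stat.S_ISREG(st.st_mode):
            continue
        block = first_physical_block(path)
        if block is not None:
            rows.append((st.st_ino, block, path))
    rows.sort()
    return rows
```

If the second column of the result jumps around while the first climbs steadily, inode order is not buying much on that filesystem.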

>> But its correctness for backing up data is critical.
>> Therefore I'd like to ask for comments.
>
> You're nuts =)
>
> Seriously though, use cp, tar, or rsync. They've seen years of use by
> millions of people. All of the remaining bugs are sufficiently
> insidious that you'll never hit them. The same probably isn't true
> for your script!

--
Grant Edwards
grant.b.edwards at gmail.com
 
Old 02-13-2012, 03:11 PM
Joerg Schilling
 

Grant Edwards <grant.b.edwards@gmail.com> wrote:

> On 2012-02-13, Michael Orlitzky <michael@orlitzky.com> wrote:
> > On 02/13/12 05:49, Helmut Jarausch wrote:
> >>
> >> I've written a small Python program which outputs the file names in
> >> i-node order. If this is fed into tar or cpio nearly no seeks are
> >> required during copying.
> >
> > What makes you think the inodes are sequential on-disk?
>
> Even if the i-nodes are sequential on-disk, there's no reason to think
> that the data blocks associated with the inodes are in any particular
> order with respect to the i-nodes themselves.

Correct. There is, however, a really fast method using "star -copy".

This works because star uses two decoupled processes with shared memory
between them, and because it reads names from directories in one big chunk.
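
To illustrate the general idea only (this is not how star is implemented, and it uses threads and a queue rather than two processes with shared memory), here is a small sketch of a reader and a writer decoupled by a bounded FIFO. The reader can keep streaming from the source disk while the writer is busy, and the other way round. The name `buffered_copy` and its parameters are invented for this example.

```python
import queue
import threading


def buffered_copy(src, dst, chunk_size=1 << 20, max_chunks=64):
    """Copy src to dst with a reader thread and a writer decoupled by a FIFO."""
    fifo = queue.Queue(maxsize=max_chunks)  # bounded buffer between both sides
    stop = threading.Event()                # set once the writer is finished
    errors = []

    def put(item):
        # Block while the FIFO is full, but give up if the writer has stopped.
        while not stop.is_set():
            try:
                fifo.put(item, timeout=0.1)
                return
            except queue.Full:
                pass

    def reader():
        try:
            with open(src, "rb") as infile:
                while not stop.is_set():
                    chunk = infile.read(chunk_size)
                    if not chunk:
                        break
                    put(chunk)
        except OSError as exc:
            errors.append(exc)
        finally:
            put(None)  # end-of-stream marker for the writer

    thread = threading.Thread(target=reader)
    thread.start()
    try:
        with open(dst, "wb") as outfile:
            for chunk in iter(fifo.get, None):
                outfile.write(chunk)
    finally:
        stop.set()
        thread.join()
    if errors:
        raise errors[0]
```

The buffer holds at most `chunk_size * max_chunks` bytes (64 MiB with the defaults), which plays the role of the FIFO size in this toy version.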

Jörg

--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni)
joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
 
Old 02-13-2012, 03:29 PM
Pandu Poluan
 

On Feb 13, 2012 11:15 PM, "Joerg Schilling" <Joerg.Schilling@fokus.fraunhofer.de> wrote:
>
> Grant Edwards <grant.b.edwards@gmail.com> wrote:
>
> > On 2012-02-13, Michael Orlitzky <michael@orlitzky.com> wrote:
> > > On 02/13/12 05:49, Helmut Jarausch wrote:
> > >>
> > >> I've written a small Python program which outputs the file names in
> > >> i-node order. If this is fed into tar or cpio nearly no seeks are
> > >> required during copying.
> > >
> > > What makes you think the inodes are sequential on-disk?
> >
> > Even if the i-nodes are sequential on-disk, there's no reason to think
> > that the data blocks associated with the inodes are in any particular
> > order with respect to the i-nodes themselves.
>
> Correct, there is however a really fast method using "star -copy".
>
> This works because there are two decoupled processes, shared memory between
> them and the fact that star reads names from directories in one big chunk.
>


Honestly, that's news to me. Which package has star?


Rgds,
 
Old 02-13-2012, 03:37 PM
Nikos Chantziaras
 

On 13/02/12 18:29, Pandu Poluan wrote:
> On Feb 13, 2012 11:15 PM, "Joerg Schilling" <Joerg.Schilling@fokus.fraunhofer.de> wrote:
> > Correct, there is however a really fast method using "star -copy".
> >
> > This works because there are two decoupled processes, shared memory between
> > them and the fact that star reads names from directories in one big chunk.
>
> Honestly, that's news to me. Which package has star?


eix -e star

:-/
 
Old 02-13-2012, 04:42 PM
Pandu Poluan
 

On Feb 13, 2012 11:41 PM, "Nikos Chantziaras" <realnc@arcor.de> wrote:
>
> On 13/02/12 18:29, Pandu Poluan wrote:
>> On Feb 13, 2012 11:15 PM, "Joerg Schilling" <Joerg.Schilling@fokus.fraunhofer.de> wrote:
>> > Correct, there is however a really fast method using "star -copy".
>> >
>> > This works because there are two decoupled processes, shared memory between
>> > them and the fact that star reads names from directories in one big chunk.
>>
>> Honestly, that's news to me. Which package has star?
>
> eix -e star
>
> :-/


Hehhe... sorry, I'm on the road and don't have Gentoo on my smartphone :-P


Rgds,
 
Old 02-13-2012, 05:20 PM
Joerg Schilling
 

Nikos Chantziaras <realnc@arcor.de> wrote:

> > > This works because there are two decoupled processes, shared memory between
> > > them and the fact that star reads names from directories in one big chunk.
> >
> > Honestly, that's news to me. Which package has star?
>
> eix -e star

To help star buffer, give it a large FIFO size, up to half of the RAM in
your machine, e.g. fs=1000m

To make sure that star gives fast file creation (unpacking of archives) on
filesystems that do not support fast verified transactions, you need to make
star as "insecure" as other software to get comparable results, so add:

-no-fsync
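
Putting the two options together, a command line could be assembled roughly as follows. Only `-copy`, `fs=` and `-no-fsync` come from this thread; the `-p` flag and the `-C fromdir . todir` layout are recalled from the star man page and should be checked there before use. The helper name `star_copy_command` and the default of 1000 MB capped at half of RAM are assumptions for the example. The function only builds the argument list and does not run anything.

```python
import os


def star_copy_command(src_dir, dest_dir, fifo_mb=1000, no_fsync=True):
    """Build (but do not run) a star -copy command line as a list."""
    # Cap the FIFO at half of physical RAM, as suggested above.
    ram_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    half_ram_mb = max(1, ram_bytes // 2 // (1024 * 1024))
    fifo_mb = min(fifo_mb, half_ram_mb)

    cmd = ["star", "-copy", "-p", "fs=%dm" % fifo_mb]
    if no_fsync:
        # Trades safety against speed: no fsync after each extracted file.
        cmd.append("-no-fsync")
    # Assumed layout: change to src_dir, copy "." into dest_dir.
    cmd += ["-C", src_dir, ".", dest_dir]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the options have been verified against the installed star version.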

Jörg

--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni)
joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
 
Old 02-13-2012, 09:11 PM
Dale
 

Joerg Schilling wrote:
> Nikos Chantziaras <realnc@arcor.de> wrote:
>
>>> > This works because there are two decoupled processes, shared memory between
>>> > them and the fact that star reads names from directories in one big chunk.
>>>
>>> Honestly, that's news to me. Which package has star?
>>
>> eix -e star
>
> To help star to buffer, give star a large fifo size that is up to half of the
> RAM in your machine, e.g. fs=1000m
>
> To make sure that star gives fast file creation (unpacking of archives) on
> filesystems that do not support fast verified transactions, you need to make
> star as "insecure" as other software to get comparable results, so add:
>
> -no-fsync
>
> Jörg
>


The problem with star is that when I need to copy a large number of
files, it isn't on the DVD I boot from. That's why most people use cp
since it is on every bootable media I have ever booted. That includes
the Gentoo bootable media.

Since star is so good, why not get them to include it on the bootable
media? Is it too large a package or what?

Dale

:-) :-)

--
I am only responsible for what I said ... Not for what you understood or
how you interpreted my words!

Miss the compile output? Hint:
EMERGE_DEFAULT_OPTS="--quiet-build=n"
 
Old 02-13-2012, 09:58 PM
Neil Bothwick
 

On Tue, 14 Feb 2012 00:42:56 +0700, Pandu Poluan wrote:

> Hehhe... sorry, I'm on the road and don't have Gentoo on my
> smartphone :-P

Not even via SSH? :P


--
Neil Bothwick

If at first you don't succeed you'll get lots of advice.
 
