Linux Archive > Gentoo > Gentoo User

 
 
 
Old 06-05-2010, 06:39 AM
 
Fast checksumming of whole partitions

Hi,

last night dd copied the contents of my first
1TB disk to my second 1TB disk (same model).

(dd if=/dev/sda of=/dev/sdb bs=4096)

I want to verify that the copy is identical.

I tried (or: I am still trying) to checksum
the first disk with

whirlpooldeep /dev/sda

which seems to work but is DAMN slow for
checksumming a whole terabyte.

Is there any faster yet reliable way to checksum
whole partitions (not on a per-file basis)?

Thank you very much in advance for any help!

Best regards,
mcc

--
Please don't send me any Word or PowerPoint attachments
unless it's absolutely necessary. - Simply send plain text.
See http://www.gnu.org/philosophy/no-word-attachments.html
In a world without fences and walls nobody needs gates and windows.
 
Old 06-05-2010, 07:19 AM
Nikos Chantziaras
 
Fast checksumming of whole partitions

On 06/05/2010 09:39 AM, meino.cramer@gmx.de wrote:


Hi,

last night dd copied the contents of my first
1TB disk to my second 1TB disk (same model).

(dd if=/dev/sda of=/dev/sdb bs=4096)

I want to verify that the copy is identical.

I tried (or: I am still trying) to checksum
the first disk with

whirlpooldeep /dev/sda

which seems to work but is DAMN slow for
checksumming a whole terabyte.

Is there any faster yet reliable way to checksum
whole partitions (not on a per-file basis)?

Thank you very much in advance for any help!

Best regards,
mcc


Computing a checksum means reading every byte off the partition. So
it can never be faster than a copy to /dev/null, only slower (because
the checksum calculation itself also takes time).

So, to determine whether it is really slow, compare the time needed
to dd the whole partition to /dev/null with the time needed to
checksum it. Then post the times here and an expert may be able to
tell whether this can be improved at all.
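As a rough sketch of that comparison (a 64 MiB scratch file stands in for /dev/sda here, so the sketch can be tried anywhere; substitute the real device when you actually measure):

```shell
# Compare raw read time against checksum time over the same data.
src=$(mktemp)
dd if=/dev/zero of="$src" bs=1M count=64 2>/dev/null

# 1) Raw read, no hashing -- this is the throughput ceiling:
time dd if="$src" of=/dev/null bs=1M

# 2) The same read plus hash computation:
time md5sum "$src"

rm -f "$src"
```

If the two times come out close, the disk is the bottleneck and a faster hash won't buy you anything.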
 
Old 06-05-2010, 07:32 AM
Andrea Conti
 
Fast checksumming of whole partitions

> Is there any faster yet reliable way to checksum
> whole partitions (not on a per-file basis)?

It depends on where your bottleneck is...

If you're CPU-bound you can try a faster hash: md5sum or even
md4sum would be a good choice (collision resistance is irrelevant in
this application).
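A quick way to see how much the hash choice matters (a sketch; the 32 MiB scratch file and the choice of sha512sum as the "heavy" contrast are illustrative):

```shell
# Compare the CPU cost of a cheap hash vs. a heavy one on cached data.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=32 2>/dev/null
cat "$f" > /dev/null   # warm the page cache, so we time the hash, not the disk

time md5sum "$f"       # cheap hash
time sha512sum "$f"    # much heavier hash, for contrast

rm -f "$f"
```

Since collision resistance is irrelevant here, the cheapest hash available everywhere is a sensible default.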

On the other hand, if you're limited by disk throughput (which is most
likely) there is not much you can do. After all, you have to read the
data in order to hash it, and that takes time.

andrea
 
Old 06-05-2010, 05:39 PM
7v5w7go9ub0o
 
Fast checksumming of whole partitions

On 06/05/10 02:39, meino.cramer@gmx.de wrote:
[]
>
> Is there any faster yet reliable way to checksum whole partitions (not
> on a per-file basis)?


FWIW, portage has a tool called "dcfldd" that works well for me. It is
dd with the addition of:

* Hashing on-the-fly - dcfldd can hash the input data as it is
being transferred, helping to ensure data integrity.
* Status output - dcfldd can update the user on its progress in
terms of the amount of data transferred and how much longer the
operation will take.
* Flexible disk wipes - dcfldd can be used to wipe disks quickly and
with a known pattern if desired.
* Image/wipe Verify - dcfldd can verify that a target drive is a
bit-for-bit match of the specified input file or pattern.
* Multiple outputs - dcfldd can output to multiple files or disks at
the same time.
* Split output - dcfldd can split output to multiple files with more
configurability than the split command.
* Piped output and logs - dcfldd can send all its log data and
output to commands as well as files natively.


e.g. when I copy my HD, I get a copy status report and hash by using the
following commands:

#!/bin/bash
dcfldd if=/dev/sda bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/sdb
dcfldd if=/dev/sdb bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/null

When they've completed, I'll visually compare the two hashes (you can
automate this.) You can get fancier and do the Verify instead of the hashes.

HTH

(P.S. Part of your answer is choosing the best block size for dd or
dcfldd.

I'd presume it's the smaller of your available memory and the buffer
size on your HD?... someone please correct me on this!)
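One way to settle the block-size question empirically (a sketch: a scratch file stands in for the disk here; on a real device you would use if=/dev/sda, and with GNU dd add iflag=direct to bypass the page cache):

```shell
# Sweep a few dd block sizes and compare the reported throughput.
src=$(mktemp)
dd if=/dev/zero of="$src" bs=1M count=64 2>/dev/null

for bs in 4k 64k 1M 4M; do
    echo "bs=$bs"
    # GNU dd prints a summary line with the throughput on stderr:
    dd if="$src" of=/dev/null bs="$bs" 2>&1 | tail -n 1
done

rm -f "$src"
```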
 
Old 06-05-2010, 07:23 PM
 
Fast checksumming of whole partitions

7v5w7go9ub0o <7v5w7go9ub0o@gmail.com> [10-06-05 20:22]:
> On 06/05/10 02:39, meino.cramer@gmx.de wrote:
> []
> >
> > Is there any faster yet reliable way to checksum whole partitions (not
> > on a per-file basis)?
>
>
> FWIW, portage has a tool called "dcfldd" that works well for me. It is
> dd with the addition of:
>
> * Hashing on-the-fly - dcfldd can hash the input data as it is
> being transferred, helping to ensure data integrity.
> * Status output - dcfldd can update the user on its progress in
> terms of the amount of data transferred and how much longer the
> operation will take.
> * Flexible disk wipes - dcfldd can be used to wipe disks quickly and
> with a known pattern if desired.
> * Image/wipe Verify - dcfldd can verify that a target drive is a
> bit-for-bit match of the specified input file or pattern.
> * Multiple outputs - dcfldd can output to multiple files or disks at
> the same time.
> * Split output - dcfldd can split output to multiple files with more
> configurability than the split command.
> * Piped output and logs - dcfldd can send all its log data and
> output to commands as well as files natively.
>
>
> e.g. when I copy my HD, I get a copy status report and hash by using the
> following commands:
>
> #!/bin/bash
> dcfldd if=/dev/sda bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/sdb
> dcfldd if=/dev/sdb bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/null
>
> When they've completed, I'll visually compare the two hashes (you can
> automate this.) You can get fancier and do the Verify instead of the hashes.
>
> HTH
>
> (P.S. Part of your answer is choosing the best block size for dd or
> dcfldd.
>
> I'd presume it's the smaller of your available memory and the buffer
> size on your HD?... someone please correct me on this!)
>
>

That looks really interesting. The only problem I have with this is
that I need both /dev/sda and /dev/sdb idle (not mounted), and
because of that I use Knoppix as a temporary system to boot. And I
don't think that Knoppix has this tool on board.

Or is there a way to do such copies from one disk to another
while one of the disks is the booted system?

Best regards,
mcc

--
Please don't send me any Word or PowerPoint attachments
unless it's absolutely necessary. - Simply send plain text.
See http://www.gnu.org/philosophy/no-word-attachments.html
In a world without fences and walls nobody needs gates and windows.
 
Old 06-05-2010, 08:11 PM
Manuel Klemenz
 
Fast checksumming of whole partitions

I'm calculating checksums over partitions just by calling
# md5sum /dev/sda1
or for the complete disk (incl. partition table + all partitions)
# md5sum /dev/sda

That's it - this works with any distro/live DVD.
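Comparing the two results can also be scripted rather than done by eye (a sketch; two scratch files stand in for /dev/sda and /dev/sdb so it runs anywhere):

```shell
# Hash the source and the copy, then compare programmatically.
src=$(mktemp); dst=$(mktemp)
printf 'disk image contents' > "$src"
cp "$src" "$dst"                       # stands in for the dd copy

if [ "$(md5sum < "$src")" = "$(md5sum < "$dst")" ]; then
    echo "copies are identical"
else
    echo "MISMATCH - recopy"
fi
# prints: copies are identical

rm -f "$src" "$dst"
```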

--
Cheers,
Manuel Klemenz

On Saturday 05 June 2010 21:23:31 meino.cramer@gmx.de wrote:
> 7v5w7go9ub0o <7v5w7go9ub0o@gmail.com> [10-06-05 20:22]:
> > On 06/05/10 02:39, meino.cramer@gmx.de wrote:
> > []
> > >
> > > Is there any faster yet reliable way to checksum whole partitions (not
> > > on a per-file basis)?
> >
> > FWIW, portage has a tool called "dcfldd" that works well for me. It is
> > dd with the addition of:
> >
> > * Hashing on-the-fly - dcfldd can hash the input data as it is
> > being transferred, helping to ensure data integrity.
> > * Status output - dcfldd can update the user on its progress in
> > terms of the amount of data transferred and how much longer the
> > operation will take.
> > * Flexible disk wipes - dcfldd can be used to wipe disks quickly and
> > with a known pattern if desired.
> > * Image/wipe Verify - dcfldd can verify that a target drive is a
> > bit-for-bit match of the specified input file or pattern.
> > * Multiple outputs - dcfldd can output to multiple files or disks at
> > the same time.
> > * Split output - dcfldd can split output to multiple files with more
> > configurability than the split command.
> > * Piped output and logs - dcfldd can send all its log data and
> > output to commands as well as files natively.
> >
> > e.g. when I copy my HD, I get a copy status report and hash by using the
> > following commands:
> >
> > #!/bin/bash
> > dcfldd if=/dev/sda bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/sdb
> > dcfldd if=/dev/sdb bs=4096k sizeprobe=if status=on hashwindow=0 of=/dev/null
> >
> > When they've completed, I'll visually compare the two hashes (you can
> > automate this.) You can get fancier and do the Verify instead of the hashes.
> >
> > HTH
> >
> > (P.S. Part of your answer is choosing the best block size for dd or
> > dcfldd.
> >
> > I'd presume it's the smaller of your available memory and the buffer
> > size on your HD?... someone please correct me on this!)
>
> That looks really interesting. The only problem I have with this is
> that I need both /dev/sda and /dev/sdb idle (not mounted), and
> because of that I use Knoppix as a temporary system to boot. And I
> don't think that Knoppix has this tool on board.
>
> Or is there a way to do such copies from one disk to another
> while one of the disks is the booted system?
>
> Best regards,
> mcc
 
Old 06-05-2010, 11:44 PM
7v5w7go9ub0o
 
Fast checksumming of whole partitions

On 06/05/10 15:23, meino.cramer@gmx.de wrote:

[]
> That looks really interesting. The only problem I have with this is
> that I need both /dev/sda and /dev/sdb idle (not mounted), and
> because of that I use Knoppix as a temporary system to boot. And I
> don't think that Knoppix has this tool on board.

Just boot up Knoppix, mount the root partition that contains dcfldd, and go
to wherever the executable is located (e.g. /usr/bin/dcfldd):

1. boot up Knoppix
2. create a mount point: mkdir /work
3. mount the root partition at /work: mount /dev/sdc /work
4. cd /work/usr/bin
5. run dcfldd: ./dcfldd

If your root partition is encrypted (e.g. mine is), then place a copy of
dcfldd on the boot partition; if there is no boot partition, put a copy on
its own dedicated little partition.

Of course, you can always put a copy on a USB jumpdrive. As a last
alternative, download and compile a copy while in knoppix.

> Or is there a way to do such copies from a one disk to another while
> one disk is booted???

Sure, but the running disk/partition would have temporary files that would
not hash consistently when you did the hash check. If you do this, try
it in Linux without bringing up X. This might avoid copying some software
"locks" that could block startup on the copied disk/partition.

HTH
 
Old 06-06-2010, 10:19 AM
Andrea Conti
 
Fast checksumming of whole partitions

> 1. boot up Knoppix
> 2. create a mount point: mkdir /work
> 3. mount the root partition at /work: mount /dev/sdc /work
> 4. cd /work/usr/bin
> 5. run dcfldd: ./dcfldd

This is fine, provided that

1- if the root partition is [part of] what you're copying, you *must*
mount it read-only (mount -o ro /dev/sdc /work)

2- the dcfldd executable is linked statically. If it uses dynamic
linking, your "live" system -- knoppix in this case -- must have exactly
the same library versions (especially glibc) as the gentoo system.

>> Or is there a way to do such copies from one disk to another while
>> one of the disks is the booted system?

The point is not whether the partition is "booted" (i.e., part of the
running system): you *cannot* reliably perform a sector-by-sector copy of
any write-mounted partition without special support at either the FS or
block device level (i.e. snapshots).

> Sure, but the running disk/sector would have temporary files that would
> not consistently hash when you did the hash check.

That is only a minor part of the problem. The real issue is that if
*anything* writes to the source partition while you are halfway through
the copy, you risk ending up with inconsistencies in the filesystem
metadata. Doing a fsck on the copy will probably fix that, but you risk
losing or corrupting data.

And no, hashing as described in the previous post will *not* catch any
differences in this case, as the "source" hash is computed from what is
read during the copy (which, barring hardware problems, is what gets
written on the target disk) and not from the whole contents of the
source partition after the copy (or at any single point in time).

> If you do this, try it in linux without bringing up X.

That's definitely not enough: at the very least, boot up in single-user
mode and remount all your partitions read-only (mount -o remount,ro).
This will break things on a running system (e.g. anything that writes to
/var and /tmp will throw errors or stop working), but it will allow you
to produce consistent partition images.

andrea
 
Old 06-06-2010, 04:55 PM
Mick
 
Fast checksumming of whole partitions

On Sunday 06 June 2010 11:19:57 Andrea Conti wrote:
> > 1. boot up Knoppix
> > 2. create a mount point: mkdir /work
> > 3. mount the root partition at /work: mount /dev/sdc /work
> > 4. cd /work/usr/bin
> > 5. run dcfldd: ./dcfldd
>
> This is fine, provided that
>
> 1- if the root partition is [part of] what you're copying, you *must*
> mount it read-only (mount -o ro /dev/sdc /work)
>
> 2- the dcfldd executable is linked statically. If it uses dynamic
> linking, your "live" system -- knoppix in this case -- must have exactly
> the same library versions (especially glibc) as the gentoo system.
>
> >> Or is there a way to do such copies from one disk to another while
> >> one of the disks is the booted system?
>
> The point is not whether the partition is "booted" (i.e., part of the
> running system): you *cannot* reliably perform a sector-by-sector copy of
> any write-mounted partition without special support at either the FS or
> block device level (i.e. snapshots).
>
> > Sure, but the running disk/sector would have temporary files that would
> > not consistently hash when you did the hash check.
>
> That is only a minor part of the problem. The real issue is that if
> *anything* writes to the source partition while you are halfway through
> the copy, you risk ending up with inconsistencies in the filesystem
> metadata. Doing a fsck on the copy will probably fix that, but you risk
> losing or corrupting data.
>
> And no, hashing as described in the previous post will *not* catch any
> differences in this case, as the "source" hash is computed from what is
> read during the copy (which, barring hardware problems, is what gets
> written on the target disk) and not from the whole contents of the
> source partition after the copy (or at any single point in time).
>
> > If you do this, try it in linux without bringing up X.
>
> That's definitely not enough: at the very least, boot up in single-user
> mode and remount all your partitions read-only (mount -o remount,ro).
> This will break things on a running system (e.g. anything that writes to
> /var and /tmp will throw errors or stop working), but it will allow you
> to produce consistent partition images.

It may be worth trying 'apt-get install dcfldd' after you su to root in
Knoppix. As long as Knoppix does not need a lorry-load of dependencies,
you may be able to quickly install the .deb binary you need and get on
with the task in hand.
--
Regards,
Mick
 
Old 06-06-2010, 06:55 PM
7v5w7go9ub0o
 
Fast checksumming of whole partitions

On 06/06/10 06:19, Andrea Conti wrote:
>> 1. boot up Knoppix
>> 2. create a mount point: mkdir /work
>> 3. mount the root partition at /work: mount /dev/sdc /work
>> 4. cd /work/usr/bin
>> 5. run dcfldd: ./dcfldd
>
> This is fine, provided that
>
> 1- if the root partition is [part of] what you're copying, you
> *must* mount it read-only (mount -o ro /dev/sdc /work)

Not in my experience; I simply mount, exec, and go - it works fine, be it
a partition or a disk copy (though it seems likely that the last-access
dates would be changed, if forensics is a concern).
>
> 2- the dcfldd executable is linked statically. If it uses dynamic
> linking, your "live" system -- knoppix in this case -- must have
> exactly the same library versions (especially glibc) as the gentoo
> system.

Good point. I've been using a contemporary Gentoo live disk and the
libraries happen to be compatible:

# ldd /usr/bin/dcfldd
linux-vdso.so.1 => (0x00006cdd998b6000)
libc.so.6 => /lib/libc.so.6 (0x00006cdd99341000)
/lib64/ld-linux-x86-64.so.2 (0x00006cdd9969b000)



Based on this thread, I'll be running my backups from a
statically-linked copy of dcfldd on a "jumpdisk" (with a backup copy on
the boot partition).

- Any advice on the dd "blocksize" parameter?
 
