Linux Archive

"need a quick hashing method" (http://www.linux-archive.org/debian-user/455956-need-quick-hashing-method.html)

Arthur Bela 11-22-2010 11:17 AM

"need a quick hashing method"
 
HDD#1
HDD#2

I copy files between HDD#1 and HDD#2.

When I finish, I need a quick "hashing method" - I just want to check
that the copy was 100% ok.

md5sum and sha256sum are slow -> are there any "very fast" hash algorithms?
- just for checking whether the copied file is corrupt or not [I just need
to know if there was even a 1-byte error when copying]

I can write the script that checks the files, I just need a hash
algorithm ["software/command"] that is fast enough.

Thank you!

--
ubuntu-users mailing list
ubuntu-users@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users

--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/AANLkTinzUyqRsDJPyDg+gVL_SvS1ubz795FUD7k=JgYT@mail.gmail.com

"Robert P. J. Day" 11-22-2010 11:27 AM

"need a quick hashing method"
 
On Mon, 22 Nov 2010, Arthur Bela wrote:

> HDD#1
> HDD#2
>
> I copy files between HDD#1 and HDD#2.
>
> When i finish, i need a quick "hasing method" - i just want to check,
> that the copy was 100% ok.
>
> md5sum, sha256sum is slow -> are there any "very fast" hash algoritms?
> - just for checking if the copied file is corrupt or not [i just need
> to know, if there were even 1 Byte error when copying]
>
> i can write the script, that checks the files, i just need a hash
> algoritm [""software/command""], that is fast enough.

well, "sum" is fairly simple. not really robust but it should be
adequate for checking for inadvertent corruption.

rday

--

========================================================================
Robert P. J. Day Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
========================================================================


Loïc Grenié 11-22-2010 11:58 AM

"need a quick hashing method"
 
2010/11/22 Arthur Bela <jozsi.avadkan@gmail.com>:
> HDD#1
> HDD#2
>
> I copy files between HDD#1 and HDD#2.
>
> When i finish, i need a quick "hasing method" - i just want to check,
> that the copy was 100% ok.
>
> md5sum, sha256sum is slow -> are there any "very fast" hash algoritms?
> - just for checking if the copied file is corrupt or not [i just need
> to know, if there were even 1 Byte error when copying]
>
> i can write the script, that checks the files, i just need a hash
> algoritm [""software/command""], that is fast enough.

Since you want to verify everything, the best you can do is just
to use diff. No hashing, just reading the data and comparing
(the comparison itself is more than 10000 times faster than reading
from the disks, thus negligible).
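
For anyone who would rather script this than call diff, here is a minimal Python sketch of the same idea: read both files in blocks and compare the blocks directly, with no hash in between. The function name files_identical and the 1 MiB block size are my own choices for illustration, not something from this thread.

```python
import os


def files_identical(path_a, path_b, chunk_size=1 << 20):
    """Return True if both files contain exactly the same bytes."""
    # Different sizes can never match, so skip the reads entirely.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            if block_a != block_b:
                return False
            if not block_a:  # both files ended at the same point
                return True
```

Usage would be something like files_identical("/mnt/hdd1/file", "/mnt/hdd2/file"), where both paths are placeholders. Like any approach in this thread, it still has to read every byte from both disks.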

Hope this helps,

Loïc


Paul Tader 11-22-2010 12:28 PM

"need a quick hashing method"
 
On Nov 22, 2010, at 6:58 AM, Loïc Grenié <loic.grenie@gmail.com>
wrote:

> 2010/11/22 Arthur Bela <jozsi.avadkan@gmail.com>:
>> HDD#1
>> HDD#2
>>
>> I copy files between HDD#1 and HDD#2.
>>
>> When i finish, i need a quick "hasing method" - i just want to check,
>> that the copy was 100% ok.
>>
>> md5sum, sha256sum is slow -> are there any "very fast" hash
>> algoritms?
>> - just for checking if the copied file is corrupt or not [i just need
>> to know, if there were even 1 Byte error when copying]
>>
>> i can write the script, that checks the files, i just need a hash
>> algoritm [""software/command""], that is fast enough.
>
> Since you want to verify everything, the best you can do is just
> to use diff. No hashing, just reading the data and comparing
> (which is more than 10000 times faster than the reading on the
> disks, thus negligible).
>
> Hope this helps,
>
> Loïc

Or you might want to use the rsync command and its checksum (-c)
switch.



John Hasler 11-22-2010 12:32 PM

"need a quick hashing method"
 
Arthur writes:
> I copy files between HDD#1 and HDD#2.

> When i finish, i need a quick "hasing method" - i just want to check,
> that the copy was 100% ok.

Use rsync. It does checksums.
--
John Hasler



George 11-22-2010 12:37 PM

"need a quick hashing method"
 
On Mon, Nov 22, 2010 at 2:17 PM, Arthur Bela <jozsi.avadkan@gmail.com> wrote:
> HDD#1
> HDD#2
>
> I copy files between HDD#1 and HDD#2.
>
> When i finish, i need a quick "hasing method" - i just want to check,
> that the copy was 100% ok.
>

Comparing the hashes of two files is not enough to be certain that
they have identical contents. The only way to be, as you put it, 100%
sure is to check byte by byte, which will be much slower than hashing.



Jochen Schulz 11-22-2010 01:37 PM

"need a quick hashing method"
 
Arthur Bela:
>
> I copy files between HDD#1 and HDD#2.
>
> When i finish, i need a quick "hasing method" - i just want to check,
> that the copy was 100% ok.

Why do you want to hash? Hashing implies reading both trees completely,
computing hashes and comparing these hashes. It might be faster to just
compare contents directly, without the intermediate checksum. I would
benchmark 'rsync -c' against 'diff -r' and use the faster option.

J.
--
I will not admit to failure even when I know I am terribly mistaken and
have offended others.
[Agree] [Disagree]
<http://www.slowlydownward.com/NODATA/data_enter2.html>

Rashkae 11-22-2010 02:03 PM

"need a quick hashing method"
 
On 10-11-22 07:17 AM, Arthur Bela wrote:
> HDD#1
> HDD#2
>
> I copy files between HDD#1 and HDD#2.
>
> When i finish, i need a quick "hasing method" - i just want to check,
> that the copy was 100% ok.
>
> md5sum, sha256sum is slow -> are there any "very fast" hash algoritms?
> - just for checking if the copied file is corrupt or not [i just need
> to know, if there were even 1 Byte error when copying]
>
>

md5sum slow??? Are you trying to do this on a 386?

No matter how fast a hashing algorithm you use, if you are trying to
verify that not a single byte is corrupt, you need to read every byte
from the hard drive. The only way you will make this go faster is to
use a faster hard drive (or start experimenting with RAID 0 / RAID 5 arrays).



"Boyd Stephen Smith Jr." 11-27-2010 09:07 PM

"need a quick hashing method"
 
In <AANLkTinzUyqRsDJPyDg+gVL_SvS1ubz795FUD7k=JgYT@mail.gmail.com>, Arthur Bela
wrote:
>HDD#1
>HDD#2
>
>I copy files between HDD#1 and HDD#2.
>
>When i finish, i need a quick "hasing method" - i just want to check,
>that the copy was 100% ok.

Hashing is not 100%. It's probabilistic. If your hash result is N bits,
your data is longer than N bits, and your hash method is uniform, you'll get
roughly (1 - 1/2^N) accuracy. For MD5, N = 128 (SHA-1 has N = 160), so with MD5 you'll get
99.99999999999999999999999999999999999971% accuracy.
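
To put numbers on the 1/2^N figure, here is a small Python sketch that prints the chance of a corrupted copy slipping through for a few common digest sizes. It rests on the same assumption as above (a perfectly uniform hash); the function name collision_chance and the choice of algorithms listed are mine, for illustration only.

```python
from fractions import Fraction


def collision_chance(bits):
    """Chance that a corrupted copy still gets the same N-bit hash,
    assuming the hash spreads its outputs uniformly."""
    return Fraction(1, 2 ** bits)


# Digest sizes in bits for a few well-known algorithms.
for name, bits in [("CRC32", 32), ("MD5", 128), ("SHA-1", 160), ("SHA-256", 256)]:
    chance = collision_chance(bits)
    print(f"{name:8s} 1 in 2^{bits} (about {float(chance):.3e})")
```

Fraction keeps the value exact; the float conversion is only there to get a readable order of magnitude.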

>md5sum, sha256sum is slow -> are there any "very fast" hash algoritms?

Not that are cryptographically secure. CRC32 and whatever the UNIX sum
program does can catch errors, but are easily deceived by an attacker.
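
If a CRC32 is good enough for catching accidental corruption, a rough Python sketch using the standard zlib module looks like the following. The function name crc32_of_file and the 1 MiB block size are just illustrative choices, and as said above this gives no protection against deliberate tampering.

```python
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it block by block."""
    crc = 0
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for block in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(block, crc)  # feed the running value back in
    return crc & 0xFFFFFFFF  # force an unsigned 32-bit result
```

You would run it on the source and on the copy and compare the two integers. It still reads every byte of both files, so the disks remain the bottleneck.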

>- just for checking if the copied file is corrupt or not [i just need
>to know, if there were even 1 Byte error when copying]
>
>i can write the script, that checks the files, i just need a hash
>algoritm [""software/command""], that is fast enough.

The slowest part of any such script is not the hash function, it's all the I/O
to the disks. Direct byte-by-byte comparison will do the same amount of I/O
(reading every byte of every file), and spend no time in a hash routine.
There will be a much larger number of comparisons done, but these are very
fast compared to the machinations of even a simple hash routine.

A recursive diff will probably be faster than whatever script you might be
writing.
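
If you do end up scripting it anyway, a recursive comparison can be sketched in Python with the standard filecmp module, roughly as below. The function name differing_files is hypothetical, and this sketch only walks the source tree, so extra files that exist only on the destination are not reported.

```python
import filecmp
import os


def differing_files(src_root, dst_root):
    """Yield paths (relative to src_root) that are missing under
    dst_root or whose contents differ from the source file."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            # shallow=False makes filecmp compare the actual bytes
            # instead of trusting size and modification time.
            if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
                yield rel
```

Something like list(differing_files("/mnt/hdd1/data", "/mnt/hdd2/data")) would return an empty list when the copy is clean; the two paths are placeholders.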
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
bss@iguanasuicide.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/ \_/

