08-04-2008, 12:26 PM
Vincent Lefevre
diffing two large compressed (.bz2 or .lzma) files?

Is there a utility that can efficiently output the differences between
two large compressed files? Note: one can assume that the compressed
files just differ in a few places, so that the utility MUST NOT take
more than a few megabytes (whether in RAM, swap or disk).

bzdiff (from the bzip2 package) first decompresses one of the files to
a temporary file, so it is not a solution (it filled up my partition!).

I've also tried process substitution (with zsh, but this is also
supported by bash):

  diff <(bunzip2 -c file1.bz2) <(bunzip2 -c file2.bz2)

and

  diff --speed-large-files <(bunzip2 -c file1.bz2) <(bunzip2 -c file2.bz2)

but in both cases, diff uses too much swap (I think the problem with
process substitution is that diff cannot control how the files are
decompressed, but perhaps diff doesn't cope well with this either).

I've taken the example of .bz2, but I may switch to lzma. So, I'm
interested in possibilities for both.
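
One possible direction, sketched here under a strong assumption: if the
differences are in-place changes (no insertions or deletions that shift
the rest of the data), a byte-wise comparison with cmp reads both
streams in lockstep and does not need to buffer them the way diff does.
The helper name stream_cmp and its argument order are hypothetical; the
decompressor is passed as a command string so that the same function
can serve .bz2 and .lzma files. A named pipe is used instead of process
substitution so that it also works in a plain POSIX sh.

```shell
# stream_cmp DECOMPRESSOR FILE1 FILE2   (hypothetical helper, not an existing tool)
# Decompresses both files on the fly and lists every differing byte with
# "cmp -l" (byte offset, then the two byte values in octal).
# Assumption: the files differ by in-place changes only; after an
# insertion or deletion, cmp would report almost every later byte.
stream_cmp() (
    decomp=$1 first=$2 second=$3

    # Only an empty FIFO lives on disk; no decompressed data is stored.
    dir=$(mktemp -d) || exit 2
    mkfifo "$dir/second" || { rm -rf "$dir"; exit 2; }

    # $decomp is deliberately unquoted so "bunzip2 -c" splits into words.
    $decomp "$second" > "$dir/second" &

    status=0
    $decomp "$first" | cmp -l - "$dir/second" || status=$?

    wait
    rm -rf "$dir"
    exit "$status"    # 0: identical, 1: different, 2: trouble (as for cmp)
)
```

It could be called as stream_cmp "bunzip2 -c" file1.bz2 file2.bz2, or
with "lzma -dc" for .lzma files. Without the -l option, cmp would stop
at the first difference instead of listing all of them.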

Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)

