03-19-2009, 10:55 AM (GMT)
Thomas Koch

Bug#520401: ITP: simhash -- generate similarity hashes to find nearly duplicate files

Package: wnpp
Severity: wishlist
Owner: Thomas Koch <thomas@koch.ro>


* Package name    : simhash
  Version         : only GIT, no releases
  Upstream Author : Bart Massey
* URL             : http://wiki.cs.pdx.edu/forge/simhash.html
* License         : BSD
  Programming Lang: C
  Description     : generate similarity hashes to find nearly duplicate files
One of the questions that it's nice to be able to answer about a pair of files
is the degree of similarity between them. This command-line tool is useful for
estimating the "degree of similarity" between a pair of nominally sequential
files such as text files. The tool uses Manasse's "shingleprinting" technique;
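
To illustrate the general idea behind shingle-based similarity estimation, here is a minimal Python sketch. It is not simhash's actual code (the tool itself is written in C), and every name and parameter in it (shingle_hashes, sketch, similarity, the 8-byte window, the 128-entry fingerprint, CRC-32 as the hash) is an assumption chosen for illustration. Each file is reduced to the smallest hashes of its overlapping byte windows, and two files are compared by how many of those hashes they share.

```python
import zlib


def shingle_hashes(data, width=8):
    """Hash every contiguous `width`-byte window (shingle) of `data`."""
    if not data:
        return set()
    if len(data) < width:
        # Input shorter than one window: treat it as a single shingle.
        return {zlib.crc32(data)}
    return {zlib.crc32(data[i:i + width])
            for i in range(len(data) - width + 1)}


def sketch(data, width=8, size=128):
    """Keep only the `size` smallest shingle hashes as a compact fingerprint."""
    return sorted(shingle_hashes(data, width))[:size]


def similarity(a, b, size=128):
    """Estimate resemblance of two sketches, from 0.0 to 1.0.

    Looks at the `size` smallest hashes of the combined sketches and
    returns the fraction of them that occur in both inputs.
    """
    if not a and not b:
        return 1.0
    smallest = sorted(set(a) | set(b))[:size]
    common = set(a) & set(b)
    return sum(1 for h in smallest if h in common) / len(smallest)
```

A caller would compute sketch() once per file, store the result, and compare any two stored sketches with similarity() without re-reading the files. Because only a fixed number of hashes is kept per file, the result is an estimate rather than an exact measure.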



 
