Old 12-02-2009, 08:56 AM
hadi motamedi
 
Default Inquiry:How to compare two files but not in line-by-line basis?

But "diff -y" compares the two files line by line. My two files do not have a one-to-one correspondence: row #1 in file1 may be the same as, say, row #5 in file2. So I am looking for a way that does not treat this as a difference (diff does).

On Wed, Dec 2, 2009 at 9:47 AM, Brian McKerr <bmckerr@gmail.com> wrote:

diff -y ?

On Wed, Dec 2, 2009 at 7:42 PM, Simon Banton <centos@web.org.uk> wrote:

At 08:54 +0000 2/12/09, hadi motamedi wrote:
>Dear All
>Can you please do me favor and let me know how can I compare two
>files but not in line-by-line basis on my CentOS server ? I mean say
>row#1 in file1 has the same data as say row#5 in file2 , but the
>comm compares them in line-by-line basis that is not intended . It
>seems that the diff cannot do the job as well

This'll show you which lines are common to both files, and for the
ones that aren't which file they're in.


perl -MData::Dumper -le 'while(<>) {chomp; push @{$s->{"$_"}},
$ARGV}; END{ print Dumper($s) }' file1 file2

... someone will be along shortly with a more elegant method.

HTH


S.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
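
For readers without Perl at hand, the same idea (map every distinct line to the files it appears in) can be sketched with awk. This is only an illustration and not something posted in the thread: the function name lines_by_file is made up, and unlike the Perl one-liner it records each file name once per line instead of once per occurrence.

```shell
# Hypothetical helper (not from the thread): for every distinct line,
# print the line followed by the names of the files that contain it.
lines_by_file() {
    awk '
        # Record each (line, file) pair only once.
        !seen[$0, FILENAME]++ { where[$0] = where[$0] " " FILENAME }
        # awk does not guarantee the order of "for (x in array)",
        # so the result is piped through sort below.
        END { for (line in where) print line ":" where[line] }
    ' "$@" | sort
}
```

Called as "lines_by_file file1 file2", it prints one row per distinct line; a row that names both files is a line common to both, wherever it sits in either file.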



 
Old 12-02-2009, 09:01 AM
Paul Bijnens
 
Default Inquiry:How to compare two files but not in line-by-line basis?

On 2009-12-02 10:56, hadi motamedi wrote:
> But "#diff -y" compares the two files in line-by-line basis . But my two
> files do not have one-to-one correspondence , say row#1 in file1 maybe
> the same as say row#5 in file2 . So I seek a way that does not consider
> this as a difference (but diff will consider).

(( First, please do not top-post. ))

"diff" would match the line2 in file1 with the line5 in file2,
and it would mark that some lines were inserted there.

I think you'll have to specify more what you mean by "compare",
and what you think is different or same.
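
A small demonstration of that behaviour, as a sketch only: the function name and the three-line sample files below are invented for illustration. The same three lines appear in both files, but "alpha" has moved from the top to the bottom.

```shell
# Hypothetical demo (function name and file contents are made up).
show_moved_line_diff() {
    dir=$(mktemp -d) || return 1
    printf 'alpha\nbeta\ngamma\n' > "$dir/file1"
    printf 'beta\ngamma\nalpha\n' > "$dir/file2"
    # diff exits with status 1 when the files differ; that is expected here.
    diff "$dir/file1" "$dir/file2"
    rm -rf "$dir"
}
```

diff lines up "beta" and "gamma" and reports the moved line as a deletion at the top of file1 (1d0) plus an insertion at the end (3a3). So it does find matching lines at different positions, but a line that merely changed position still shows up in the output.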




--
Paul Bijnens, Xplanation Technology Services Tel +32 16 397.525
Interleuvenlaan 86, B-3001 Leuven, BELGIUM Fax +32 16 397.552
***********************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^, ^Z, ^Q, ^^, *
* quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, ~., *
* stop, end, ^]c, +++ ATH, disconnect, halt, abort, hangup, KJOB, *
* ^X^X, :, kill -9 1, kill -1 $$, shutdown, init 0, Alt-F4, *
* Alt-f-e, Ctrl-Alt-Del, Alt-SysRq-reisub, Stop-A, AltGr-NumLock, ... *
* ... "Are you sure?" ... YES ... Phew ... I'm out *
***********************************************************************
 
Old 12-02-2009, 09:10 AM
hadi motamedi
 
Default Inquiry:How to compare two files but not in line-by-line basis?

Thank you very much for your reply. This code actually solved my problem and returned the exact matches between the two files (irrespective of their position in the files). As I understand it, it lists each line and shows which file it belongs to (or that it is common to both files). It is really what I wanted.

Sincerely Yours


On Wed, Dec 2, 2009 at 9:42 AM, Simon Banton <centos@web.org.uk> wrote:
> This'll show you which lines are common to both files, and for the
> ones that aren't which file they're in.
>
> perl -MData::Dumper -le 'while(<>) {chomp; push @{$s->{"$_"}}, $ARGV}; END{ print Dumper($s) }' file1 file2
 
Old 12-02-2009, 09:16 AM
Paul Bijnens
 
Default Inquiry:How to compare two files but not in line-by-line basis?

On 2009-12-02 11:10, hadi motamedi wrote:
> Thank you very much for your reply . This code actually solved my
> problem and returned exact matches between the two files (irrespective
> of their location in the files) . As I understood , it will list each
> data showing to which file it belongs (or it is common to both files) .
> It is really what I wanted .

(( do not top-post ))

You could do the same by first sorting the two files, and then use "comm".
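
A minimal sketch of that suggestion, assuming temporary copies are acceptable: the function name compare_unordered is hypothetical, and the optional third argument is simply passed through to comm.

```shell
# Hypothetical helper (not from the thread): classify the lines of two
# files regardless of their order by sorting copies before running comm.
compare_unordered() {
    file1=$1
    file2=$2
    shift 2
    sorted1=$(mktemp) && sorted2=$(mktemp) || return 1
    # comm requires sorted input; LC_ALL=C keeps sort and comm in agreement.
    LC_ALL=C sort "$file1" > "$sorted1"
    LC_ALL=C sort "$file2" > "$sorted2"
    # Column 1: only in file 1, column 2: only in file 2, column 3: in both.
    LC_ALL=C comm "$@" "$sorted1" "$sorted2"
    rm -f "$sorted1" "$sorted2"
}
```

"compare_unordered file1 file2" prints all three columns; "compare_unordered file1 file2 -12" prints only the lines common to both files, and "-23" only the lines unique to file1. Because both inputs are sorted first, the original row positions no longer matter.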


 
Old 12-02-2009, 09:20 AM
hadi motamedi
 
Default Inquiry:How to compare two files but not in line-by-line basis?

On Wed, Dec 2, 2009 at 10:01 AM, Paul Bijnens <Paul.Bijnens@xplanation.com> wrote:
> "diff" would match the line2 in file1 with the line5 in file2,
> and it would mark that some lines were inserted there.
>
> I think you'll have to specify more what you mean by "compare",
> and what you think is different or same.

Sorry. I tried "diff -y", but its output seems to compare the two files line by line. As you mentioned, if row #1 in file1 matches, say, row #5 in file2, I want it not to be considered a difference. But the output shows it as if it is being considered a difference. Please correct me.
 
Old 12-02-2009, 09:24 AM
hadi motamedi
 
Default Inquiry:How to compare two files but not in line-by-line basis?

On Wed, Dec 2, 2009 at 10:16 AM, Paul Bijnens <Paul.Bijnens@xplanation.com> wrote:
> You could do the same by first sorting the two files, and then use "comm".

The two files are assorted ones, but comm will compare them line by line. If row #1 in file1 is equal to, say, row #5 in file2, I want it not to be considered a difference.
 
Old 12-02-2009, 10:23 AM
John Doe
 
Default Inquiry:How to compare two files but not in line-by-line basis?

From: hadi motamedi <motamedi24@gmail.com>
>Sorry . I tried for "#diff -y" but its output seems to have a comparison between the two files in line-by-line basis . As you mentioned , if the row#1 in file1 is in match with say row#5 in file2 I want it not to be considered as a difference. But the output shows it as if it is being considered as a difference. Please correct me .

Could you be more precise when you say "compare"...?
For example, to get matching lines, you could:

cat "$FILE1" "$FILE2" | sort | uniq -c | ...

You'd get each line preceded by its number of occurrences; then grep what you want...

JD
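
One way the elided last stage of that pipeline could look, as a sketch under assumptions rather than what was actually meant: the function name lines_in_both is invented, and the "sort -u" step is an addition so that a line repeated inside a single file is not mistaken for a match between the two files.

```shell
# Hypothetical helper (not from the thread): print the lines that occur
# in both files, whatever their position.
lines_in_both() {
    # sort -u drops duplicates inside each file first, so a count of 2
    # from uniq -c can only mean "present in both files".
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -c |
        awk '$1 == 2 { sub(/^ *[0-9]+ /, ""); print }'
}
```

The awk stage keeps the rows whose count is 2 and strips the count that uniq -c put in front. Selecting a count of 1 instead would list the lines that appear in only one of the files.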



 
Old 12-03-2009, 04:49 AM
hadi motamedi
 
Default Inquiry:How to compare two files but not in line-by-line basis?

On Wed, Dec 2, 2009 at 11:23 AM, John Doe <jdmls@yahoo.com> wrote:
> For example, to get matching lines, you could:
>
> cat $FILE1 $FILE2 | sort | uniq -c | ...
>
> You'd get each line preceded by its number of occurrences; then grep what you want...


Thank you very much for your reply. I compared the files with your proposed command, as follows:
# cat Edit3 Edit4 | sort | uniq -c
It returns the same match counts as I got from the following command:
# perl -MData::Dumper -le 'while(<>) {chomp; push @{$s->{"$_"}},$ARGV}; END{ print Dumper($s) }' Edit3 Edit4
But it is easier to use. Could you please let me know whether I can go further and try a more advanced search, such as finding how many rows in a file have data that does not start with a zero after the third comma?

Sincerely Yours
 
Old 12-03-2009, 09:20 AM
John Doe
 
Default Inquiry:How to compare two files but not in line-by-line basis?

From: hadi motamedi <motamedi24@gmail.com>
> Can you please do me favor and let me know if I can go further and try for advanced search like finding how many rows inside a file have data that does not start with a zero after the third comma ?

Something like:
awk -F, '{ print $4 }' filename | grep -v "^0" | wc -l
Use one command at a time to see how they work with each other (you might have to modify the grep a bit)...

JD
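
The pipeline above can be wrapped in a small function to try it on a file. This is a sketch only: the function name is made up, and it assumes plain comma-separated rows with no quoted fields and no space after the commas.

```shell
# Hypothetical helper (not from the thread): count the rows whose fourth
# comma-separated field (the data after the third comma) does not start
# with a zero.
count_not_zero_after_third_comma() {
    # tr strips the padding that some wc implementations put before the number.
    awk -F, '{ print $4 }' "$1" | grep -v '^0' | wc -l | tr -d ' '
}
```

Note that a row with fewer than four fields yields an empty fourth field, which this sketch counts as "not starting with a zero"; whether that is wanted depends on the data.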



 
Old 12-03-2009, 11:42 AM
mark
 
Default Inquiry:How to compare two files but not in line-by-line basis?

John Doe wrote:
> From: hadi motamedi <motamedi24@gmail.com>
>> Can you please do me favor and let me know if I can go further and try for
>> advanced search like finding how many rows inside a file have data that
>> does not start with a zero after the third comma ?
>
> Something like: awk -F, ' { print $4 } ' | grep -v "^0" | wc -l Use one
> command at a time to see how they work with each other (you might have to
> modify the grep a bit)...

*sigh*

Drive me crazy, why use multiple commands?

awk 'BEGIN { FS = "," } { if ($4 !~ /^0/) count++ } END { print count + 0 }' filename

mark "why, yes, since you ask, I *have* written 100 and
200 line awk scripts"
--
Though I don't think (object-oriented programming) has much to offer good
programmers, except in certain specialized domains, it is irresistible to
large organizations. Object-oriented programming offers a sustainable way
to write spaghetti code. - Paul Graham