Old 12-21-2010, 04:33 PM
 
Text Processing script - advice?

> sort -t ',' -k 3,3 -k 4,4 file.log # this will sort the file according to the DATE field as well as the TIME field.
> I've been stuck for the last 30 min trying to find a way to get the first line of each day (logically it'll be the earliest, as I've sorted by date/time previously). Once I know how to do this, I'll be able to compare times and proceed.

If you're not afraid of perl, the Date-Manip module allows comparing time
and date, among other things.
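
For the quoted question (getting the first line of each day once the file is sorted), one possible approach is sketched below. It assumes the five-column, comma-separated layout from the thread with zero-padded times; the function name `first_per_day` is made up for illustration and is not something anyone in the thread posted.

```shell
#!/bin/sh
# Hypothetical helper: print the earliest record for each ID and date.
# Assumes comma-separated fields X,ID,Date,Time,Y with zero-padded times.
first_per_day() {
    # Sort by ID, then date, then time, so the earliest record of each
    # ID/date group comes first; awk then keeps only that first record.
    sort -t ',' -k 2,2 -k 3,3 -k 4,4 |
        awk -F ',' '!seen[$2 FS $3]++'
}

first_per_day <<'EOF'
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,09:02:00,Pass
EOF
```

On this sample it should print the 09:07:00 record for 2010-12-02 and the 09:02:00 record for 2010-12-03. The `!seen[key]++` pattern is true only the first time a key appears, which is why the input has to be sorted first.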



_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 12-21-2010, 04:55 PM
Eduardo Grosclaude
 

On Tue, Dec 21, 2010 at 2:33 PM, <lhecking@users.sourceforge.net> wrote:
> If you're not afraid of perl, the Date-Manip module allows comparing time
> and date, among other things.

A dirtier take could be

perl -ne '/,(\d+),(.*),(\d\d):.*/ && ($3>=9) and $s->{$1,$2}++ ; END
{use Data::Dumper; print Dumper($s)}' < data
$VAR1 = {
'01368 2010-12-02' => 4,
'01368 2010-12-03' => 3
};

--
Eduardo Grosclaude
Universidad Nacional del Comahue
Neuquen, Argentina

Old 12-21-2010, 04:58 PM

Roland RoLaNd wrote:
>
> I have a log file with the following input:
> X , ID , Date, Time, Y
> 01,01368,2010-12-02,09:07:00,Pass
> 01,01368,2010-12-02,10:54:00,Pass
> 01,01368,2010-12-02,13:07:04,Pass
> 01,01368,2010-12-02,18:54:01,Pass
> 01,01368,2010-12-03,09:02:00,Pass
> 01,01368,2010-12-03,13:53:00,Pass
> 01,01368,2010-12-03,16:07:00,Pass
>
> My goal is to get the number of times ID has a TIME that's after 09:00:00
> each DATE.
> That would give me two outputs: one is the number of days ID has been late,
> and the other is the day and time this ID was late.
>
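awk 'BEGIN { FS = "," }
{ if ( $4 > "09:00:00" ) {                         # zero-padded times compare correctly as strings
    array[ $2 ][1]++;                              # slot 1 holds the count of late entries for this ID
    array[ $2 ][ array[$2][1] + 1 ] = $3 "::" $4;  # slots 2..n+1 hold each "date::time"
  }
}
END {
  for (j in array) {           # arrays of arrays need gawk 4.0 or later
    for (k in array[j]) {
      print j, array[j][k];    # prints the count (slot 1) as well as each entry
    }
  }
}' file.log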

It's been a while since I needed to do this, but I *think* the nested "for
<var> in array" will work.
<snip>
mark
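
The nested `array[j][k]` form only works in gawk versions that support arrays of arrays. As a hedged alternative, here is a minimal sketch of the same idea using only POSIX awk, where a comma inside the subscript builds a combined key. The function name `late_entries` and its optional cutoff argument are assumptions made for this example, and the input is assumed to have no header row.

```shell
#!/bin/sh
# Hypothetical helper: list every entry after a cutoff, grouped by ID.
# Uses only POSIX awk features (no gawk arrays of arrays).
late_entries() {
    awk -F ',' -v cutoff="${1:-09:00:00}" '
        $4 > cutoff {                 # zero-padded HH:MM:SS compares correctly as text
            n = ++count[$2]           # running count of late entries for this ID
            late[$2, n] = $3 "::" $4  # store "date::time" under the key (ID, n)
        }
        END {
            for (id in count)
                for (k = 1; k <= count[id]; k++)
                    print id, late[id, k]
        }'
}

late_entries <<'EOF'
01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-03,09:02:00,Pass
EOF
```

Entries for one ID come out in input order because the inner loop counts from 1 upward; the order of the IDs themselves is unspecified.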



Old 12-21-2010, 05:30 PM
Roland RoLaNd

First of all, I'd like to apologize to those who helped me with advice using Perl; I'm ashamed to say that I have no experience with it.

Mark, thanks for your effort in writing the script below. Could you help me understand how it works? The best way to do things is to learn them for future reference.

I'm no expert with awk, so I need your help with the following, if possible:



awk 'BEGIN { FS=",";} ## BEGIN triggers the commands that follow to be executed in awk, with , as the field delimiter
{ if ( $4 > "09:00:00" ) { # condition that matches times after 09:00 am
array[ $2 ][1]++; # increments a count by one, though I am a bit at a loss with "array"
array[ $2 ][ array[$2][1] + 1] = $3 "::" $4; } # could not figure this one out
}
END {
for (j in array) {
for (k in array[j]) {
print j, array[j][k]; # prints out what exactly?
}
}
}'
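
Since the question is how "array" behaves, a minimal sketch of an awk associative array may help. The function name `count_by_id` and the second ID `01400` are invented for this illustration; only the column layout comes from the thread.

```shell
#!/bin/sh
# Hypothetical demo: count how many times each ID appears.
count_by_id() {
    awk -F ',' '
        { seen[$2]++ }     # the element for this ID is created empty on first use, then incremented
        END {
            for (id in seen)          # loop over every key stored in the array
                print id, seen[id]    # the key, then the count stored under it
        }' | sort                     # for-in order is unspecified, so sort the output
}

printf '%s\n' \
    '01,01368,2010-12-02,09:07:00,Pass' \
    '01,01368,2010-12-02,10:54:00,Pass' \
    '01,01400,2010-12-02,08:55:00,Pass' | count_by_id
```

Array subscripts in awk are strings, so an ID such as 01368 works as a key without any declaration. Mark's script goes one step further and stores a second level of keys under each ID.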

Old 12-21-2010, 05:43 PM
Les Mikesell

On 12/21/2010 11:30 AM, Roland RoLaNd wrote:
>
> Hello,
>
> I have a log file with the following input:
> X , ID , Date, Time, Y
> 01,01368,2010-12-02,09:07:00,Pass
> 01,01368,2010-12-02,10:54:00,Pass
> 01,01368,2010-12-02,13:07:04,Pass
> 01,01368,2010-12-02,18:54:01,Pass
> 01,01368,2010-12-03,09:02:00,Pass
> 01,01368,2010-12-03,13:53:00,Pass
> 01,01368,2010-12-03,16:07:00,Pass
>
> My goal is to get the number of times ID has a TIME that's after 09:00:00 each DATE.
> That would give me two outputs: one is the number of days ID has been late, and the other is the day and time this ID was late.
>
> I've started as such:
>
> sort -t ',' -k 3,3 -k 4,4 file.log # this will sort the file according to the DATE field as well as the TIME field.
> I've been stuck for the last 30 min trying to find a way to get the first line of each day (logically it'll be the earliest, as I've sorted by date/time previously). Once I know how to do this, I'll be able to compare times and proceed.
>
> Can anyone help?
> I looked into sort -u and uniq -f3, though I didn't get far with them.

Most logs are written in append mode so ascending date/time comes
naturally. This perl should list each instance and the count:

my %id_count;
my %id_date; # last date already counted for each id
while (<>) {
    my ($x,$id,$date,$time) = split /,/;
    next if ($x =~ /^\s*X\s*$/); # skip header (string match; == would compare numerically)
    next if ($time le "09:00:00");
    next if (defined $id_date{$id} && $id_date{$id} eq $date);
    $id_date{$id} = $date;
    print "$id - $date - $time\n";
    $id_count{$id}++;
}
print "----\n";
while (my ($id,$count) = each(%id_count)) {
    print "$id late $count days\n";
}


--
Les Mikesell
lesmikesell@gmail.com

Old 12-21-2010, 06:14 PM
John Lundin

On Tue, Dec 21, 2010 at 08:30:43PM +0200, Roland RoLaNd wrote:

(chuckle) That's a bit more verbose than necessary. As a one-liner:

awk -F, '($4>"09:00:00"){c[$2 "," $3]++};END{for (i in c){print i "," c[i]}}' $filename

01368,2010-12-02,4
01368,2010-12-03,3

(You might check if you want >="09:00:00", and include the edge case.)

-F, # set separator to comma

# (automatic loop over all data lines)
($4>"09:00:00"){ # do if fourth field greater than 09:...
c[$2 "," $3]++ # increment hash element keyed by
# second and third fields separated by comma
# (that is, hash on id,date)
}

END{ # after finishing the data
for (i in c){ # for each key observed in array c
print i "," c[i] # print the key, comma, count
}
}

--
lundin@fini.net
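
To make the `>` versus `>=` edge case concrete, here is a small sketch that wraps the one-liner in a shell function. The function name `count_per_day` and the `inclusive` switch are assumptions for this example, not part of the posted command.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner; "inclusive" is an invented switch.
# $1 = cutoff time, $2 = 1 to also count entries exactly at the cutoff.
count_per_day() {
    awk -F ',' -v cutoff="$1" -v inclusive="${2:-0}" '
        ($4 > cutoff) || ((inclusive + 0) && ($4 == cutoff)) { c[$2 "," $3]++ }
        END { for (i in c) print i "," c[i] }
    ' | sort    # for-in order is unspecified, so sort for stable output
}

count_per_day 09:00:00 <<'EOF'
01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass
EOF
```

With the thread's seven rows this should reproduce the two counts shown above (4 and 3). The two modes differ only when a row falls exactly on the cutoff.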

Old 12-21-2010, 06:35 PM

John Lundin wrote:
> On Tue, Dec 21, 2010 at 08:30:43PM +0200, Roland RoLaNd wrote:
>
> (chuckle) That's a bit more verbose than necessary. As a one-liner:
>
> awk -F, '($4>"09:00:00"){c[$2 "," $3]++};END{for (i in c){print i ","
> c[i]}}' $filename
>

Well, yes, but he also wanted a count....

mark

Old 12-21-2010, 06:40 PM
Roland RoLaNd

Thanks to your help i've reached this step:

original data:

01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass

awk -F , '{if ($4 > "09:10:00") print $2 " was late on", $3 " by coming at ",$4}' test | tee DaysLate ; wc -l DaysLate

01368 was late on 2010-12-02 by coming at  10:54:00
01368 was late on 2010-12-02 by coming at  13:07:04
01368 was late on 2010-12-02 by coming at  18:54:01
01368 was late on 2010-12-03 by coming at  13:53:00
01368 was late on 2010-12-03 by coming at  16:07:00
5 DaysLate

The only thing missing is a way to take just the earliest time of each day.

In other words, the above output should be:

0 DaysLate # on 12-02 he came in at 09:07, which is before 09:10, and on 12-03 he came in at 09:02, which is also before the set time
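
One way to get there is to remember only the earliest time per ID and date, and apply the cutoff to that value at the end. The following is a minimal sketch under the thread's column layout; the function name `days_late` is made up, and the final count is taken across all IDs in the input rather than per ID.

```shell
#!/bin/sh
# Hypothetical helper: for each ID and date keep the earliest time, then
# report the days on which that earliest time is after the cutoff ($1).
days_late() {
    awk -F ',' -v cutoff="${1:-09:10:00}" '
        {
            key = $2 "," $3
            # Keep the smallest time seen for this ID/date (input need not be sorted).
            if (!(key in first) || $4 < first[key])
                first[key] = $4
        }
        END {
            n = 0
            for (key in first)
                if (first[key] > cutoff) {
                    split(key, part, ",")
                    print part[1] " was late on " part[2] " by coming at " first[key]
                    n++
                }
            print n " DaysLate"
        }'
}

days_late 09:10:00 <<'EOF'
01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass
EOF
```

With the 09:10:00 cutoff the sample data should give `0 DaysLate`, since the earliest arrivals are 09:07:00 and 09:02:00. Because the minimum is tracked explicitly, no prior `sort` is needed.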




Old 12-21-2010, 06:54 PM
Les Mikesell

On 12/21/2010 1:40 PM, Roland RoLaNd wrote:
>
>
>
> awk -F , '{if ($4> "09:10:00") print $2 " was late on", $3 " by coming at ",$4}' test | tee DaysLate ; wc -l DaysLate
>
> 01368 was late on 2010-12-02 by coming at 10:54:00
>
> 01368 was late on 2010-12-02 by coming at 13:07:04
>
> 01368 was late on 2010-12-02 by coming at 18:54:01
>
> 01368 was late on 2010-12-03 by coming at 13:53:00
>
> 01368 was late on 2010-12-03 by coming at 16:07:00
>
> 5 DaysLate

On my calendar 12-02 and 12-03 are only 2 days...
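
The mismatch comes from `wc -l` counting lines rather than days. As a rough sketch, each ID/date pair can be counted once instead; the function name `distinct_late_days` is an assumption for this example. Note that this still treats a day as late if any entry is after the cutoff, so it does not look at the earliest entry only.

```shell
#!/bin/sh
# Hypothetical helper: count ID/date pairs with at least one entry after the cutoff.
distinct_late_days() {
    awk -F ',' -v cutoff="${1:-09:10:00}" '
        $4 > cutoff && !seen[$2 "," $3]++ { days++ }   # count each ID/date pair once
        END { print days + 0 }                         # "+ 0" prints 0 when nothing matched
    '
}

distinct_late_days 09:10:00 <<'EOF'
01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass
EOF
```

On the sample data this prints 2, matching the two calendar days rather than the five matching lines.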

--
Les Mikesell
lesmikesell@gmail.com



Old 12-21-2010, 06:58 PM
Roland RoLaNd

Exactly, hence:

Quote:
The only thing missing is a way to take just the earliest time of each day.

In other words, the above output should be:

0 DaysLate
