is a nice "place" :-D
After attempting to install for the first time last week, I started 3
different threads here looking for help. I'm pleased with the nature of the
responses, and with being able to succeed eventually using a mix of those
responses and my own efforts digging into Google, gentoo.org and cranial
cobwebs. So, thanks to all who replied, and even to those who showed
interest without replying.

For http://fm.no-ip.com/Tmp/Linux/G/, newly created to use with those three
threads, 'cat /var/log/apache2/access_log | grep "GET /Tmp/Linux/G" | grep
-v <myip> | sort > outfile' generated 117 lines. That's a lot more hits
than I can ever remember getting before when asking for help from a
mailing list (even if it did take 5 days to accumulate so many).

I'm curious if anyone here would like to offer a better variant of my local
query that would limit the hit count so that no more than one hit per IP is
represented in the output? My skill with such things is very limited. I
can't think of the name of a command to cut the IP off the front of
each line, much less how to check whether it's a non-first instance to be
discarded. Or, maybe there's an Apache utility for doing this that I just
don't know about?

--
"The wise are known for their understanding, and pleasant
words are persuasive." Proverbs 16:21 (New Living Translation)

Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!

Felix Miata *** http://fm.no-ip.com/
Apparently, though unproven, at 01:10 on Tuesday 17 May 2011, Felix Miata did
opine thusly:

> After attempting to install for the first time last week, I started 3
> different threads here looking for help. I'm pleased with the nature of the
> responses, and being able to succeed eventually using a mix of those
> responses and my own efforts digging into Google, gentoo.org and cranial
> cobwebs. So, thanks to all who replied, and even to those who showed
> interest without replying.
>
> For http://fm.no-ip.com/Tmp/Linux/G/, newly created to use with those three
> threads, 'cat /var/log/apache2/access_log | grep "GET /Tmp/Linux/G" | grep
> -v <myip> | sort > outfile' generated 117 lines. That's a lot more hits
> than I can ever remember getting before when asking for help from a
> mailing list (even if it did take 5 days to accumulate so many).
>
> I'm curious if anyone here would like to offer a better variant of my local
> query that would limit the hit count so that no more than one hit per IP is
> represented in the output? My skill with such things is very limited. I
> can't think of the the name of a command to cut the IP off the front of
> each line, much less how to compare if it's a non-first instance to be
> discarded. Or, maybe there's an Apache utility for doing this that I just
> don't know about?

There's always a million ways to skin a cat like this.

At a high volume site you would of course not try and deal with this directly
from the apache logs. You would send them to syslog which would parse them and
write them to a database from where you could run sophisticated SQL. There are
also Apache analyser apps out there, google will find them.

But I think all that is overkill for what you want. Your command works fine
except for needing to discard duplicate IPs. You don't seem to need to know the
details of the GET, so just grab using awk the first field and sort | uniq the
result. It will run a tad quicker (and reveal less n00bness to your audience)
if you grep the file directly instead of cat | grep:

grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v <myip> | awk '{print $1}' | sort | uniq | wc

In true grand Unix tradition you cannot get quicker, dirtier or more effective
than that

--
alan dot mckinnon at gmail dot com
On Tue, May 17, 2011 at 01:33:39AM +0200, Alan McKinnon wrote:
> grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v <myip> |
> awk '{print $1}' | sort | uniq | wc
>
> In true grand Unix tradition you cannot get quicker, dirtier or more effective
> than that
>

You can replace "sort | uniq" by "sort -u"

And the "Grand Unix Tradition" probably would 'cut' instead of awk :)

While you are at it, an incantation that pipes grep to awk? Seriously?

W

--
Willie W. Wong wwong@math.princeton.edu
Data aequatione quotcunque fluentes quantitae involvente fluxiones invenire
et vice versa ~~~ I. Newton
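For reference, here is a minimal sketch of the pipeline with both suggestions applied: 'cut' instead of awk, and "sort -u" instead of "sort | uniq". The function name count_unique_ips and its argument order are made up for illustration. It assumes the client IP is the first space-separated field of each log line, and it hands the log file to grep as an argument rather than through a pipe.

```shell
# count_unique_ips PATTERN LOGFILE MY_IP   (hypothetical helper, not an Apache tool)
# Prints how many distinct client IPs, other than MY_IP, requested PATTERN.
# Assumption: the IP is the first space-separated field of every log line.
count_unique_ips() {
    grep -- "$1" "$2" |        # keep only the requests of interest
        cut -d' ' -f1 |        # keep only the first field (the IP)
        grep -vxF -- "$3" |    # drop lines that are exactly MY_IP (all three arguments are required)
        sort -u |              # one line per distinct IP
        wc -l                  # count them
}
```

Usage would look like: count_unique_ips "GET /Tmp/Linux/G" /var/log/apache2/access_log <myip>. Filtering out the IP after the cut stage means only the client field is compared, so an address that merely appears elsewhere in a log line does not cause that line to be dropped.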
On 2011/05/17 01:33 (GMT+0200) Alan McKinnon composed:
> grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v<myip> |
> awk '{print $1}' | sort | uniq | wc
>
> In true grand Unix tradition you cannot get quicker, dirtier or more effective
> than that

It almost worked too. :-)

grep "GET /Tmp/Linux/G" /var/log/apache2/access_log | grep -v <myip> | awk '{print $1}' | sort | uniq | wc -l

got me almost what I wanted, 20 unique IPs, but that's a lot of stuff to
remember, which for me will never happen. So I tried converting to an alias.

grep "GET $1" | /var/log/apache2/access_log | grep -v <myip> | awk '{print $1}' | sort | uniq | wc -l

sort of works, except I won't always be looking for GET as part of what to
grep for, or might require more than one whitespace instance, and am tripping
over how to deal with the whitespace if I leave GET out of the alias and only
put it on the cmdline if I actually want it as part of what to grep for.

grep "GET $1 $2" | /var/log/apache2/access_log | grep -v <myip> | awk '{print $1}' | sort | uniq | wc -l

seems to work, but I'm not sure there aren't booby traps besides 2nd or more
whitespace instances I'm not considering, even though it gets the same answer
for this particular case.

--
"The wise are known for their understanding, and pleasant
words are persuasive." Proverbs 16:21 (New Living Translation)

Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!

Felix Miata *** http://fm.no-ip.com/
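One possible way around the alias trouble: an alias only substitutes text and does not receive positional parameters such as $1, whereas a shell function does. The sketch below is an illustration under stated assumptions, not a tested drop-in: the name unique_hits and the argument order (log file, own IP, then the search words) are invented here. Everything after the second argument is joined with single spaces and searched for as a fixed string, so GET is optional and the pattern need not be quoted unless it contains runs of whitespace.

```shell
# unique_hits LOGFILE MY_IP WORD...   (hypothetical name and argument order)
# Counts distinct client IPs, other than MY_IP, on lines containing the
# search words. The body runs in a subshell, so its variables do not leak.
unique_hits() (
    logfile=$1
    my_ip=$2
    shift 2
    pattern="$*"                       # remaining words joined by single spaces
    grep -F -- "$pattern" "$logfile" | # -F: treat the pattern as plain text
        awk -v me="$my_ip" '$1 != me { print $1 }' |
        sort -u |
        wc -l
)
```

Put in ~/.bashrc, it could be called as unique_hits /var/log/apache2/access_log <myip> GET /Tmp/Linux/G, or with just /Tmp/Linux/G when the request method does not matter. A pattern with more than one consecutive space has to be passed as a single quoted argument, since the shell collapses unquoted whitespace before the function sees it.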
On Tue, 17 May 2011 01:33:39 +0200, Alan McKinnon wrote:
> grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v <myip> |
> awk '{print $1}' | sort | uniq | wc
>
> In true grand Unix tradition you cannot get quicker, dirtier or more
> effective than that
>

awk does pattern matching, so you can ditch the grep stage and use

awk '! /myip/ {print $1}'

You could use awk to search for the GET patterns too, not only saving yet
another process, but making sure that no one else, including you next
month, can work out what the command is supposed to do.

sort -u would save having a separate process for uniq, but I've no idea
if it's faster. It's only worth using sort -u if you would use uniq with
no arguments.

--
Neil Bothwick

- We are but packets in the internet of Life-
On 2011-05-17, Neil Bothwick <neil@digimed.co.uk> wrote:
> On Tue, 17 May 2011 01:33:39 +0200, Alan McKinnon wrote:
>
>> grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v <myip> |
>> awk '{print $1}' | sort | uniq | wc
>>
>> In true grand Unix tradition you cannot get quicker, dirtier or more
>> effective than that
>>
>
> awk does pattern matching, so you can ditch the grep stage and use
>
> awk '! /myip/ {print $1}'
>
> You could use awk to search for the GET patterns too, not only saving yet
> another process, but making sure that no one else, including you next
> month, can work out what the command is supposed to do.
>

Meh, me forgetting what an awk snippet does? Never!

sed ... now that's a wholly different story :-P

> sort -u would save having a separate process for uniq, but I've no idea
> if it's faster. It's only worth using sort -u if you would use uniq with
> no arguments.
>

And you can actually do the 'uniq' or '-u' function within awk. Quite
easily, in fact.

Here's a sample of awk doing uniq:

awk '!x[$1]++ { print $1 }'

Benefit? It doesn't care if the non-unique lines are one-after-another
or spread all over the text. The above snippet prints only the first
occurrence. Combine that with a test for match:

awk '!x[$1]++ && $0 ~ /awesome_regex_pattern/ {print $1}'

then with a test for negated match

awk '!x[$1]++ && $0 ~ /awesome_regex_pattern/ && $0 !~ /more_awesome_regex/ {print $1}'

Rgds,
--
Pandu E Poluan - IT Optimizer
My website: http://pandu.poluan.info/
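One detail about the combined snippets is worth a sketch. Because awk evaluates && from left to right, putting !x[$1]++ first marks an IP as seen even when that line fails the regex tests, so an IP whose first log line does not match is never printed, even if a later line does. Moving the counter to the end of the condition avoids that. The function name first_seen_ips and its argument order are invented for this example, and the exclusion is done by comparing the first field with a literal IP instead of a second regex.

```shell
# first_seen_ips REGEX MY_IP LOGFILE   (hypothetical helper)
# Prints each client IP once, in order of its first matching line.
first_seen_ips() (
    awk -v want="$1" -v me="$2" '
        # The seen[] counter comes last, so it is only touched for lines
        # that already passed the other two tests.
        $0 ~ want && $1 != me && !seen[$1]++ { print $1 }
    ' "$3"
)
```

Piping the result through wc -l gives the unique-visitor count without any sort stage, since the de-duplication happens inside awk.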
On Tue, May 17, 2011 at 5:43 AM, Pandu Poluan <pandu@poluan.info> wrote:
> On 2011-05-17, Neil Bothwick <neil@digimed.co.uk> wrote:
>> On Tue, 17 May 2011 01:33:39 +0200, Alan McKinnon wrote:
>>
>>> grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v <myip> |
>>> awk '{print $1}' | sort | uniq | wc
>>>
>>> In true grand Unix tradition you cannot get quicker, dirtier or more
>>> effective than that
>>>
>>
>> awk does pattern matching, so you can ditch the grep stage and use
>>
>> awk '! /myip/ {print $1}'
>>
>> You could use awk to search for the GET patterns too, not only saving yet
>> another process, but making sure that no one else, including you next
>> month, can work out what the command is supposed to do.
>>
>
> Meh, me forgetting what an awk snippet do? Never!
>
> sed ... now that's a wholly different story :-P
>
>> sort -u would save having a separate process for uniq, but I've no idea
>> if it's faster. It's only worth using sort -u if you would use uniq with
>> no arguments.
>>
>
> And you can actually do the 'uniq' or '-u' function within awk. Quite
> easily, in fact.
>
> Here's a sample of awk doing uniq:
>
> awk '!x[$1]++ { print $1 }'
>
> Benefit? It doesn't care if the non-unique lines are one-after-another
> or spread all over the text. The above snippet prints only the first
> occurence. Combine that with a test for match:
>
> awk '!x[$1]++ && $0 ~ /awesome_regex_pattern/ {print $1}'
>
> then with a test for negated match
>
> awk '!x[$1]++ && $0 ~ /awesome_regex_pattern/ && $0 !~
> /more_awesome_regex/ {print $1}'
>
> Rgds,
> --
> Pandu E Poluan - IT Optimizer
> My website: http://pandu.poluan.info/
>

I have always wondered if there is a way to do awk '{ print $1}' using
only builtin bash functions when you only have a one-line string?
Juan Diego Tascón writes:
> I have always wondered if there is a way to do awk '{ print $1}' using
> only builtin bash functions when you only have a one line string

str="one two five"

# remove all from the first blank on, but will not work with
# other whitespace
echo ${str%% *}

or

# set $1, $2, $3, ... to words of $str
set $str
echo $1

or

# create array holding one word per element
strarr=( $str )
echo $strarr    (or echo ${strarr[0]})

    Wonko
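A slightly more defensive sketch of the second approach, in case the string is not as tame as "one two five". With a bare "set $str", a string that starts with a dash is taken as shell options, and a word such as * is expanded to file names. The helper name first_word is made up here; it relies on the default IFS (space, tab, newline) for the word splitting.

```shell
str="one two five"

# Parameter expansion: cut everything from the first space onwards.
first=${str%% *}

# first_word STRING   (hypothetical helper)
# Prints the first whitespace-separated word of STRING using only builtins.
first_word() (
    set -f                  # no globbing, so a literal * stays a *
    set -- $1               # unquoted on purpose: split into $1, $2, ...
    printf '%s\n' "${1-}"   # empty output if STRING held no words
)
```

The subshell body keeps the set -f and the changed positional parameters from affecting the calling shell. Unlike the ${str%% *} form, the function also copes with leading blanks and with tabs between words.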
On Tue, May 17, 2011 at 8:36 AM, Alex Schuster <wonko@wonkology.org> wrote:
> Juan Diego Tascón writes:
>
>> I have always wondered if there is a way to do awk '{ print $1}' using
>> only builtin bash functions when you only have a one line string
>
> str="one two five"
>
> # remove all from the first blank on, but will not work with
> # other whitespace
> echo ${str%% *}
>
> or
>
> # set $1, $2, $3, ... to words of $str
> set $str
> echo $1
>
> or
>
> # create array holding one word per element
> strarr=( $str )
> echo $strarr    (or echo ${strarr[0]})
>
>     Wonko
>

Thanks for the info.
Hello,
On Tue, 17 May 2011, Alan McKinnon wrote:
>grep "GET /Tmp/Linux/G" | /var/log/apache2/access_log | grep-v <myip> |
>awk '{print $1}' | sort | uniq | wc

useless use of ...

awk '/GET \/Tmp\/Linux\/G/{ips[$1]++;}END{print length(ips);}' /var/log/apache2/access_log

I add each access to ips[<IP>] in case you'd want to print that too,
e.g. by using

END {
    for( i in ips ) {
        print i ":" ips[i] " accesses";
    }
    print length(ips) " unique IPs total";
}

as the "END" block.

HTH,
-dnh

--
Any research done on how to efficiently use computers has been long
lost in the mad rush to upgrade systems to do things that aren't
needed by people who don't understand what they are really supposed to
do with them. -- Graham Reed, in asr
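A sketch of the same per-IP report wrapped in a function, with two deliberate differences that are choices of this example rather than anything from the post above. The pattern is passed in as a string and matched with ~, which avoids writing a path full of slashes inside /.../, and the unique total is counted inside the loop rather than with length() on an array, which not every awk implementation accepts. The name ip_report and its argument order are hypothetical.

```shell
# ip_report REGEX LOGFILE   (hypothetical helper)
# Prints "<ip>: <n> accesses" for every client IP with a matching line,
# followed by a total. The order of the per-IP lines is unspecified.
ip_report() (
    awk -v want="$1" '
        $0 ~ want { hits[$1]++ }      # count matching lines per client IP
        END {
            n = 0
            for (ip in hits) {
                print ip ": " hits[ip] " accesses"
                n++                   # one more distinct IP
            }
            print n " unique IPs total"
        }
    ' "$2"
)
```

Called as ip_report "GET /Tmp/Linux/G" /var/log/apache2/access_log; piping everything except the last line through sort would make the listing stable if that matters.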