06-26-2008, 02:21 PM
Matthew Macdonald-Wallace

Apache2 VHosts and AWStats within a clustered environment?

Hi all,

I'm not sure if "clustered" is exactly the phrase I'm looking for here,
but nonetheless, here is my issue:

We have a "cluster" of web-servers configured using heartbeat as
follows (only two servers illustrated for clarity's sake!):

 -----------     -----------
 |Director1|     |Director2|
 -----------     -----------
      |   \     /   |
      |    \   /    |
      |     \ /     |
      |      X      |
      |     / \     |
      |    /   \    |
------------     ------------
|WebServer1|     |Webserver2|
------------     ------------


The servers share a number of vhosts, many of which are dynamic sites
using PHP/MySQL, and we need to provide AWStats info for them to our
customers.

Our concern is that as we have individual log files for each VHost on
each individual webserver, the AWStats information cannot be guaranteed
to be accurate.

One of the options that we have discussed is logging all the vhosts'
log files to a central log server via NFS (in order to keep the log
format) and then have each AWStats instance on each server read the
logs from the central NFS share when it comes to update the graphs
etc. The issue I have found is that a number of people (including Tony
Mobily in the "Hardening Apache" book) appear to recommend against this.

The other option that we have spoken about is copying all the log files
to a central server each day, merging them into a single file, sorting
the file and then running AWStats against that file before copying all
the graphs etc. back to the web servers. I can see two distinct
disadvantages to this:

1) There will be a huge amount of network traffic as the logs are
shuttled back and forth between the servers
2) If AWStats is updated from the browser interface, it will only be
updated on the server that holds the current connection

I'm sure that someone out there must have been in this situation
before, so how did you do it? :P

Unfortunately, switching from AWStats to another package is not an
option. Fortunately, everything else is fair game! :)

Thanks in advance,

Matt
--
Matthew Macdonald-Wallace
matthew@truthisfreedom.org.uk
http://www.truthisfreedom.org.uk


--
To UNSUBSCRIBE, email to debian-isp-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
 
06-26-2008, 03:15 PM
Stephen Gran

This one time, at band camp, Matthew Macdonald-Wallace said:
> I'm sure that someone out there must have been in this situation
> before, so how did you do it? :P

Switch your log backend to either SQL or something that can export to a
remote host (syslog, etc) and aggregate the log messages as they come
in.
--
-----------------------------------------------------------------
|   ,'`.                                           Stephen Gran |
|  : :' :                                      sgran@debian.org |
|  `. `'                      Debian user, admin, and developer |
|    `-                                   http://www.debian.org |
-----------------------------------------------------------------
 
06-26-2008, 04:28 PM

Hi Matthew,

We tried both variants, NFS and copying logfiles nightly.
NFS was a performance killer and somewhat unstable then (think concurrent
writes). Copying/merging/sorting several hundred GB of log data nightly ran
into problems once the volume was too large to be handled within 24
hours ("daily stats"). Having logs on an NFS host and processing them
via NFS (from the web servers, as you suggested) at least doubles
network traffic, so this never was an option for us even then. But YMMV,
it could work for small businesses ;-)

Did you check out the log-pipe feature of Apache?
http://httpd.apache.org/docs/2.2/logs.html#piped
You might be able to distribute logging to one or many central log
hosts this way. We use a homemade daemon on each web server to
streamline the piping somewhat (caching, some pre-processing etc.)
Additional plus: You can accumulate data into near-realtime stats for
your customers, set up traffic limits, accounting, you name it. Just
make sure you have sufficient bandwidth so you do not stuff up the lines
with log traffic. Maybe just add an extra hardware interface for logging
(and a switch, if this is the bottleneck).
Then set up as many stats-servers as needed to process the logfiles in
time and for your customers to grab their graphs.
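
A minimal sketch of the piped-log idea, not the homemade daemon mentioned above: a small program that Apache starts as a piped logger, which reads log lines from stdin and sends each one to a central log host as a UDP datagram, prefixed with the name of the web server it came from. The script path, host name, and port are hypothetical, and UDP is an assumption made for brevity; it can drop lines under load, so a real setup would probably want TCP or local buffering.

```python
#!/usr/bin/env python3
"""Forward Apache piped-log lines from stdin to a central log host over UDP."""
import socket
import sys


def tag_line(line, hostname):
    """Prefix a log line with the originating web server's name."""
    return f"{hostname} {line.rstrip(chr(13) + chr(10))}\n"


def forward(lines, sock, address, hostname):
    """Send each non-empty line as one datagram; return the number sent."""
    sent = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        sock.sendto(tag_line(line, hostname).encode("utf-8"), address)
        sent += 1
    return sent


def main(host, port):
    """Read log lines from stdin until Apache closes the pipe."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        forward(sys.stdin, sock, (host, port), socket.gethostname())
    finally:
        sock.close()


# Only runs when called as: logfwd.py HOST PORT
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], int(sys.argv[2]))
```

Assuming the script were installed as /usr/local/bin/logfwd.py (a made-up path), the Apache side would be a line such as: CustomLog "|/usr/local/bin/logfwd.py loghost.example 5140" combined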

Additional tips: Set up your own Apache logfile format to include the
vhost name ;-) If your customers are used to finding their stats on, say,
theirdomain.com/stats/ just set up a system-wide redirect to your
stats-server(s). Ah, and: We do not offer on-the-fly awstats generation,
so that eases the pain somewhat ;-)
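
To illustrate why the vhost name in the log format helps: in an Apache LogFormat string, %v logs the canonical ServerName of the vhost that served the request. Assuming a format that puts %v first, followed by the usual common or combined fields, a central host can split one mixed stream back into per-vhost logs. The sketch below does only that grouping; the function name is made up for illustration.

```python
from collections import defaultdict


def split_by_vhost(lines):
    """Group log lines by their first field (the vhost), stripping that field."""
    per_vhost = defaultdict(list)
    for line in lines:
        vhost, sep, rest = line.partition(" ")
        if not sep:
            continue  # skip malformed lines that have no vhost prefix
        # Host names are case-insensitive, so normalise the key.
        per_vhost[vhost.lower()].append(rest)
    return dict(per_vhost)
```

Each resulting list is then in the plain format again and could be written to its own file for the stats run of that vhost.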


Cheers, Norbert


 
06-26-2008, 04:31 PM

Hi Matthew,

forgot to mention:
http://mod-log-spread2.alioth.debian.org/

Cheers, Norbert


 
06-26-2008, 07:50 PM
Adam McGreggor

On Thu, Jun 26, 2008 at 03:21:08PM +0100, Matthew Macdonald-Wallace wrote:

[...]

> Our concern is that as we have individual log files for each VHost on
> each individual webserver, the AWStats information cannot be guaranteed
> to be accurate.
>
> One of the options that we have discussed is logging all the vhost's
> log files to a central log server via NFS (in order to keep the log
> format) and then have each AWStats instance on each server read the
> logs from the central NFS share when it comes to update the graphs
> etc. The Issue I have found is that a number of people (include Tony
> Mobily in the "Hardening Apache" book) appear to recommend against this.

Ouch. Yes, not something I'd do.

> The other option that we have spoken about is copying all the log files
> to a central server each day, merging them into a single file, sorting
> the file and then running AWStats against that file before copying all
> the graphs etc back to the webServers. I can see two distinct
> disadvantages to this:
>
> 1) There will be a huge amount of network traffic as the logs are
> shuttled back and forth between the servers

Are you aware of logresolvemerge?
http://awstats.sourceforge.net/docs/awstats_tools.html#logresolvemerge
It's quite useful for this sort of thing, ime.

Another way might be to rotate off the logfiles hourly, or something,
and rsync those over (to a central repo), merge, and analyse; that's
still a bit messy, imo.
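
For readers unfamiliar with the merge step, here is a rough Python sketch of the general idea. It is not how logresolvemerge is implemented; it only shows a k-way merge of per-server logs by timestamp. It assumes each input is already in chronological order and uses the standard Apache timestamp field, [dd/Mon/yyyy:HH:MM:SS +zzzz]. The function names are made up for illustration.

```python
import heapq
import re
from datetime import datetime

# Matches the first [...] field, which holds the request time in Apache logs.
_TS = re.compile(r"\[([^\]]+)\]")


def timestamp(line):
    """Parse the [dd/Mon/yyyy:HH:MM:SS +zzzz] field of a log line."""
    match = _TS.search(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*sources):
    """Lazily merge already-sorted iterables of log lines by timestamp."""
    return heapq.merge(*sources, key=timestamp)
```

Because heapq.merge is lazy, open file objects can be passed in directly and the merged stream written out line by line, without loading whole logs into memory.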

> 2) If AWStats is updated from the browser interface, it will only be
> updated on the server that holds the current connection

Disable that "feature", and update, say, hourly via cron.


 
06-27-2008, 02:58 PM
Matthew Macdonald-Wallace

Thanks for all your replies.

After careful consideration, we've gone with mod_log_mysql with a cron
job to pull the data onto an NFS share once a day from which the nodes
in the cluster read the log-files and generate the awstats.
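
A rough sketch of what such a daily export job might look like, under several assumptions: the table and column names below are hypothetical (they are not taken from mod_log_mysql), and Python's built-in sqlite3 stands in for a MySQL connection so the example is self-contained. It selects one vhost's rows for one day, oldest first, and renders them as combined-format log lines that AWStats can read.

```python
import sqlite3
from datetime import datetime, timezone


def format_row(row):
    """Render one (host, epoch, request, status, size, referer, agent) row."""
    host, epoch, request, status, size, referer, agent = row
    when = datetime.fromtimestamp(epoch, tz=timezone.utc)
    stamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    return (f'{host} - - [{stamp}] "{request}" {status} {size} '
            f'"{referer or "-"}" "{agent or "-"}"')


def export_day(conn, vhost, start_epoch, end_epoch):
    """Yield log lines for one vhost in [start_epoch, end_epoch), oldest first."""
    # Hypothetical schema: access_log(virtual_host, remote_host, time_stamp,
    # request_line, status, bytes_sent, referer, agent).
    cursor = conn.execute(
        "SELECT remote_host, time_stamp, request_line, status, bytes_sent,"
        " referer, agent FROM access_log"
        " WHERE virtual_host = ? AND time_stamp >= ? AND time_stamp < ?"
        " ORDER BY time_stamp",
        (vhost, start_epoch, end_epoch),
    )
    for row in cursor:
        yield format_row(row)
```

Sorting in the query means the exported file is already chronological, so no separate sort pass is needed before the stats run.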

At the moment, it's working fine in the labs. We're going to run
apache-benchmark on it over the weekend to see if we can break it! :)

Thanks again,

M.
--
Matthew Macdonald-Wallace
matthew@truthisfreedom.org.uk
http://www.truthisfreedom.org.uk


 
