Apache2 VHosts and AWStats within a clustered environment?
Hi all,
I'm not sure if "clustered" is exactly the phrase I'm looking for here, but nonetheless, here is my issue:

We have a "cluster" of web servers configured using heartbeat as follows (only two servers illustrated for clarity's sake!):

    -----------      -----------
    |Director1|      |Director2|
    -----------      -----------
         |  \          /  |
         |   \        /   |
         |    \      /    |
         |     \    /     |
         |       X        |
         |     /    \     |
         |    /      \    |
    ------------    ------------
    |WebServer1|    |Webserver2|
    ------------    ------------

The servers share a number of vhosts, many of which are dynamic sites using PHP/MySQL, for which we need to provide AWStats info for our customers.

Our concern is that, as we have individual log files for each vhost on each individual web server, the AWStats information cannot be guaranteed to be accurate.

One of the options that we have discussed is logging all the vhosts' log files to a central log server via NFS (in order to keep the log format) and then having each AWStats instance on each server read the logs from the central NFS share when it comes to updating the graphs etc. The issue I have found is that a number of people (including Tony Mobily in the "Hardening Apache" book) appear to recommend against this.

The other option that we have spoken about is copying all the log files to a central server each day, merging them into a single file, sorting the file and then running AWStats against that file before copying all the graphs etc. back to the web servers. I can see two distinct disadvantages to this:

1) There will be a huge amount of network traffic as the logs are shuttled back and forth between the servers

2) If AWStats is updated from the browser interface, it will only be updated on the server that holds the current connection

I'm sure that someone out there must have been in this situation before, so how did you do it? :oP

Unfortunately, switching from AWStats to another package is not an option. Fortunately, everything else is fair game! :o)

Thanks in advance,

Matt

--
Matthew Macdonald-Wallace
matthew@truthisfreedom.org.uk
http://www.truthisfreedom.org.uk

--
To UNSUBSCRIBE, email to debian-isp-REQUEST@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
This one time, at band camp, Matthew Macdonald-Wallace said:
> I'm sure that someone out there must have been in this situation
> before, so how did you do it? :oP

Switch your log backend to either SQL or something that can export to a remote host (syslog, etc) and aggregate the log messages as they come in.

--
 -----------------------------------------------------------------
|   ,'`.                                            Stephen Gran |
|  : :' :                                       sgran@debian.org |
|  `. `'                       Debian user, admin, and developer |
|    `-                                    http://www.debian.org |
 -----------------------------------------------------------------
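
As a rough illustration of "export to a remote host as they come in", here is a minimal Python sketch of a forwarder that sends each log line to a central collector as one UDP datagram. Plain UDP stands in for syslog here, and the function name `forward_lines`, the collector address and the script path are all made up for the example; treat it as a sketch under those assumptions rather than a recommended setup.

```python
import socket
import sys


def forward_lines(lines, sock, address):
    """Send each non-empty log line to `address` as one UDP datagram.

    `sock` only needs a sendto(data, address) method, so a real UDP socket
    works and so does a stand-in object. Returns the number of lines sent.
    """
    sent = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        sock.sendto(line.encode("utf-8", errors="replace"), address)
        sent += 1
    return sent


def run_forwarder(host, port, stream=sys.stdin):
    """Read log lines from `stream` until EOF and forward them to host:port.

    The host and port are assumptions for this sketch; use your own log host.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return forward_lines(stream, sock, (host, port))
```

A script that calls `run_forwarder("loghost.example", 5140)` could be hooked in through Apache's piped logging, e.g. `CustomLog "|/usr/local/bin/forward_logs.py" combined` (path hypothetical). UDP gives no delivery guarantee, so a real deployment would probably want TCP or a proper syslog daemon in between.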
Hi Matthew,
We tried both variants, NFS and copying logfiles nightly.

NFS was a performance killer and somewhat unstable then (think concurrent writes). Copying/merging/sorting several hundred GB of log data nightly ran into problems once the volume was too large to be handled within 24 hours ("daily stats").

Having logs on an NFS host and processing them via NFS (from the web servers, as you suggested) at least doubles network traffic, so this never was an option for us even then. But ymmv, it could work for small businesses ;-)

Did you check out the log-pipe feature of Apache?

http://httpd.apache.org/docs/2.2/logs.html#piped

You might be able to distribute logging to one or many central log hosts this way. We use a homemade daemon on each web server to streamline the piping somewhat (caching, some pre-processing etc.)

Additional plus: You can accumulate data into near-realtime stats for your customers, set up traffic limits, accounting, you name it.

Just make sure you have sufficient bandwidth so you do not stuff up the lines with log traffic. Maybe just add an extra hardware interface for logging (and a switch, if this is the bottleneck). Then set up as many stats servers as needed to process the logfiles in time and for your customers to grab their graphs.

Additional tips: Set up your own Apache logfile format to include the vhost name ;-) If your customers are used to finding their stats on, say, theirdomain.com/stats/, just set up a system-wide redirect to your stats server(s).

Ah, and: We do not offer on-the-fly AWStats generation, so that eases the pain somewhat ;-)

Cheers,
Norbert
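
To make the "include the vhost name" tip concrete: if every line starts with the vhost (for example a LogFormat that puts `%v` in front of the usual combined fields, which is an assumption here and not necessarily the format used above), the central log host can split one mixed stream back into per-vhost logs before AWStats sees them. A minimal Python sketch, with `split_vhost` and `group_by_vhost` being names invented for the example:

```python
from collections import defaultdict


def split_vhost(line):
    """Return (vhost, rest_of_line) for a line whose first field is the vhost."""
    vhost, _, rest = line.rstrip("\n").partition(" ")
    return vhost.lower(), rest


def group_by_vhost(lines):
    """Collect log lines per vhost, with the vhost prefix stripped off again.

    The stripped lines are back in plain combined format, so each group can
    be written to its own file and handed to that vhost's AWStats config.
    """
    grouped = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        vhost, rest = split_vhost(line)
        grouped[vhost].append(rest)
    return dict(grouped)
```

Lower-casing the vhost is a design choice so that `Example.com` and `example.com` end up in the same bucket.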
Hi Matthew,
Forgot to mention:

http://mod-log-spread2.alioth.debian.org/

Cheers,
Norbert
On Thu, Jun 26, 2008 at 03:21:08PM +0100, Matthew Macdonald-Wallace wrote:
[...]
> Our concern is that as we have individual log files for each VHost on
> each individual webserver, the AWStats information cannot be guaranteed
> to be accurate.
>
> One of the options that we have discussed is logging all the vhost's
> log files to a central log server via NFS (in order to keep the log
> format) and then have each AWStats instance on each server read the
> logs from the central NFS share when it comes to update the graphs
> etc. The Issue I have found is that a number of people (including Tony
> Mobily in the "Hardening Apache" book) appear to recommend against this.

Ouch. Yes, not something I'd do.

> The other option that we have spoken about is copying all the log files
> to a central server each day, merging them into a single file, sorting
> the file and then running AWStats against that file before copying all
> the graphs etc back to the webServers. I can see two distinct
> disadvantages to this:
>
> 1) There will be a huge amount of network traffic as the logs are
> shuttled back and forth between the servers

Are you aware of logresolvemerge?

http://awstats.sourceforge.net/docs/awstats_tools.html#logresolvemerge

It's quite useful for this sort of thing, ime.

Another way might be to rotate off the logfiles hourly, or something, and rsync those over (to a central repo), merge, and analyse; that's still a bit messy, imo.

> 2) If AWStats is updated from the browser interface, it will only be
> updated on the server that holds the current connection

Disable that "feature", and update, say, hourly via cron.
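
For anyone wondering what the "merge" step amounts to: each web server's log is already in time order, so the files only need a k-way merge by timestamp, not a full sort. The Python sketch below shows that idea only; it is not logresolvemerge and does none of its other work. The names `timestamp_of` and `merge_logs` are made up, and the code assumes the usual `[26/Jun/2008:15:21:08 +0100]` timestamp field and English month names (C locale).

```python
import heapq
import re
from datetime import datetime

# Matches the bracketed timestamp of common/combined log format.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def timestamp_of(line):
    """Parse the request time of one log line into a timezone-aware datetime."""
    match = TIMESTAMP.search(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*sources):
    """Lazily merge several time-ordered iterables of log lines into one.

    Each source must already be in chronological order, which is the case
    for a log written by a single server. Only one line per source is held
    in memory at a time, so open file objects can be passed in directly.
    """
    return heapq.merge(*sources, key=timestamp_of)
```

Passing the open log files from each web server to `merge_logs` and writing the result to a single file gives AWStats one chronologically ordered input. Because the timestamps are parsed with their UTC offset, servers logging in different timezones still interleave correctly.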
Thanks for all your replies.
After careful consideration, we've gone with mod_log_mysql with a cron job to pull the data onto an NFS share once a day, from which the nodes in the cluster read the log files and generate the AWStats output.

At the moment, it's working fine in the labs; we're going to run apache-benchmark on it over the weekend to see if we can break it! :o)

Thanks again,

M.

--
Matthew Macdonald-Wallace
matthew@truthisfreedom.org.uk
http://www.truthisfreedom.org.uk
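
The post does not say how the cron job turns database rows back into log files, so the following is only a guess at the general shape: a Python sketch that formats one row as a combined-format line AWStats can read. The column names (`remote_host`, `time_stamp`, `request_line` and so on) are hypothetical and depend entirely on how the logging table was defined; `to_combined` is likewise a made-up name.

```python
from datetime import datetime, timezone


def to_combined(row):
    """Format one database row (a dict) as an Apache combined-format log line.

    All keys used here are assumed column names, not a documented schema.
    `time_stamp` is taken to be a Unix timestamp and is rendered in UTC.
    """
    when = datetime.fromtimestamp(row["time_stamp"], tz=timezone.utc)
    stamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    return '{host} - {user} [{stamp}] "{request}" {status} {size} "{referer}" "{agent}"'.format(
        host=row["remote_host"],
        user=row.get("remote_user") or "-",
        stamp=stamp,
        request=row["request_line"],
        status=row["status"],
        size=row.get("bytes_sent") or "-",  # combined format logs "-" for 0 bytes
        referer=row.get("referer") or "-",
        agent=row.get("agent") or "-",
    )
```

Selecting the day's rows ordered by timestamp and writing `to_combined(row)` for each one would yield a single, already merged log per vhost, which is what makes the SQL route attractive compared with merging files after the fact.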