Thursday, August 26, 2010

Get website metrics with AWStats

Log file analysis is interesting. Seriously, who doesn’t want to know how many hits their website gets, which page or file is most popular, which operating systems and browsers their visitors use, or which country most visitors come from? I think we would all agree that anyone who runs a website likes to know this kind of information.

AWStats is a best-of-breed log file analysis program. Primarily, it analyzes log files for web servers: Apache, IIS, and others, including proxy servers such as Squid. Interestingly enough, it can also be used to analyze log files for FTP servers and mail servers.

Interested yet? Free information! I’m an information nut and love looking at or making up statistics, so AWStats suits me quite well. Written in Perl, AWStats is probably the most widely used log analysis program. It can run in real time as a CGI script, or periodically from cron to generate static pages. Running AWStats from cron every few hours is generally enough and keeps the overhead down, but if you rarely look at the statistics, running it as a CGI might be a better fit. Whichever works best for you, AWStats can accommodate.

At the time of writing, the current version of AWStats is 6.95, and it can be downloaded from the home page as a tar file, zip file, or noarch RPM file. If you run CentOS or Red Hat Enterprise Linux and have the RPMForge third-party repository set up, you can use yum to install the latest version of AWStats; the same goes for Fedora. For Debian and Ubuntu users, AWStats is available via apt-get or Aptitude.
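As a rough sketch of the package-manager route, the helper below only prints the command for a given package manager; it does not install anything. The function name awstats_install_cmd is made up for illustration, and the sketch assumes the package is simply called awstats in each repository.

```shell
#!/bin/bash
# Print (but do not run) the install command for a given package manager.
# awstats_install_cmd is a hypothetical helper name, not part of AWStats.
awstats_install_cmd() {
    case "$1" in
        yum)      echo "yum install awstats" ;;
        apt-get)  echo "apt-get install awstats" ;;
        aptitude) echo "aptitude install awstats" ;;
        *)        echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

# Show the command for the first package manager found on this machine.
for pm in yum apt-get aptitude; do
    if command -v "$pm" >/dev/null 2>&1; then
        awstats_install_cmd "$pm"
        break
    fi
done
```

Run the printed command as root (or through sudo) to perform the actual install.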

Once AWStats is installed, you should be able to get the CGI to load right away. There might not be much to see yet, but it should load. If AWStats is located in /var/www/awstats/, add the following to the Apache virtual host definition for the domain you wish to view:

Alias /awstats/icon/ /var/www/awstats/icon/
ScriptAlias /awstats/ /var/www/awstats/
<Directory /var/www/awstats/>
    DirectoryIndex awstats.pl
    Options ExecCGI
    order deny,allow
    deny from all
    allow from 192.168
</Directory>

Then you should be able to visit http://www.yourdomain.com/awstats/awstats.pl and be greeted by a good healthy error. That is because no configuration has been done yet, but we can get around it by using the default “localhost.localdomain” configuration file that is already present: visit http://www.yourdomain.com/awstats/awstats.pl?config=localhost.localdomain instead. (This assumes you use the RPMForge package. If you grabbed the tar or zip file, you need to move the files into place and create this configuration file first: create /etc/awstats/ and copy awstats-6.95/wwwroot/cgi-bin/awstats.model.conf from the distribution archive to /etc/awstats/awstats.localhost.localdomain.conf.)
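For the tar or zip route, creating the configuration file from the model could be sketched as follows. The function name create_awstats_config is hypothetical, and refusing to overwrite an existing file is a choice made for this sketch, not something AWStats requires.

```shell
#!/bin/bash
# Create awstats.<name>.conf in a config directory from the model file.
# Arguments: <model file> <config dir> <config name>
# create_awstats_config is a hypothetical helper, not shipped with AWStats.
create_awstats_config() {
    local model="$1" confdir="$2" name="$3"
    local target="$confdir/awstats.$name.conf"
    [ -f "$model" ] || { echo "model file not found: $model" >&2; return 1; }
    mkdir -p "$confdir" || return 1
    # Refuse to overwrite a configuration that is already in place.
    if [ -e "$target" ]; then
        echo "already exists: $target" >&2
        return 1
    fi
    cp "$model" "$target"
}

# Example (as root, from the directory where the archive was unpacked):
# create_awstats_config awstats-6.95/wwwroot/cgi-bin/awstats.model.conf /etc/awstats localhost.localdomain
```

The same helper works for any further site: pass www.domain.com as the last argument to get /etc/awstats/awstats.www.domain.com.conf.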

You can then copy /etc/awstats/awstats.localhost.localdomain.conf to /etc/awstats/awstats.www.domain.com.conf to view the statistics for that domain name. Edit the new file and, at a bare minimum, set the log file to examine:

LogFile="/var/log/httpd/intranet-access_log"
SiteDomain="domain.com"
HostAliases="www.domain.com"
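If you would rather script the three settings above than edit the file by hand, something like the following might do. It is only a sketch: set_awstats_option is a made-up helper name, and it assumes each option sits on its own uncommented line of the form Key="value", as in the model file.

```shell
#!/bin/bash
# Set KEY="VALUE" in an AWStats config file: replace an existing uncommented
# KEY= line, or append the setting if no such line is present.
# set_awstats_option is a hypothetical helper, not part of AWStats.
# Limitation: VALUE must not contain "|", "&", or backslashes (sed specials).
set_awstats_option() {
    local file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        # "|" is the sed delimiter so that paths containing "/" need no escaping.
        sed "s|^${key}=.*|${key}=\"${value}\"|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s="%s"\n' "$key" "$value" >> "$file"
    fi
}

# Example (as root):
# conf=/etc/awstats/awstats.www.domain.com.conf
# set_awstats_option "$conf" LogFile /var/log/httpd/intranet-access_log
# set_awstats_option "$conf" SiteDomain domain.com
# set_awstats_option "$conf" HostAliases www.domain.com
```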

The LogFile line tells AWStats that, for this configuration file, /var/log/httpd/intranet-access_log is the log file to parse. Before firing up the new URL, however, you need to update the database, which can be done by creating /etc/cron.hourly/awstats with the following contents:

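#!/bin/bash
# Hourly AWStats update: refresh the database for every configuration in /etc/awstats.
# The guard skips the run when the default Apache access log is missing.
if [ -f /var/log/httpd/access_log ] ; then
    exec /usr/bin/awstats_updateall.pl now -configdir="/etc/awstats" -awstatsprog="/var/www/awstats/awstats.pl" >/dev/null 2>&1
fi
exit 0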

The script above assumes particular install paths for awstats_updateall.pl and awstats.pl; adjust them to match your system. If you downloaded the tar or zip file, awstats_updateall.pl is in the awstats-6.95/tools/ directory. Because the script lives in /etc/cron.hourly/, it runs every hour and keeps the database updated.
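Rather than waiting for the first hourly run, you can update a single configuration by hand, since awstats.pl accepts -config and -update on the command line. The wrapper below is a small sketch of that; update_awstats_config is a hypothetical name, and the awstats.pl path in the example is the one assumed throughout this post.

```shell
#!/bin/bash
# Run a one-off database update for a single AWStats configuration.
# Arguments: <path to awstats.pl> <config name>
# update_awstats_config is a hypothetical wrapper, not part of AWStats.
update_awstats_config() {
    local awstats="$1" config="$2"
    [ -x "$awstats" ] || { echo "not executable: $awstats" >&2; return 1; }
    "$awstats" -config="$config" -update
}

# Example (as root):
# update_awstats_config /var/www/awstats/awstats.pl www.domain.com
#
# The cron script itself must be executable, or it will be skipped:
# chmod 755 /etc/cron.hourly/awstats
```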

Now you can visit http://www.domain.com/awstats/awstats.pl?config=www.domain.com and view the AWStats statistics page.

Getting AWStats to parse mail and FTP logs is just as easy: the online documentation is quite helpful, and the configuration files are very heavily commented.

AWStats packs a lot of statistics into its pages. They can offer real insight into who your visitors are and how they use your site. For those looking to improve or tune their site for their audience, AWStats information can prove invaluable.
