Awstats

From Digital Scholarship Group

[Note: this page is about our conversion from the old awstats system (6.5) to the new awstats system (7.7) that Syd & Ash are working on as I write this, i.e. summer 2020. We are primarily interested in documenting the new, but want some breadcrumbs or comparisons to the old sometimes, too.]

cron job

Currently (on production), user root runs a cron job once every 42 hours that executes /var/www/cgi-bin/awstats/awstatsRunAll.sh. We are likely to change that interval to 24 hours for the new system.

new paths

“In Unix, 90% of all errors are either paths or permissions.” — Paul Caton, circa 2007.

Herein is just a list of the pertinent places as they currently exist on wwp-test, and supposedly will exist on wwp soon.

/var/log/apache2/
The log files to be analyzed are stored here, in particular in the HOLD/ subdirectory (which used to be called OLD/, but that caused some problems :-).
/usr/share/awstats/
The installation of awstats itself, as used on our RHEL 6 servers. The main Perl script that powers awstats is in the wwwroot/cgi-bin/ subdirectory. Most of our local mods have been made in the tools/ subdirectory, which is under version control. Note that when generateAndRunAllConfFiles.bash is run by anacron (via the /etc/cron.daily/generateAndRunAllConfFiles.bash symlink), the debugging messages it generates with echo end up in /var/mail/root.
/usr/local/awstats
The installation of awstats itself, created by the awstats v7.8 installer on RHEL 8. The main Perl script that powers awstats is in the wwwroot/cgi-bin/ subdirectory. Most of our local mods have been made in the tools/ subdirectory, which is under version control. Note that when generateAndRunAllConfFiles.bash is run by anacron (via the /etc/cron.daily/generateAndRunAllConfFiles.bash symlink), the debugging messages it generates with echo end up in /var/mail/root.
/var/lib/awstats/
A directory awstats uses by default, but we do not use it. It may exist, but don’t bother poking around in there for our stuff.
/var/www/html/usage/awstats/
The output of the actual awstats command (/usr/share/awstats/wwwroot/cgi-bin/awstats.pl) goes here. It is launched by /usr/share/awstats/tools/runStats.bash, reads in the config file provided as a parameter (which is typically in /etc/awstats/), and writes out these puppies. Note also that we need a copy of index.php here, which we currently handle by symlinking so we can keep it in a directory that is under version control.
/etc/awstats/
The configuration files generated by makeConfFiles.pl are stored here. (It, BTW, is run from /usr/share/awstats/tools/generateAndRunAllConfFiles.bash.) If you want to make a change to the configuration files, either change that program or the template file it reads (which is /usr/share/awstats/tools/awstats.wwp.template.conf).
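In the spirit of the "paths or permissions" quote above, here is a minimal sketch of a check that each pertinent directory exists and is readable by whoever runs it. The function name check_paths is made up for illustration and is not one of our tools/ scripts; the paths in the example call are just the ones listed on this page.

```shell
#!/usr/bin/env bash
# Sketch: report whether each path given as an argument exists and is readable.
# check_paths is a hypothetical helper, not part of awstats or our tools/ dir.

check_paths() {
  local status=0 p
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "MISSING     $p"
      status=1
    elif [ ! -r "$p" ]; then
      echo "UNREADABLE  $p"
      status=1
    else
      echo "OK          $p"
    fi
  done
  # Non-zero if any path was missing or unreadable.
  return "$status"
}

# Example call (commented out so sourcing this file has no side effects):
# check_paths /var/log/apache2/HOLD /usr/share/awstats/tools \
#             /etc/awstats /var/www/html/usage/awstats
```

Run it as the same user the daily job runs as (root, on our servers), since readability depends on who is asking.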

old paths

  • /var/www/cgi-bin/awstats/
  • /var/www/cgi-bin/awstats_priv/
  • /var/www/awstats/ (We’re not 100% sure about this one.)
  • /etc/awstats/
  • /var/www/html/usage/awstatsMain/
  • /var/www/html/usage/awstats/
    • Is symlinked to /var/www/usage/awstats; we don’t know why
    • has subdir /var/www/html/usage/awstats/cronReports/
  • /var/log/apache2/ (used OLD/ instead of HOLD/ child dir)

Missing AWStats data

On 2020-09-08, we copied all awstatsMOYEAR.txt files from the backup data files (/mnt/WWPprod-Data/backup_awstats_6.5_stuff_pre-install_7.7_2020-09/2020-09-04T11:04:16/usage/awstats/), to the new institution data folders, AS LONG AS that filename didn't already exist in the target directory. (We used cp -n.) This action gave us AWStats data from before April 2017, augmenting what we already had for April 18, 2017 and onward. This leaves a gap in data from April 1 to 17, 2017.

We found that the missing chunk of data from April 2017 was present in the backup directory, but since the 042017 file already existed (i.e., in both the backup and the target directory), it was not copied over and the 04-01–04-17 data was thus not showing up on the web.
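The skipped 042017 file would have been easier to spot if the copy had told us what it left alone. Below is a rough sketch of a no-clobber copy that behaves like cp -n but names every file it skips. The function name copy_no_clobber and the awstats*.txt glob are assumptions made for illustration; this is not what we actually ran.

```shell
#!/usr/bin/env bash
# Sketch: copy awstats data files from a backup directory into a target
# directory without overwriting anything (like `cp -n`), but print the name
# of every file that is skipped because it already exists in the target.
# copy_no_clobber and the glob below are illustrative assumptions.

copy_no_clobber() {
  local src_dir=$1 dest_dir=$2 f name
  for f in "$src_dir"/awstats*.txt; do
    [ -e "$f" ] || continue          # the glob matched nothing
    name=$(basename "$f")
    if [ -e "$dest_dir/$name" ]; then
      echo "SKIPPED $name"           # same name on both sides: data may differ
    else
      cp -p "$f" "$dest_dir/$name"
      echo "COPIED  $name"
    fi
  done
}
```

Any SKIPPED line marks a month where the backup and the target each hold a file of the same name, which is exactly the April 2017 situation described above.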

To test our options, we deleted all Northeastern_University_Private_IPs data from 2017-05 onward. We moved the 042017 data file (which has the data for 2017-04-18 to the end of 2017-04) out of the way (by renaming it), and copied the 042017 data file from the backup (which has the 04-01–04-17 data) into its place. We also changed the name of the "dnscache" file so it would be regenerated. Finally, we re-ran the AWStats generation process on the institution, using /usr/bin/perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=Northeastern_University_Private_IPs -update (which we copied from /usr/share/awstats/tools/runStats.bash :-).
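The swap described above can be sketched roughly as follows. The function name swap_in_backup and the .aside suffix are invented for illustration, and the caller supplies the real file names. The dnscache rename is left out because this page does not record that file's exact name. The awstats.pl command is only printed, not executed, so the sketch is safe to try on copies.

```shell
#!/usr/bin/env bash
# Sketch of the test procedure: set the current month file aside, put the
# backup copy in its place, then print the update command to run next.
# swap_in_backup and the ".aside" suffix are hypothetical.

swap_in_backup() {
  local live_file=$1 backup_file=$2 config=$3
  if [ ! -f "$backup_file" ]; then
    echo "no backup file: $backup_file" >&2
    return 1
  fi
  if [ -e "$live_file" ]; then
    mv "$live_file" "$live_file.aside"   # keep the newer data under a new name
  fi
  cp -p "$backup_file" "$live_file"
  # Printed rather than run; this is the command copied from runStats.bash.
  echo "/usr/bin/perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=$config -update"
}
```

Keeping the renamed file around matters here, because it is the only copy of the 04-18 onward data once the backup file has taken its place.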

As a result, we retained the data from 04-01–04-17, converted from plain text to XML format (because we’re using XML in 7.7; we had used plain text in 6.5). We did not get the data from 04-18 onward back. The institution also did not pick up logged data from May 2017.