Custom Web Page Hit Count Reports from
daily access_logs generated by Apache servers


Creates meaningful, easy-to-read, one-page monthly reports: a running report for this month and a final report for last month. Counts hits on specific web pages, including all files of a given type in a directory, and hits that arrive from links on search engine results pages. Lists the search strings used to reach each page. Does not count returns to the home page from within the website or, in most cases, page refreshes. Provides a mechanism to ignore webmaster hits. It goes beyond Webtrends and eXTReMe Tracking.

This month's running report

Last month's final report

Previous month's final report

Skipped hits for checked-in webmaster
This shows pages not counted while the webmaster was checked in.

Yesterday's history log
This shows referring pages, that is, the page each visitor was on when coming to this website.

Running search log
These are hits from the results pages of search engines. Shows the search string and the page visited.
Note that the history log above shows ALL referring pages except those from this website.

Page-by-page search report

This website is hosted by Hurricane Electric Internet Services (www.he.net) in Fremont, CA. H.E. uses /home/accountname/logs/access_log for its accounting purposes. Before deleting that log at midnight, H.E. rolls it over to /home/accountname/access_log. My hit-count program runs after midnight and analyzes the rolled-over access_log. It is an alternative to Hurricane Electric's "Status Report" -> "Web Activity Report" -> "Total Transfers By File".
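
As a rough illustration of what analyzing the rolled-over access_log involves, here is a minimal Python sketch that splits one log record into fields. It is not the Perl program described on this page. It assumes the server writes Apache's "combined" log format (which includes the referrer), and the names LOG_PATTERN and parse_line, as well as the sample record, are made up for this example.

```python
import re

# Assumption: Apache "combined" log format, i.e.
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one access_log record as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Hypothetical record, invented for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2008:13:55:36 -0700] '
          '"GET /news/ HTTP/1.0" 200 2326 '
          '"http://www.google.com/search?q=retirees+club" "Mozilla/4.0"')

record = parse_line(sample)
print(record["path"], record["referrer"])
```

The requested path and the referrer are the two fields that the page counts and search log would be built from; lines that do not fit the assumed format are simply skipped by returning None.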

  • This hit-count program is a cron job that runs at 1 A.M. every night, and processes yesterday's data.
  • It generates a running report in web page format for the current month.
  • At the end of the month it creates a final web page report for last month.
  • It tracks hits only for desired pages that are listed in the text file pagelist.txt. To track all files of a given type in a directory, such as all the pdf files in the news directory, news/*.pdf can be specified.
  • It does not count most page refreshes, nor returns to the home page from within this website.
  • Webmaster hits can be omitted from the count via a webmaster check-in page, or by copying index.html to nocount.html (see nixlist.txt info below).
  • It treats /, /xxxxx and /xxxxx/ as hits on index.htm (or index.html), where xxxxx is a directory.
  • It shows hits from the results pages of search engines listed in the file searchlist.txt (e.g. google.com, yahoo.com). For example, someone did a Google search and clicked on a result link that led to the website.
  • As shown above at the link for "Running search log", the search log is a running file that clearly shows the page referenced, search engine, and search string.
  • Hit counts are ignored when a referrer is listed in file nixlist.txt. To ignore webmaster hits, the home page can be copied to, for example, nocount.html. Put "nocount.html" in nixlist.txt, and use nocount.html as the webmaster's home page.
  • The History Log shows execution detail for a day's run, including run time, time of the first access_log record, referring pages, total pagecounts skipped due to webmaster hits, refresh, returns to home page, etc.
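
The matching rules in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Perl program itself: the list contents stand in for pagelist.txt, searchlist.txt and nixlist.txt, the set of directory names is supplied by hand, index.html is assumed as the index file, and the query parameters q (Google) and p (Yahoo) are assumed to carry the search string. All function names are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, parse_qs

# Hypothetical stand-ins for pagelist.txt, searchlist.txt and nixlist.txt.
PAGELIST = ["index.html", "news/index.html", "news/*.pdf"]
SEARCHLIST = ["google.com", "yahoo.com"]
NIXLIST = ["nocount.html"]

def normalize(path, directories):
    """Treat /, /xxxxx and /xxxxx/ as a hit on the directory's index.html."""
    page = path.split("?", 1)[0].strip("/")
    if page == "":
        return "index.html"
    if page in directories:
        return page + "/index.html"
    return page

def is_tracked(page):
    """True if the page matches an entry such as news/*.pdf in the page list."""
    return any(fnmatchcase(page, pattern) for pattern in PAGELIST)

def is_ignored(referrer):
    """True if the referrer contains an entry from the nix list."""
    return any(name in referrer for name in NIXLIST)

def search_string(referrer):
    """Return (engine, search string) for a listed search engine, else None."""
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    for engine in SEARCHLIST:
        if host == engine or host.endswith("." + engine):
            params = parse_qs(parts.query)
            for key in ("q", "p"):  # assumed query parameter names
                if key in params:
                    return engine, params[key][0]
            return engine, ""
    return None

page = normalize("/news/", {"news"})
print(page, is_tracked(page))
print(search_string("http://www.google.com/search?q=retirees+club"))
```

Note that the `*` in fnmatchcase also matches `/`, so in this sketch news/*.pdf would match pdf files in subdirectories of news as well; whether the real program behaves that way is not stated here.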

See other websites using my hit count program:
Hewlett Packard Bay Area Retirees Club
Sunset Oaks Homeowners Association

A copy of this 1200-line Perl program, highly documented and commented, is available for $35.00.
See the email address at the top of my home page. Note that your web server might handle access_log differently. Your webmaster should be familiar with the following web server procedures.

1. telnet or SSH logon access to the web server, and basic UNIX knowledge.
2. Modify the configuration parameters in the Perl script, upload it to cgi-bin/ with FTP PUT, and chmod 755 it.
3. On the web server, cd /home/accountname, then touch access_log.
4. On the web server, run crontab -e and create the crontab entry:
     0 1 * * * /home/accountname/cgi-bin/page_hits.pl

Click here to view the configuration section of the hit count program.

- Jim Hartsell