CSc 231 Assignment 2
CSc 231 Perl Assignments
70 pts
Bringing You All The Hits
Due: Mar 9
Write a program to compute and print statistics from the Apache web server log on Sandbox. Here's a run of mine:
GET in the above example, which is by far the most common.
The number of unique visitors is just the number of unique client IP addresses. Use a hash to keep track of the ones you've seen before.
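Here is a minimal sketch of the seen-before hash. It assumes the log uses the standard Apache common log format, where the client address is the first field on the line. The sample lines and the names %seen and $unique are made up for illustration, not taken from Sandbox.

```perl
use strict;
use warnings;

# Hypothetical sample lines in Apache common log format (not real Sandbox data).
my @lines = (
    '10.0.0.1 - - [09/Mar/2004:14:05:32 -0500] "GET /index.html HTTP/1.1" 200 1234',
    '10.0.0.2 - - [09/Mar/2004:14:06:01 -0500] "GET /a.html HTTP/1.1" 200 99',
    '10.0.0.1 - - [09/Mar/2004:14:07:10 -0500] "POST /form HTTP/1.1" 302 0',
);

my %seen;    # key: client IP address, value: number of hits from that address
for my $line (@lines) {
    # Assumption: the client address is the first whitespace-delimited field.
    my ($ip) = split ' ', $line;
    next unless defined $ip;    # skip blank lines
    $seen{$ip}++;
}

# keys in scalar context yields the number of distinct keys.
my $unique = keys %seen;
print "Unique visitors: $unique\n";
```

In a real program the loop would read lines from the log file instead of an array, but the hash bookkeeping would be the same.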
For the hits per hour, compute the time period for each file as the difference between the last and first time stamp. Compute that time in hours for each file, total the times, and divide the total hits by the total time.
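The arithmetic might look like the following sketch. The per-file numbers are invented, and the layout of @files (first time stamp, last time stamp, hit count) is just one assumed way to hold the data.

```perl
use strict;
use warnings;

# Hypothetical per-file summaries: [first time stamp, last time stamp, hits],
# with time stamps in Unix seconds.
my @files = (
    [ 1_000_000, 1_007_200, 300 ],    # covers 7200 s = 2 hours
    [ 2_000_000, 2_003_600, 100 ],    # covers 3600 s = 1 hour
);

my ($total_hits, $total_hours) = (0, 0);
for my $f (@files) {
    my ($first, $last, $hits) = @$f;
    $total_hours += ($last - $first) / 3600;    # seconds to hours
    $total_hits  += $hits;
}

# Guard against dividing by zero if the files cover no time at all.
my $rate = $total_hours > 0 ? $total_hits / $total_hours : 0;
printf "Hits per hour: %.2f\n", $rate;
```

Note that this divides total hits by total hours. It does not average the per-file rates, which would generally give a different answer.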
For computing the time from the timestamp in the file, first extract
the parts. You can ignore the last part, which is the time zone,
as our server does not relocate during operation.
A pattern works well for this, or you can use split.
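A pattern for pulling the parts out could look something like this sketch. It assumes the usual Apache timestamp layout, [day/month/year:hour:min:sec zone]. The sample line is invented.

```perl
use strict;
use warnings;

# Hypothetical log line; the bracketed timestamp layout is an assumption.
my $line = '10.0.0.1 - - [09/Mar/2004:14:05:32 -0500] "GET /index.html HTTP/1.1" 200 1234';

my ($day, $mon, $year, $hour, $min, $sec);
# Capture the six parts; the time zone after the space is deliberately ignored.
if ($line =~ m{\[(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+)\s}) {
    ($day, $mon, $year, $hour, $min, $sec) = ($1, $2, $3, $4, $5, $6);
    printf "day=%s mon=%s year=%s time=%s:%s:%s\n",
        $day, $mon, $year, $hour, $min, $sec;
}
else {
    warn "No timestamp found in: $line\n";
}
```

The month comes out as a name (Mar), so it still needs converting to a number before it can go to timelocal.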
When you get the parts, you can use timelocal to convert the time
to a Unix time stamp in seconds. The difference between the first and
last times will be the length of time covered by the file in seconds.
On Sandbox, say man Time::Local to find out how to use this beastie.
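As a rough sketch of the call: timelocal takes its arguments in the order (sec, min, hour, mday, mon, year), and the month is counted from 0, so March is 2. The two times below are invented for illustration.

```perl
use strict;
use warnings;
use Time::Local;

# Hypothetical first and last times from one file, both on 09/Mar/2004.
# Argument order: sec, min, hour, day of month, month (0 = January), year.
my $first = timelocal(32,  5, 14, 9, 2, 2004);    # 14:05:32
my $last  = timelocal(32, 35, 16, 9, 2, 2004);    # 16:35:32

my $seconds = $last - $first;    # time covered, in seconds
my $hours   = $seconds / 3600;   # same span, in hours
printf "%d seconds = %.2f hours\n", $seconds, $hours;
```

The man page is still the authority on how years and out-of-range values are handled.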
You may also want to run man localtime and see the description of
the fields in struct tm. These are relevant to the ranges of the
values.
Generally, hashes are your friends. They're very useful for keeping
track of which visitors you have already seen, and for keeping counts by
request type or code. You may also find one useful to map month names
from the log file to numbers for timelocal.
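For instance, the month map and the counting hashes might be sketched like this. The names %month, %by_type, and %by_code are made up, the sample lines are invented, and the pattern assumes the request and status code appear as "METHOD path protocol" followed by a three-digit code.

```perl
use strict;
use warnings;

# Map month names to the 0-based month numbers that timelocal expects.
my @names = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %month;
@month{@names} = (0 .. 11);    # hash slice: Jan => 0, ..., Dec => 11

# Hypothetical sample lines (not real Sandbox data).
my @lines = (
    '10.0.0.1 - - [09/Mar/2004:14:05:32 -0500] "GET /index.html HTTP/1.1" 200 1234',
    '10.0.0.2 - - [09/Mar/2004:14:06:01 -0500] "GET /a.html HTTP/1.1" 200 99',
    '10.0.0.1 - - [09/Mar/2004:14:07:10 -0500] "POST /form HTTP/1.1" 302 0',
);

my (%by_type, %by_code);    # counts keyed by request type and by status code
for my $line (@lines) {
    # Assumed layout: "METHOD path protocol" then a three-digit status code.
    if ($line =~ m{"(\w+)\s[^"]*"\s+(\d{3})\s}) {
        $by_type{$1}++;
        $by_code{$2}++;
    }
}

print "Mar is month number $month{Mar}\n";
print "$_: $by_type{$_}\n" for sort keys %by_type;
print "$_: $by_code{$_}\n" for sort keys %by_code;
```

Because Perl creates hash entries on first use, there is no need to initialize a count to zero before incrementing it.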
You may want to study the example program which also reads the log files. It is rather different from this assignment, but has some things in common, and some code you might want to swipe.
The log files on Sandbox are located in /var/log/httpd/. The current
one, which the server is writing, is access_log, and older ones are
access_log.1, access_log.2, etc. On Sandbox, they are set to be
publicly readable.
If you want to test somewhere other than Sandbox, you can copy a log file,
or a portion of one, to another computer for testing. In any case, you
should create at least one test file containing just a few lines from
the log file. This will allow you to compute the correct answers by
hand in order to test your program.