8.14.2008

Urchin Q and A: Time Zones and Apache Logs

Question:

How does Urchin use the time zone offset found in Apache logs?

Answer:

Urchin applies the time zone offset to each hit's timestamp, which can change the date the hit is recorded under. For example, if a hit is logged at 3 AM on January 1 and the time zone offset is -0400 (minus 4 hours), Urchin updates the internal database for December 31 and not January 1.

Here's an actual log file hit that I processed to illustrate the behavior. Note the date and time of the hit (July 16, 3:11 AM, offset -0400) and the 'Day' that Urchin updates in the internal database (Day 15).

Hit: 206.188.4.165 - - [16/Jul/2006:03:11:03 -0400] "HEAD /healthcheck.htm HTTP/1.1" 200 0 "-" "-" "-"
apache_time [16/Jul/2006:03:11:03 -0400]
c_ip 206.188.4.165
cs_request HEAD /healthcheck.htm HTTP/1.1
sc_status 200
sc_bytes 0
cs_useragent -
cs_cookie -
cs_referer -
cs_host -
request_method HEAD
request_url /healthcheck.htm
request_version HTTP/1.1
request_uri /healthcheck.htm
request_stem /healthcheck.htm
request_directory /
request_filename healthcheck.htm
request_mime htm
request_origfilepath /healthcheck.htm
request_origmime htm
useragent_complete (unknown) - (unknown)
browser_base (unknown)
platform_base (unknown)
log_source_name 1-gold-web
nonpages 1
hits 1
validhits 1
nonutmhits 1
nonrobothits 1
HDB Update(Table 9, Day 15): (unknown) - (unknown) 1
HDB Update(Table 12, Day 15): /healthcheck.htm 1
HDB Update(Table 13, Day 15): htm 1
HDB Update(Table 14, Day 15): 200 1
HDB Update(Table 26, Day 15): 1-gold-web 1
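
The date arithmetic implied by this output can be sketched in a few lines of Python. This is only a model of the observed behavior, not Urchin's actual code: the function name `urchin_adjusted_time` is made up for illustration, and it assumes the signed offset is simply added to the logged time, which is what the Day 15 updates above suggest. Note that this is not the same as converting the timestamp to UTC.

```python
from datetime import datetime, timedelta


def urchin_adjusted_time(apache_time: str) -> datetime:
    """Hypothetical model: add the signed offset to the logged time.

    Accepts an Apache timestamp such as "[16/Jul/2006:03:11:03 -0400]".
    Month names are parsed with %b, so an English/C locale is assumed.
    """
    stamp = apache_time.strip("[]")
    logged_part, offset_part = stamp.rsplit(" ", 1)

    # Logged wall-clock time, without the offset.
    logged = datetime.strptime(logged_part, "%d/%b/%Y:%H:%M:%S")

    # Offset is formatted as +HHMM or -HHMM.
    sign = -1 if offset_part[0] == "-" else 1
    offset = sign * timedelta(hours=int(offset_part[1:3]),
                              minutes=int(offset_part[3:5]))

    # Assumption: Urchin buckets the hit under (logged time + signed offset).
    return logged + offset


adjusted = urchin_adjusted_time("[16/Jul/2006:03:11:03 -0400]")
print(adjusted.day)  # 15, matching "Day 15" in the HDB updates
```

Under the same assumption, a hit stamped `[01/Jan/2006:03:00:00 -0400]` lands on December 31 of the previous year, as in the example in the answer.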
