Log File Data for Web Traffic Analysis
Many web hosting servers use one of several standard log file formats when creating their logs. These standards enable commercial log analysis programs to read the log files of all popular web servers.
The original web server programs created logs in the common log format, which uses several files to store the information collected about the files served. These log files include:
Access_log: The main log file, which captures filenames, IP addresses, dates, times and other data.
Referer_log: The file that captures the URLs of the web sites from which users came.
Error_log: The file that includes failed requests for files and system error messages.
Data in Log Files
The common log file format records eight fields of information for each HTML and graphic file served.
Address field: Either the IP address or domain name from which the request came.
Id field: Generally not used, for security and privacy reasons.
AuthUser field: Used when the user must supply a username and password to see the pages.
Date and Time field: The date and time at which the request was received.
Method field: The HTTP command that the user's web browser sent to the web server to make the request; usually either the GET command, for requesting static HTML and graphic files, or the POST command, for data supplied from a form. It can also be the HEAD command, for requests from certain agents such as search engine spiders.
File name field: The name of the file served to the request sender.
Status field: Status or error code, indicating whether the request was successful.
Size field: The size of the file served to the request sender.
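To make the field list concrete, here is a minimal Python sketch that splits one access log line into the eight fields described above. The sample line, the function name parse_clf_line and the dictionary keys are illustrative assumptions, not part of any particular server's tooling, and real logs may deviate from the pattern used here.

```python
import re
from datetime import datetime

# One line of a common-log-format access log. Group names mirror the
# field list above; they are labels chosen for this sketch.
CLF_PATTERN = re.compile(
    r'^(?P<address>\S+) (?P<id>\S+) (?P<authuser>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<filename>\S+)(?: [^"]+)?" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf_line(line):
    """Return a dict of the eight fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the text fields that have a natural non-string type.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    fields["status"] = int(fields["status"])
    # A "-" in the size position means no bytes were returned.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# Hypothetical sample line, made up for illustration.
sample = (
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.0" 200 2326'
)
record = parse_clf_line(sample)
print(record["address"], record["method"], record["filename"], record["status"])
```

A "-" in the Id or AuthUser position indicates that the field was not recorded, which matches the note above that the Id field is generally unused.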
