browser | cache | hit | request | server | server log | server log entry | visit, visitor | an example
The browser is the type of software with which you surf the Web. Netscape's Navigator (NN) and Microsoft's Internet Explorer (IE) are the most common browsers at Ricci Street as well as the rest of the Web.
The folks at browserstats.com categorize over four thousand different browsers.
To cache Web pages locally means to store them on the browser's computer. The page always displays faster when you hit your Back button because it is coming from your machine, not the original server.
To cache Web pages remotely means to store them on your ISP's computer. The practice is widespread because it speeds delivery to you and reduces the ISP's bandwidth usage. Universities outside the U.S. also use a cache, much like a reserve book collection, to reduce telephone charges.
From the original website's point of view, caching is a problem only because it causes visits and visitors to be undercounted.
A request is closest to what people usually mean by a hit. A hit is any transaction with the server. A request is a successful hit.
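As a rough sketch of that distinction, the Python below tallies hits and requests from a list of status codes. The sample codes are made up, and treating any status code below 400 as "successful" is my assumption; the definition above does not say where the line falls.

```python
# Hypothetical status codes, one per logged transaction.
status_codes = [200, 200, 404, 200, 304, 500]


def count_hits_and_requests(codes):
    """Return (hits, requests): every transaction vs. the successful ones.

    Assumption: "successful" means a status code below 400.
    """
    hits = len(codes)
    requests = sum(1 for code in codes if code < 400)
    return hits, requests


hits, requests = count_hits_and_requests(status_codes)
print(f"{hits} hits, {requests} requests")
```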
When you click on a link, for example, sitestats.htm, what happens?
1) The browser looks in your hard drive's cache for the .htm file and for any image, script, or other files embedded in it. If it does find the page, it compares the local "last-modified" timestamp with that on the remote server where sitestats.htm is stored. If the local timestamp is the same, the browser displays the page off your local cache.
2) If the files aren't in your local cache, the browser looks to any caches at the gateway where your local network or your service provider connects to the Internet.
3) If the files aren't in one of those caches, only then does the browser make its request to the server. It makes one request for each file. Each of those requests is logged.
See the specific example below.
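The three steps above can be sketched in Python. This is a simplified model, not how any particular browser is written: the function name `fetch`, the dictionaries standing in for caches, and the timestamps are all hypothetical, and the sketch reads the server's timestamp directly where a real browser would have to ask the server for it.

```python
def fetch(filename, local_cache, gateway_cache, server_files, access_log):
    """Return (content, source) for one file, following the three steps.

    Every cache is a dict mapping filename -> (last_modified, content).
    """
    # Step 1: the local cache, used only if its timestamp matches the server's.
    if filename in local_cache:
        stamp, content = local_cache[filename]
        if stamp == server_files[filename][0]:
            return content, "local cache"
    # Step 2: a cache at the gateway or service provider.
    if filename in gateway_cache:
        return gateway_cache[filename][1], "gateway cache"
    # Step 3: ask the server itself; only this step adds a log entry.
    access_log.append(filename)
    local_cache[filename] = server_files[filename]
    return server_files[filename][1], "server"


server_files = {"sitestats.htm": ("1998-11-29", "<html>...</html>")}
local, gateway, log = {}, {}, []
print(fetch("sitestats.htm", local, gateway, server_files, log)[1])  # first click
print(fetch("sitestats.htm", local, gateway, server_files, log)[1])  # second click
print(log)
```

The second click is answered from the local cache, so the log holds only one entry for two page views. That is the undercounting described under caching.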
The computer that stores web pages and delivers them upon a browser's request is called a server. The server is usually characterized as remote while the browser, or client, is characterized as local. The term server also refers to the software that processes requests from clients.
The server records some of the details of browser requests in a separate file called a log. For the Apache server used by Ricci Street, this separate file is called access_log. That's the file I download every month and run through a log analyzer like OpenWebScope.
What does an entry look like?
http://tolearn.net/coreskills.htm Mozilla/2.0 (compatible; MSIE 3.02; Update a; Windows 95) spc-isp-ott-uas-10-36.sprint.ca - - [29/Nov/1998:11:36:32 -0500] GET /marketing/images/courselogol.gif HTTP/1.0 200 13083
How does it read?
It tells me the page you're coming from, the browser and the operating system you're using, your ISP, the date and time, and the specific file you're requesting, in this case a .gif. The 200 is the HTTP status code, meaning the server delivered the file successfully, and 13083 is the number of bytes sent, about 13 KB.
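To show how such an entry can be taken apart, here is a minimal Python sketch. The pattern is written for the field order of this one sample entry, and the field names (`referrer`, `browser`, `host`, and so on) are my own labels, so a log with a different layout would need a different pattern.

```python
import re
from datetime import datetime

# Pattern for the layout of the sample entry: referrer, browser, host,
# two unused fields, [timestamp], request line, status code, bytes sent.
ENTRY = re.compile(
    r"^(?P<referrer>\S+) (?P<browser>.*) (?P<host>\S+) \S+ \S+ "
    r"\[(?P<time>[^\]]+)\] (?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+) "
    r"(?P<status>\d{3}) (?P<size>\d+)$"
)

line = (
    "http://tolearn.net/coreskills.htm "
    "Mozilla/2.0 (compatible; MSIE 3.02; Update a; Windows 95) "
    "spc-isp-ott-uas-10-36.sprint.ca - - [29/Nov/1998:11:36:32 -0500] "
    "GET /marketing/images/courselogol.gif HTTP/1.0 200 13083"
)


def parse_entry(text):
    """Split one log line into named fields, converting the numeric ones."""
    match = ENTRY.match(text)
    if match is None:
        raise ValueError("line does not match the expected layout")
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = int(entry["size"])
    return entry


entry = parse_entry(line)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```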
To make server logs more useful, the OpenWebScope software groups requests in two ways.
A visit is a series of consecutive requests from a specific IP number.
A visitor is a specific IP number. Most Internet service providers assign IP numbers from an available range as accounts dial in, so one IP number is not necessarily one person.
WebTrends, the industry leader, slices the same data a little differently.
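The visit and visitor definitions can be sketched in Python as follows. The IP numbers are invented, and the sketch reads "consecutive" literally, as an unbroken run of log entries from one IP number. That is my assumption, not a description of how OpenWebScope or WebTrends actually groups requests.

```python
from itertools import groupby

# Hypothetical IP numbers, in the order their requests appear in the log.
logged_ips = ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"]


def count_visits_and_visitors(ips):
    """Return (visits, visitors) for a sequence of requesting IP numbers."""
    # A visit: one unbroken run of requests from the same IP number.
    visits = sum(1 for _ip, _run in groupby(ips))
    # A visitor: one distinct IP number.
    visitors = len(set(ips))
    return visits, visitors


visits, visitors = count_visits_and_visitors(logged_ips)
print(f"{visits} visits from {visitors} visitors")
```

Here the first IP number returns after another one has made a request, so it counts as two visits but still only one visitor.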
Student John during his 6 pm class uses Netscape's Navigator
browser (NN4) to request a web page from the server housing toLearn.net. Along
with the text, the page has three embedded images for a total of four separate
files that need to be displayed on John's screen.
The browser looks first in its own cache. It does not find the
page or any of the images, so it requests four separate files from the server.
Student Sue, at the same machine during her 8 pm class, uses NN4 to request the
same web page from toLearn. The browser looks first in its own cache. It
does find the page and the three images. It compares the local
"last-modified" timestamp with that on the server. If the local
timestamp is the same, it displays the page for Sue off the local cache.
On the server's log, this example would count as one visit, one visitor, and
four requests, because the server never had to send Sue any files. It would also
count as one 304 code: 304 (Not Modified) is the status the server returns when
the timestamp comparison shows that the cached copy is still current.
I do not know whether Microsoft's Internet Explorer acts the same way. Do you?
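The John and Sue example can be replayed as a small Python model. The `Server` class, the file names, and the timestamp value are hypothetical, and the model assumes the browser checks only the page's timestamp for Sue, which matches the count given above of four requests and one 304.

```python
class Server:
    """Toy server that logs one (filename, status) pair per request."""

    def __init__(self, files):
        self.files = files  # filename -> last-modified timestamp
        self.log = []

    def get(self, filename, if_modified_since=None):
        # Answer 304 when the browser's copy carries the current timestamp.
        if if_modified_since == self.files[filename]:
            status = 304
        else:
            status = 200
        self.log.append((filename, status))
        return status, self.files[filename]


files = {"page.htm": "t1", "a.gif": "t1", "b.gif": "t1", "c.gif": "t1"}
server = Server(files)
browser_cache = {}

# John, 6 pm: nothing is cached, so all four files are fetched.
for name in files:
    _status, stamp = server.get(name)
    browser_cache[name] = stamp

# Sue, 8 pm: the page is cached, so the browser only checks its timestamp.
server.get("page.htm", if_modified_since=browser_cache["page.htm"])

statuses = [status for _name, status in server.log]
print(statuses.count(200), "files sent;", statuses.count(304), "not-modified reply")
```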