Introduction to HTML

 


HTTP for HTML Authors, Part I

Mapping URLs to Files

The most basic set-up of any Web server usually works like this: the path part of a URL is interpreted as a filename on the computer's hard disk. Usually, the Web server is configured with a “document root” directory, relative to which all URL paths are resolved as filenames. Here is an example: assume you're running a simple Web server on a Windows-based computer called arnie.acme.com, and the document root is D:\WWWFiles\. When a user types the URL http://arnie.acme.com/jokes/blondes.htm into a browser, the browser asks the server for the document /jokes/blondes.htm. The server looks in the directory D:\WWWFiles\jokes for a file called blondes.htm. If the file is there, the server returns a 200 OK response followed by the document; if it's not, the server returns a 404 Not Found response followed by a helpful error message telling the user to look elsewhere for bleached comic relief.
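The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular server is implemented: the function names `url_to_relative_path` and `lookup` are made up for this example, a temporary directory stands in for the document root, and the check against paths that escape the root is a precaution that the text above does not discuss.

```python
import tempfile
from pathlib import Path, PureWindowsPath
from urllib.parse import unquote, urlsplit


def url_to_relative_path(url: str) -> str:
    """Return the path part of a URL without its leading slash."""
    return unquote(urlsplit(url).path).lstrip("/")


def lookup(document_root: Path, url: str) -> tuple[int, str]:
    """Return the status a simple file-based server might send for this URL."""
    root = document_root.resolve()
    target = (root / url_to_relative_path(url)).resolve()
    # Refuse paths such as /../secret that would escape the document root.
    if not target.is_relative_to(root):
        return 404, "Not Found"
    if target.is_file():
        return 200, "OK"
    return 404, "Not Found"


if __name__ == "__main__":
    url = "http://arnie.acme.com/jokes/blondes.htm"
    # Pure path arithmetic, so this line works on any operating system.
    print(PureWindowsPath(r"D:\WWWFiles") / url_to_relative_path(url))

    # A temporary directory stands in for the document root.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "jokes").mkdir()
        (root / "jokes" / "blondes.htm").write_text("<p>placeholder</p>")
        print(lookup(root, url))
        print(lookup(root, "http://arnie.acme.com/jokes/missing.htm"))
```

The first function does nothing more than strip the scheme and host name from the URL; everything after that is ordinary file handling, which is why this kind of set-up is so easy for a server to provide.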

Practically every Web server in existence is set up to behave like this by default, and most hosting services offer only this kind of functionality: you put some files in a directory on the computer running the server (usually with a file transfer tool such as an FTP client), and you can then access them using HTTP.

In this kind of set-up, when the user asks for a path that corresponds to a directory on the hard disk, for instance /jokes/, the server will probably return a list of the files in that directory. Since this is rarely a nice way to organize a Web site, most servers are set up to look in the directory for a special file, called the index file, and return that instead.

Traditionally, index files are named index.html or index.htm. So, if you're running a Web server, and you have a directory called jokes filled with a bunch of politically incorrect documents, and you want people to see an HTML document that links to these documents when they ask for the /jokes/ document, you'd create a file called index.html and put it into this directory. Now, when a browser asks for the document /jokes/, it will get the contents of the file D:\WWWFiles\jokes\index.html.
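As a rough sketch of this behaviour, the Python below returns the index file if one exists and falls back to a plain listing of the directory otherwise. The names `INDEX_NAMES`, `find_index` and `directory_response` are invented for the example, and the list of index file names is an assumption; real servers read it from their configuration.

```python
from pathlib import Path
from typing import Optional

# Assumed index file names, tried in this order.
INDEX_NAMES = ("index.html", "index.htm")


def find_index(directory: Path) -> Optional[Path]:
    """Return the first index file found in the directory, if any."""
    for name in INDEX_NAMES:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def directory_response(directory: Path) -> str:
    """Return what a simple server might send back for a directory URL."""
    index = find_index(directory)
    if index is not None:
        # An index file exists, so its contents are sent instead of a listing.
        return index.read_text()
    # No index file: fall back to a plain list of the file names.
    return "\n".join(sorted(entry.name for entry in directory.iterdir()))
```

Calling `directory_response` on a jokes directory that contains index.html yields the contents of that file; removing the file makes the same call return the list of file names.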

The above is only an example; the exact name of an index file, or the behaviour of a server when it is asked to return a document that corresponds to a directory, varies from server to server and also according to the server's configuration. If you're running your own server, check the documentation that came with the software; if you're renting Web space from a hosting provider, check the documentation for the service.

However, index files serve to illustrate a basic point: although a URL's path often corresponds to an identically named file, it does not have to. In the example above, you effectively asked for the directory called /jokes/, but you got the file /jokes/index.html instead. Servers allow this because being able to “map” URLs to files in a more complicated manner is a lot more flexible than a strict one-to-one correspondence.


 
