Introduction to HTML

HTTP for HTML Authors, Part I

Fetch, boy! What browsers do with URLs

If you've read Tutorial 2, you should be well acquainted with URIs, and especially URLs. Here's a quick dissection of an http URL to refresh your memory:

Anatomy of an http URL

Astute readers will note that the http bit means that this URL uses the http scheme. This governs how the rest of the URL is understood by a Web browser.

The server bit (webreference.com) and the port bit (80) tell the browser where to go looking for the Web site. The server denotes a computer connected to the Internet; the port denotes a sort of “socket” into which the browser plugs to speak with the Web server.
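If you'd like to see the same dissection done by a program, here is a minimal Python sketch using the standard urllib.parse module. The URL itself is an assumption on my part, rebuilt from the pieces named in the text (scheme, server, port and the /html/ path), and dissect and DEFAULT_PORTS are names made up for this illustration.

```python
from urllib.parse import urlsplit

# Assumed example URL, pieced together from the parts named in the text.
EXAMPLE_URL = "http://webreference.com:80/html/"

# When a URL leaves the port out, the scheme implies one.
DEFAULT_PORTS = {"http": 80}


def dissect(url):
    """Split a URL into the bits discussed above (hypothetical helper)."""
    parts = urlsplit(url)
    port = parts.port
    if port is None:
        # No explicit port in the URL: fall back to the scheme's default.
        port = DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,    # how the rest of the URL is understood
        "server": parts.hostname,  # which computer to go looking for
        "port": port,              # which "socket" to plug into
        "path": parts.path,        # what to ask for once connected
    }


if __name__ == "__main__":
    for name, value in dissect(EXAMPLE_URL).items():
        print(f"{name}: {value}")
```

Note that the port bit can be left out of an http URL altogether, in which case the sketch falls back to 80.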

The term “Web server” is often used to describe a computer that serves out Web pages, but it is also used to describe a program that runs on such a computer and does the serving. The second meaning is the one I'll use in this tutorial. The person in charge of the computer starts this program, which then listens on a port for the first user to come a-browsing.

Using the, um, anatomically correct example above, a browser that received this URL (either because the user typed it into the Location field or clicked on a link) would run off to the computer called webreference.com, walk up to port 80 and knock politely. The Web server, which is running inside the computer, would open this port and give the browser a look that strongly suggests “Whaddayawant?”
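The knocking and door-opening can be played out on a single machine. The following is only a sketch, under a few assumptions of mine: it uses Python's standard socket module, it listens on the local address 127.0.0.1 rather than on webreference.com, and it lets the operating system pick a free port instead of 80 (which usually needs administrator rights). The function names are invented for this illustration.

```python
import socket
import threading


def start_listening():
    """Play the Web server: start listening on a local port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 means "any free port will do"
    listener.listen(1)
    listener.settimeout(5)           # don't wait forever for a visitor
    return listener


def answer_the_door(listener, visitors):
    """Open the port for the first visitor and note where they came from."""
    conn, address = listener.accept()
    with conn:
        visitors.append(address)


def knock(port):
    """Play the browser: walk up to the port and connect."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as browser:
        return browser.getsockname()


def demo():
    listener = start_listening()
    port = listener.getsockname()[1]
    visitors = []
    server = threading.Thread(target=answer_the_door, args=(listener, visitors))
    server.start()
    browser_address = knock(port)
    server.join()
    listener.close()
    return port, browser_address, visitors


if __name__ == "__main__":
    port, browser_address, visitors = demo()
    print(f"Listened on port {port}; visitor came from {visitors[0]}")
```

Nothing is said over the connection here; the sketch stops at the moment the server opens the door.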

The browser would then politely say something like the following:

GET /html/ HTTP/1.1
Host: webreference.com

This is HTTP-Speak for “Hello, I'd like to get the document called /html/ on your server. Oh, by the way, I'm fluent in version 1.1 of the HTTP protocol. And in case you're wondering, I came a-knocking for the host webreference.com; I hope this is it.” Most of this stuff is not of much importance to us HTML authors; the important bit is GET /html/. This is the main part of the HTTP request.
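To make the link between the URL and the request concrete, here is a rough Python sketch that turns a URL into the text a browser would send. The function name build_get_request is made up for this illustration, and real browsers add many more header fields. One detail not visible above: on the wire, each line of the request ends with a carriage return and line feed, and an empty line marks the end of the request.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Build the text of a bare-bones HTTP/1.1 GET request (hypothetical helper)."""
    parts = urlsplit(url)

    # The document being asked for; an empty path means the top level.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # The host the browser came a-knocking for. The port is only spelled
    # out when it is not the default one for http.
    host = parts.hostname
    if parts.port not in (None, 80):
        host += f":{parts.port}"

    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "",  # the empty line that ends the request
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Assumed URL, rebuilt from the server, port and path used in the text.
    print(build_get_request("http://webreference.com:80/html/"))
```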

There are a few types of HTTP requests, but GET is by far the most common; it simply says “Give me this document.” We'll take a look at the second most common type of request, POST, later on.

The Host: webreference.com bit is an example of an HTTP header field. Headers are to HTTP requests what meta-information is to HTML: most of them are optional and can simply be omitted, but they can come in very, very handy. (Host is the exception: HTTP/1.1 requires it in every request.) A header is always a name (in this case, Host), followed by a colon (:), followed by the header field's value (webreference.com). We'll talk more about headers later on.
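The name, colon, value pattern is simple enough to take apart in a few lines of Python. This is just a sketch; parse_header_line is a name invented for this illustration, and it splits at the first colon only, since the value may contain colons of its own (a host with a port number, for instance).

```python
def parse_header_line(line):
    """Split one header line into its name and value (hypothetical helper)."""
    name, colon, value = line.partition(":")  # split at the first colon only
    if not colon or not name.strip():
        raise ValueError(f"not a header line: {line!r}")
    # Whitespace around the value is not part of it.
    return name.strip(), value.strip()


if __name__ == "__main__":
    name, value = parse_header_line("Host: webreference.com")
    print(f"name={name!r}, value={value!r}")
```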
