Understanding HTTP- Part 1

Posted by Brad Wood
Jul 19, 2008 07:53:00 UTC
Many of the low-level technicalities of our lives go on right under our noses without us really understanding their inner workings. You drive your car every day, but do you understand how an internal combustion engine works? You keep your milk cold in the fridge, but do you grasp the physics of why Freon absorbs energy when it becomes a gas? As a mere user, deep understanding of the things you use is generally not necessary, but if you build or maintain one of these systems you had better know what goes on under the hood.

The backbone of the World Wide Web is HTTP, a protocol that closely follows the REST architectural style. Simply put, every page on the web is an individual resource with a unique identifier. Ex: www.yourserver.com/index.html Communication happens between client and server in a text-based request/response fashion, in which one of several predefined methods is used to address the resource in question. You can browse the internet and even make web pages, but if you really want to understand what you are doing and troubleshoot problems, you need to know what goes on behind the scenes.

First of all, if you use Firefox, go download the Firebug add-on. When it is turned on, click on the "Net" tab to see your HTTP requests along with their headers. If you are using Internet Explorer, download Microsoft Fiddler. It isn't an add-on, but a separate program that your web traffic is proxied through. It will also show the nitty-gritty of what's going on. There's SO much more than I can hope to cover in one blog post, but I'll skim the basics here.

What are you talking about?

Ok, so where are HTTP requests/responses used?
  • When you type google.com into your browser address bar and hit enter
  • When you click the search button in Google
  • When you click on a link in an E-mail that launches your browser
  • When ColdFusion processes a CFHTTP tag
  • Once for each image your browser downloaded to display this web page
  • When a browser makes an Ajax call
Your browser may make dozens of HTTP requests to the server to finish loading all the images, JavaScript files and CSS files for a single page.

How does this thing work?

HTTP requests and responses are both composed of three main parts:
  • Method and request URI (for a request) or HTTP status code (for a response)
  • Headers
  • Body
HTTP/1.1 defines eight methods, or verbs, for requests: GET, POST, PUT, DELETE, HEAD, TRACE, OPTIONS, and CONNECT. We'll focus on GET and POST. GET tells the server that the client wishes to retrieve the resource specified by a URI. POST indicates the client has data it wishes the server to do something with. A POST has a URI and, optionally, form data in its body.
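To make those three parts concrete, here is a minimal Python sketch that assembles the raw text of a request. The helper name build_request and the host www.example.com are made up for illustration, and a real browser sends many more headers than this.

```python
def build_request(method, path, host, body=""):
    """Assemble the raw text of an HTTP/1.1 request (illustrative only)."""
    # Part 1: the request line holds the method and the request URI.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # Part 2: headers describe what the body holds and how long it is.
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line separates the headers from part 3, the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


get_request = build_request("GET", "/", "www.example.com")
post_request = build_request("POST", "/search.cfm", "www.example.com", "search=http")
print(get_request)
print(post_request)
```

The GET request ends right after the blank line, while the POST carries its form data after it.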

UR-Who?

What are examples of HTTP URIs (Often called URLs)?
  • http://www.google.com
  • http://www.adobe.com/go/wish/
  • http://forums.sun.com/thread.jspa
  • http://www.codersrevolution.com/images/rssbutton.gif
  • http://i.cdn.turner.com/cnn/.element/css/2.0/common.css
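If you want to see how a URI breaks down into the pieces a client actually uses, Python's standard urllib.parse module can split one for you. This is only a sketch using one of the URIs listed above; the variable names are my own.

```python
from urllib.parse import urlsplit

uri = "http://www.codersrevolution.com/images/rssbutton.gif"
parts = urlsplit(uri)

print(parts.scheme)  # the protocol used to reach the resource
print(parts.netloc)  # the host that serves it (sent in the Host header)
print(parts.path)    # the path that appears in the request line
```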

What's my status?

HTTP Status codes tell the client the status of the request. Was it successful? Was the resource found? Did an error occur? Should I look somewhere else? The most common codes are:
  • 200 OK
  • 301/302 Redirect
  • 304 Not Modified (pull from cache)
  • 403 Forbidden
  • 404 Not Found
  • 500 Internal Server Error
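The first digit of a status code tells you its general category. As a rough sketch, Python's standard http.HTTPStatus enum can look up the official reason phrase for each of the codes above; the describe helper and its category labels are my own naming, not part of any standard API.

```python
from http import HTTPStatus


def describe(code):
    """Return the standard reason phrase and a rough category for a status code."""
    categories = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    phrase = HTTPStatus(code).phrase
    # The hundreds digit identifies the class of the response.
    return phrase, categories.get(code // 100, "other")


for code in (200, 301, 302, 304, 403, 404, 500):
    print(code, *describe(code))
```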

Head-where?

Headers contain metadata about your request and response. In ColdFusion you can look at the headers of a request by dumping out the result of the GetHttpRequestData() function. Request headers tell the server things such as:
  • The type/version of the client sending the request:
         User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR
  • The URI that sent you here:
         Referer: http://en.wikipedia.org/wiki/XMLHttpRequest
  • What types of encoding the client is capable of handling
         Accept-Encoding: gzip, deflate
  • What kind of content is being sent to the server in the request body
         Content-Type: application/x-www-form-urlencoded
  • Cookies stored in the browser for this domain:
         Cookie: CFID=123456; CFTOKEN=12345678;
A lot of the information in the CGI scope comes from the request headers sent to the server. Response headers tell the client things such as:
  • Type/Version of server software:
         Server: Apache/2.2.4 (Linux/SUSE)
  • How much data is being returned:
         Content-Length: 2908
  • The content type of the response body:
         Content-Type: image/jpeg
  • Instructions to set cookies
         Set-Cookie: CFID=123456
Here is a complete list of HTTP headers.
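Since headers are just "Name: value" lines of text, they are easy to pick apart yourself. Here is a small Python sketch, assuming a simplified header block with no repeated or folded headers; parse_headers is a hypothetical helper, while SimpleCookie comes from Python's standard library.

```python
from http.cookies import SimpleCookie

raw_headers = (
    "Accept-Encoding: gzip, deflate\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Cookie: CFID=123456; CFTOKEN=12345678\r\n"
)


def parse_headers(text):
    """Split 'Name: value' lines into a dict keyed by lowercased header name."""
    headers = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first colon only, since values may contain colons too.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return headers


headers = parse_headers(raw_headers)
cookies = SimpleCookie(headers["cookie"])
print(headers["content-type"])
print(cookies["CFID"].value, cookies["CFTOKEN"].value)
```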

Body? I'm not hiding any bodies!

The body of an HTTP GET request is usually empty. The server was told everything it needed to know in the request line and headers (the URI and Host). If a form is submitted with a GET method, all the form fields are appended to the URI as its query string. The body of an HTTP POST is where the form fields are specified. Of course, there don't have to be form fields. A POST can also contain XML or any other text as its body.

POSTed forms usually come with a URL-encoded content type. This means that the form fields will be transferred to the server in the same fashion as a query string: a name=value list delimited by ampersands, where special characters are URL-encoded as their %hex equivalents. (An ampersand becomes %26; a space becomes + in form data, or %20 elsewhere in a URI.) When I search Ray Camden's blog for the search phrase "Why is Ray & his code so cool?" a POST request is sent with a URL-encoded content type. Here is the request body that translates to his form.search variable:
[code]search=Why+is+Ray+%26+his+code+so+cool%3F[/code]
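You can reproduce that encoding yourself. The sketch below uses Python's standard urllib.parse module (my choice for illustration, not what the browser runs) to encode the same search phrase and decode it again. Note that in form encoding a space is written as a plus sign.

```python
from urllib.parse import urlencode, parse_qs

form_fields = {"search": "Why is Ray & his code so cool?"}

# Encode the form fields the way a browser does for a URL-encoded POST body.
body = urlencode(form_fields)
print(body)
print(len(body))  # the value a client would send as Content-Length

# The server reverses the process to recover the original field values.
decoded = parse_qs(body)
print(decoded["search"][0])
```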
The body of an HTTP response is where the bulk of the information resides. If an HTML page was requested, the body is the HTML text. If the server sends back an XML document, it is found in the response body. If the browser requests an image, the body is the binary data of the image.

So what does all this look like?

Alright, it's time for some examples. The following are the request and resultant response when I navigate to www.google.com:
[code]GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Host: www.google.com
Connection: Keep-Alive
Cookie: PREF=ID=2d849dbec045a436:TM=1215098883:LM=1215098883:S=Lf1rbm7wZm0kxqco[/code]
[code]HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Date: Sat, 19 Jul 2008 05:24:15 GMT
Expires: -1
Content-Type: text/html; charset=UTF-8
Server: gws
Content-Length: 6817

<html><head>...snip...</body></html>
[/code]
My browser then makes a second request for the Google logo at the top of the page:
[code]GET /intl/en_ALL/images/logo.gif HTTP/1.1
Accept: */*
Referer: http://www.google.com/
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Connection: Keep-Alive
Host: www.google.com
Pragma: no-cache
Cookie: PREF=ID=2d849dbec045a436:TM=1215098883:LM=1215098883:S=Lf1rbm7wZm0kxqco[/code]
[code]HTTP/1.1 200 OK
Content-Type: image/gif
Last-Modified: Wed, 07 Jun 2006 19:38:24 GMT
Expires: Sun, 17 Jan 2038 19:14:07 GMT
Cache-Control: public
Date: Sat, 19 Jul 2008 05:27:11 GMT
Server: gws
Content-Length: 8558

GIF89a ... binary image data here...
 [/code]
Here is the request and response for when I searched Ray's site:
[code]POST /search.cfm HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Referer: http://www.camdenfamily.com/
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Connection: Keep-Alive
Content-Length: 41
Host: www.coldfusionjedi.com
Pragma: no-cache
Cookie: CFID=12345678

search=Why+is+Ray+%26+his+code+so+cool%3F[/code]
[code]HTTP/1.1 200 OK
Connection: close
Date: Sat, 19 Jul 2008 05:30:53 GMT
Server: Microsoft-IIS/6.0
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
... snip ...
</body>
</html>
[/code]
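Responses like the ones above can be split back into their three parts with a few lines of code. This is a simplified Python sketch, assuming a short text response held entirely in memory; the sample response is shortened and invented for the example, and parse_response is a hypothetical helper.

```python
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Server: gws\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<html></html>"
)


def parse_response(text):
    """Split a raw response into status code, reason, headers and body."""
    # The first blank line marks the end of the headers.
    head, _, body = text.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line looks like: HTTP-version, status code, reason phrase.
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), reason, headers, body


status, reason, headers, body = parse_response(raw_response)
print(status, reason)
print(headers["Content-Type"])
print(body)
```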

Conclusion

HTTP requests and responses are the building blocks that make the web possible. They are simple at their core, but have endless combinations and variations. A good understanding of them is important to grasping what you are building and knowing where to begin searching when things go wrong.

 

