On Lesser HTTP Methods
I was checking out my web server logs and noticed lots of HEAD and OPTIONS requests. Everybody knows that there are more HTTP methods than just GET and POST, but I realized that I only had a high-level understanding of what the others are for.
The HTTP/1.1 RFC (RFC 2616) defines eight methods:
- OPTIONS
- GET
- HEAD
- POST
- PUT
- DELETE
- TRACE
- CONNECT
We all know about GET and POST. PUT and DELETE are used to upload and delete resources, and CONNECT is reserved for “a proxy that can dynamically switch to being a tunnel”, e.g. for SSL. So that leaves OPTIONS, HEAD, and TRACE.
HEAD is like GET except that only the headers (the metadata) are returned, with no body content. This could be used to check whether a URL has been updated, but in practice all of the web crawlers I’ve seen in my logs do GETs, most likely because the “Last-Modified” header isn’t required and isn’t guaranteed to be correct even when it exists.
So here’s some Jakarta Commons HttpClient code to test out the HEAD method on a sample URL:
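    import org.apache.commons.httpclient.Header;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.HttpMethod;
    import org.apache.commons.httpclient.methods.HeadMethod;

    HttpClient client = new HttpClient();
    // Give up if no connection is made within 5 seconds
    client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);

    HttpMethod method = new HeadMethod("http://burtbeckwith.com/blog/?p=35");
    client.executeMethod(method);

    for (Header header : method.getResponseHeaders()) {
        System.out.println(header);
    }
    // A HEAD response has no body, so this prints null
    System.out.println("\nResponse:\n" + method.getResponseBodyAsString());
    method.releaseConnection();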
and the output:
    Date: Mon, 18 Sep 2006 04:35:18 GMT
    Server: Apache/2.2.3 (Unix) DAV/2 mod_jk/1.2.18 PHP/5.1.6
    X-Powered-By: PHP/5.1.6
    Pragma: no-cache
    X-Pingback: http://burtbeckwith.com/blog/xmlrpc.php
    Status: 200 OK
    Content-Type: text/html; charset=UTF-8

    Response:
    null
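Note that the HEAD response above has no Last-Modified header at all, which is exactly why a crawler can't rely on HEAD alone. Here is a rough sketch of how a client might decide whether a full GET is still needed after a HEAD. The class `HeadFreshness` and the method `shouldRefetch` are names I made up for illustration; they are not part of HttpClient, and the "when in doubt, refetch" policy is just one reasonable assumption.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Map;

// Hypothetical helper for illustration; not part of Commons HttpClient.
class HeadFreshness {

    /**
     * Decides whether a full GET is needed, given the headers of a HEAD response.
     * Returns true when Last-Modified is missing, unparseable, or newer than lastFetched.
     */
    static boolean shouldRefetch(Map<String, String> headHeaders, Instant lastFetched) {
        String lastModified = null;
        for (Map.Entry<String, String> entry : headHeaders.entrySet()) {
            // Header names are case-insensitive
            if (entry.getKey().equalsIgnoreCase("Last-Modified")) {
                lastModified = entry.getValue();
            }
        }
        if (lastModified == null) {
            return true; // no metadata to go on, so fall back to a GET
        }
        try {
            Instant modified = ZonedDateTime
                    .parse(lastModified.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            return modified.isAfter(lastFetched);
        } catch (DateTimeParseException e) {
            return true; // malformed date: assume the resource may have changed
        }
    }

    public static void main(String[] args) {
        Instant lastFetched = Instant.parse("2006-09-18T00:00:00Z");

        // Like the response above: no Last-Modified header
        System.out.println(shouldRefetch(
                Map.of("Content-Type", "text/html; charset=UTF-8"), lastFetched)); // true

        // Made-up header value, older than our last fetch
        System.out.println(shouldRefetch(
                Map.of("Last-Modified", "Sun, 17 Sep 2006 12:00:00 GMT"), lastFetched)); // false
    }
}
```

Even when the header is present, the server may report it incorrectly, so this only saves bandwidth for servers you trust.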
The code for an OPTIONS request is very similar:
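    import org.apache.commons.httpclient.Header;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.HttpMethod;
    import org.apache.commons.httpclient.methods.OptionsMethod;

    HttpClient client = new HttpClient();
    // Give up if no connection is made within 5 seconds
    client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);

    HttpMethod method = new OptionsMethod("http://burtbeckwith.com/blog/?p=35");
    client.executeMethod(method);

    for (Header header : method.getResponseHeaders()) {
        System.out.println(header);
    }
    method.releaseConnection();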
and the output:
    Date: Mon, 18 Sep 2006 04:37:48 GMT
    Server: Apache/2.2.3 (Unix) DAV/2 mod_jk/1.2.18 PHP/5.1.6
    X-Powered-By: PHP/5.1.6
    Pragma: no-cache
    X-Pingback: http://burtbeckwith.com/blog/xmlrpc.php
    Status: 200 OK
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=UTF-8
In this case, the response does have a body: the content of the URL, the same response body as with a GET request. Also note that there’s one more header than in the HEAD response, Transfer-Encoding.
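One thing the response above does not include is an Allow header, which RFC 2616 gives as an example of what an OPTIONS response should carry to advertise the methods a resource supports. My server apparently doesn't send one for this URL, so the sketch below assumes a server that does. The class name `AllowHeader` and the sample header value are invented for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; the sample Allow value below is made up.
class AllowHeader {

    /** Splits an Allow header value such as "GET, HEAD, OPTIONS" into method names. */
    static Set<String> parse(String allowValue) {
        Set<String> methods = new LinkedHashSet<>();
        if (allowValue == null) {
            return methods; // header absent: nothing advertised
        }
        for (String token : allowValue.split(",")) {
            String method = token.trim();
            if (!method.isEmpty()) {
                methods.add(method); // method names are case-sensitive, so keep them as sent
            }
        }
        return methods;
    }

    public static void main(String[] args) {
        Set<String> allowed = parse("GET, HEAD, POST, OPTIONS, TRACE");
        System.out.println(allowed);                    // [GET, HEAD, POST, OPTIONS, TRACE]
        System.out.println(allowed.contains("DELETE")); // false
    }
}
```

With HttpClient, the value to feed in would come from `method.getResponseHeader("Allow")`, which returns null when the header is missing.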
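TRACE is used to loop back your request, e.g. for testing:
    import org.apache.commons.httpclient.Header;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.HttpMethod;
    import org.apache.commons.httpclient.methods.TraceMethod;

    HttpClient client = new HttpClient();
    // Give up if no connection is made within 5 seconds
    client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);

    HttpMethod method = new TraceMethod("http://burtbeckwith.com/blog/?p=35");
    // A made-up header, to check that the server echoes it back
    method.setRequestHeader("foo", "bar");
    client.executeMethod(method);

    for (Header header : method.getResponseHeaders()) {
        System.out.println(header);
    }
    System.out.println("\n" + method.getResponseBodyAsString());
    method.releaseConnection();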
Here there are fewer headers:
    Date: Mon, 18 Sep 2006 04:39:17 GMT
    Server: Apache/2.2.3 (Unix) DAV/2 mod_jk/1.2.18 PHP/5.1.6
    Transfer-Encoding: chunked
    Content-Type: message/http
and the response body echoes back our request. Note that the fake “foo” header is there:
    TRACE /blog/?p=35 HTTP/1.1
    foo: bar
    User-Agent: Jakarta Commons-HttpClient/3.0
    Host: burtbeckwith.com
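If you wanted to check the echo programmatically rather than by eye, something like the following would do. It's only a sketch: `TraceEcho` and `parseHeaders` are names I invented, and it assumes the body is a plain request line followed by simple `name: value` header lines, as in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of Commons HttpClient.
class TraceEcho {

    /**
     * Parses the header lines of an echoed request (a message/http body),
     * skipping the request line. Header names are lower-cased for lookup.
     */
    static Map<String, String> parseHeaders(String echoedRequest) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = echoedRequest.split("\r?\n");
        for (int i = 1; i < lines.length; i++) { // line 0 is the request line
            String line = lines[i];
            if (line.isEmpty()) {
                break; // a blank line ends the header section
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, line.substring(colon + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        // The echoed request from the TRACE output above
        String body = "TRACE /blog/?p=35 HTTP/1.1\r\n"
                + "foo: bar\r\n"
                + "User-Agent: Jakarta Commons-HttpClient/3.0\r\n"
                + "Host: burtbeckwith.com\r\n"
                + "\r\n";
        Map<String, String> headers = parseHeaders(body);
        System.out.println(headers.get("foo"));          // bar
        System.out.println(headers.containsKey("host")); // true
    }
}
```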
When I was looking for information I stumbled across the original specification for HTTP (v0.9) written in 1991 by Tim Berners-Lee. Cool stuff.