Archive for the Category 'server logs'

Googling for Resumes

Sunday, October 01st, 2006

A few months ago I started getting fairly regular contacts from recruiters. This is weird, because I don’t have an active resume on any job boards.

I was getting about an email a day, but they were for positions in California, Manhattan, Atlanta, etc., so I ignored them. The first one was for a financial services position in Manhattan, and since I used to work with someone who has since taken a position down there, I thought that one might have come from his recommendation. The rest were out of the blue.

Eventually I took a look at my Apache access logs and realized what was going on – Google had picked up the online resume that I keep at http://burtbeckwith.com/resume/. Ordinarily this isn’t linked to from anywhere – my home page is intentionally blank – but I give the URL to recruiters when I’m job searching in response to the standard “I got your resume from Monster but the format is jumbled, can I get a Word version of your resume?” question. When I started this blog, I put a link to the resume on the About page, and the blog got crawled by Google so the resume ended up being searchable. Mystery solved.

The bulk of the hits are from Google searches, although a few are from Lycos. There are several searches in the logs that were clearly not resume searches, such as for these search terms:

  • custom tag libraries for database connection
  • data consistency between mysql web page data jboss server
  • accessing oracle database through tunnel jdbc
  • Spring framework export data to Excel
  • broken jdbc thin oracle firewall
  • synchronize excel server client workbooks
  • jmeter sur le serveur Mysql
  • electropherogram interpretation genemapper
  • oracle Genemapper
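If you want to pull search terms like these out of your own logs, something along these lines should work. This is only a minimal sketch: it assumes the search engine puts the terms in a `q` query parameter of the referrer URL (other engines may use a different parameter name), and the class name `ReferrerQuery`, the method `extractQuery`, and the sample URL are made up for illustration.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class ReferrerQuery {

    /** Returns the decoded value of the "q" parameter of a referrer URL, if there is one. */
    public static Optional<String> extractQuery(String referrer) {
        try {
            String rawQuery = URI.create(referrer).getRawQuery();
            if (rawQuery == null) {
                return Optional.empty(); // no query string at all
            }
            for (String pair : rawQuery.split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0 && pair.substring(0, eq).equals("q")) {
                    // URLDecoder turns '+' into a space and resolves %xx escapes
                    String value = pair.substring(eq + 1);
                    return Optional.of(URLDecoder.decode(value, StandardCharsets.UTF_8));
                }
            }
            return Optional.empty();
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // malformed URL or escape sequence, skip it
        }
    }

    public static void main(String[] args) {
        // hypothetical referrer, not copied from a real log
        String referrer = "http://www.google.com/search?hl=en&q=oracle+Genemapper";
        System.out.println(extractQuery(referrer).orElse("(no query)"));
    }
}
```

Returning an `Optional` keeps direct hits and malformed referrers from blowing up a loop over the whole log.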

Here are some queries that matched only because of simple keyword hits, but are pretty far off the mark:

  • (oracle performance (dba OR or OR data OR base OR administrator))
  • (((MA OR MD) OR (“account manager”) OR (“business to business” OR b2b) OR (“outside sales”) OR (insurance)))
  • ((c OR C++ OR embedded OR wince OR firmware) (os OR “boot loader” OR internals OR api OR multimedia))
  • ((logistics OR SCM OR jde OR “JD Edwards”) (ERP OR CRM OR enterprise OR relationship))
  • “mass spectrometry” scientist MA
  • “power plant” OR construction OR utility OR boilers

But the most interesting queries use Google’s advanced operators, specifically looking for resume, cv, vitae, etc. in the URL or title of the page:

  • (intitle:resume OR inurl:resume OR intitle:cv OR intitle:vitae OR inurl:vitae) and “linux” and “apache” and “java” and “mysql” and “new york”
  • (intitle:resume OR inurl:resume) (“java developer”) objective education (experience OR history)
  • (intitle:resume OR inurl:resume) (Java AJAX Hibernate (sybase OR oracle) (spring OR JMS OR struts))

One recruiter who called told me “there’s nobody good at Monster.com”. Of course he was exaggerating, but his point is valid – there’s a low signal-to-noise ratio on the job boards. His theory is that the good developers tend to stay where they are because they’re treated well, and the lesser developers churn a lot more. So he was basically admitting that he was poaching good developers who aren’t necessarily looking for a change, but who might be convinced given the right opportunity.

On Lesser HTTP Methods

Monday, September 18th, 2006

I was checking out my web server logs and noticed lots of HEAD and OPTIONS requests. Everybody knows that there are more HTTP methods than just GET and POST, but I realized that I only had a high-level understanding of what the others are for.
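A quick way to see how the methods break down is to tally them straight from the log. The following is a rough sketch that assumes Apache's common or combined log format, where the request line is a quoted field such as "HEAD /blog/ HTTP/1.1". The class name `MethodTally` and the sample log lines are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MethodTally {

    // Quoted request line: method, path, protocol version
    private static final Pattern REQUEST_LINE =
            Pattern.compile("\"([A-Z]+) \\S+ HTTP/[0-9.]+\"");

    /** Counts how often each HTTP method appears in the given access log lines. */
    public static Map<String, Integer> countMethods(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by method name
        for (String line : logLines) {
            Matcher m = REQUEST_LINE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // made-up sample lines using documentation-only IP addresses
        List<String> sample = List.of(
                "192.0.2.1 - - [18/Sep/2006:04:35:18 +0000] \"HEAD /blog/?p=35 HTTP/1.1\" 200 -",
                "192.0.2.2 - - [18/Sep/2006:04:36:02 +0000] \"GET /blog/ HTTP/1.1\" 200 5120",
                "192.0.2.1 - - [18/Sep/2006:04:37:48 +0000] \"OPTIONS /blog/?p=35 HTTP/1.1\" 200 5120",
                "192.0.2.3 - - [18/Sep/2006:04:38:10 +0000] \"HEAD /resume/ HTTP/1.1\" 200 -");
        System.out.println(countMethods(sample));
    }
}
```

Lines that don't contain a recognizable request line are silently skipped, which seems like the safer default for a messy log.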

The HTTP/1.1 RFC (RFC 2616) defines eight methods:

  • OPTIONS
  • GET
  • HEAD
  • POST
  • PUT
  • DELETE
  • TRACE
  • CONNECT

We all know about GET and POST; PUT and DELETE are used to upload and delete resources, and CONNECT is used for “a proxy that can dynamically switch to being a tunnel”, e.g. for SSL. So that leaves OPTIONS, HEAD, and TRACE.

HEAD is like GET except that only headers and metadata are returned, no body content. This could be used to check for updated URLs, but in practice all of the web crawlers I’ve seen in my logs do GETs, most likely since the “Last-Modified” header isn’t required and isn’t guaranteed to be correct even if it exists.
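To make the Last-Modified problem concrete, here is a small sketch of the decision a crawler would face after a HEAD request. It assumes the crawler already holds the raw Last-Modified value (or null when the server didn't send one) and the time of its previous crawl. `HeadFreshness` and `shouldRefetch` are hypothetical names, and falling back to a full fetch whenever the header is missing or unparseable is my assumption about a sensible default, not something any particular crawler is known to do.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class HeadFreshness {

    /**
     * Decides whether a page should be fetched again.
     *
     * @param lastModified raw Last-Modified header from a HEAD response, or null if absent
     * @param lastCrawl    when the page was last fetched
     */
    public static boolean shouldRefetch(String lastModified, Instant lastCrawl) {
        if (lastModified == null) {
            return true; // the header is optional; without it we can't tell, so fetch
        }
        try {
            Instant modified = ZonedDateTime
                    .parse(lastModified, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            return modified.isAfter(lastCrawl);
        } catch (DateTimeParseException e) {
            return true; // unparseable date, treat it as unknown
        }
    }

    public static void main(String[] args) {
        Instant lastCrawl = Instant.parse("2006-09-17T00:00:00Z");
        System.out.println(shouldRefetch(null, lastCrawl));
        System.out.println(shouldRefetch("Mon, 18 Sep 2006 04:35:18 GMT", lastCrawl));
        System.out.println(shouldRefetch("Fri, 15 Sep 2006 12:00:00 GMT", lastCrawl));
    }
}
```

With the header missing, the HEAD request buys nothing and a GET has to follow anyway, which may be why crawlers skip straight to the GET.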

So here’s some Jakarta Commons HttpClient code to test out the HEAD method on a sample URL:

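import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.HeadMethod;

HttpClient client = new HttpClient();
// give up if a connection can't be established within 5 seconds
client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);
HttpMethod method = new HeadMethod("http://burtbeckwith.com/blog/?p=35");
client.executeMethod(method);
for (Header header : method.getResponseHeaders()) {
  System.out.println(header);
}
// a HEAD response carries no body, so this prints "null"
System.out.println("\nResponse:\n" + method.getResponseBodyAsString());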

and the output:

Date: Mon, 18 Sep 2006 04:35:18 GMT
Server: Apache/2.2.3 (Unix) DAV/2 mod_jk/1.2.18 PHP/5.1.6
X-Powered-By: PHP/5.1.6
Pragma: no-cache
X-Pingback: http://burtbeckwith.com/blog/xmlrpc.php
Status: 200 OK
Content-Type: text/html; charset=UTF-8

Response:
null

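OPTIONS asks the server which communication options, such as the supported methods, are available for a URL. The code for an OPTIONS request is very similar: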

HttpClient client = new HttpClient();
client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);
HttpMethod method = new OptionsMethod("http://burtbeckwith.com/blog/?p=35");
client.executeMethod(method);
for (Header header : method.getResponseHeaders()) {
  System.out.println(header);
}

and the output:

Date: Mon, 18 Sep 2006 04:37:48 GMT
Server: Apache/2.2.3 (Unix) DAV/2 mod_jk/1.2.18 PHP/5.1.6
X-Powered-By: PHP/5.1.6
Pragma: no-cache
X-Pingback: http://burtbeckwith.com/blog/xmlrpc.php
Status: 200 OK
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

In this case the response does have a body: the content of the URL, the same body that a GET request returns. Also note that there’s one more header than in the HEAD response, Transfer-Encoding.

TRACE is used to loop back your request, e.g. for testing:

HttpClient client = new HttpClient();
client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);
HttpMethod method = new TraceMethod("http://burtbeckwith.com/blog/?p=35");
method.setRequestHeader("foo", "bar");
client.executeMethod(method);
for (Header header : method.getResponseHeaders()) {
  System.out.println(header);
}
System.out.println("\n" + method.getResponseBodyAsString());

Here there are fewer headers:

Date: Mon, 18 Sep 2006 04:39:17 GMT
Server: Apache/2.2.3 (Unix) DAV/2 mod_jk/1.2.18 PHP/5.1.6
Transfer-Encoding: chunked
Content-Type: message/http

and the response body echoes back our request. Note that the fake “foo” header is there:

TRACE /blog/?p=35 HTTP/1.1
foo: bar
User-Agent: Jakarta Commons-HttpClient/3.0
Host: burtbeckwith.com

When I was looking for information I stumbled across the original specification for HTTP (v0.9) written in 1991 by Tim Berners-Lee. Cool stuff.

This work is licensed under a Creative Commons Attribution 3.0 License.