August 25th, 2004


Web Log Analysis

I was flipping through my raw logs last night because of a big spike in my traffic (about 300 pages above the monthly average). I didn't find much amiss, but I did find a few useless spiders to block (larbin and Exabot).

While I was there, I checked how people found some of my less-common files, like my resume. One of the queries was something like "paccar database use" from a DSL line in Rockford, IL.

This got me thinking about how an ordinary webmaster would ever find these things without hand-combing the logs. So far I only get around 20,000 hits a month, which doesn't give me a big variety of search terms (under 1,000), but larger sites will have a much wider array, making a hand search more difficult.
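To make the idea concrete, here is a rough sketch of pulling search phrases out of the referrer field instead of hand-combing. It is only an illustration in Python, not the tool itself: it assumes the logs are in Apache's combined format, and the names `LOG_LINE`, `QUERY_PARAMS`, `search_terms`, and `count_search_terms` are made up for this example. The list of query parameters is a guess at what the common search engines use.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache "combined" log format, where the referrer and
# user agent are the last two quoted fields on each line.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: query-string parameters that search engines use for the
# search phrase. Adjust for the engines that actually show up in your logs.
QUERY_PARAMS = ("q", "p", "query")


def search_terms(referrer):
    """Return the lowercased search phrase in a referrer URL, or None."""
    query = parse_qs(urlsplit(referrer).query)
    for name in QUERY_PARAMS:
        if name in query:
            return query[name][0].strip().lower()
    return None


def count_search_terms(lines):
    """Count search phrases over an iterable of raw log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        phrase = search_terms(match.group("referrer"))
        if phrase:
            counts[phrase] += 1
    return counts
```

Feeding it an open log file, for example `count_search_terms(open("access.log"))`, gives a `Counter` whose `most_common()` method lists the phrases by frequency. The file name here is a placeholder.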

Another thing that's hard to see with the traditional log analyzer is how popular a new page is. If I add a new page, unless it's *really* popular and ranks well in the search engines, it won't catch up in hits to the rest of the site until a new month starts.

I thought it would be really neat if you could see something like the Google Zeitgeist for your site on a daily/weekly/quad-weekly basis. It would show things like the top pages and search terms by increasing and declining popularity, search terms that hadn't been seen for the past week, etc. (Of course, if your site was big enough you could do it hourly too.)
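A minimal sketch of the comparison behind such a report, again in Python purely for illustration. It assumes you already have per-period counts (for pages or search terms) as dictionaries, say one for last week and one for this week. The function name `zeitgeist` and its `top` parameter are hypothetical, and "new" here means a term present in the current period but absent from the previous one.

```python
from collections import Counter


def zeitgeist(previous, current, top=5):
    """Compare two periods of counts and report what moved.

    previous, current: mappings of term (or page) -> hit count.
    Returns a dict with the biggest gainers, the biggest losers, and
    the terms that appear now but did not appear in the previous period.
    """
    # Change in hits for every term seen in either period.
    changes = {
        term: current.get(term, 0) - previous.get(term, 0)
        for term in set(previous) | set(current)
    }
    # Largest increase first; ties broken alphabetically.
    rising = sorted(
        (t for t, d in changes.items() if d > 0),
        key=lambda t: (-changes[t], t),
    )[:top]
    # Largest decrease first; ties broken alphabetically.
    declining = sorted(
        (t for t, d in changes.items() if d < 0),
        key=lambda t: (changes[t], t),
    )[:top]
    new = sorted(
        t for t in current
        if current[t] > 0 and previous.get(t, 0) == 0
    )
    return {"rising": rising, "declining": declining, "new": new}


last_week = Counter({"ruby tutorial": 5, "log analysis": 3, "resume": 1})
this_week = Counter({"ruby tutorial": 2, "log analysis": 6, "paccar database use": 4})
report = zeitgeist(last_week, this_week)
```

The same function works for daily or hourly buckets; only the way the counts are grouped changes. Ranking by absolute change favors already-popular terms, so a real report might rank by percentage change instead.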

I looked around the web but didn't find anything like this. It seems that most log analysis packages are nothing more than a linear progression of features from analog and The Webalizer.

Fortunately I have the skills to write such a thing (not that it will be terribly difficult) and have begun to do so.

PS: I don't have a good name for the tool. Maybe I should name it after some Ruby mine.
  • Current Music
    Dish Washer

Rockets Ruin Weather!

You can blame NASA for the peculiar weather

Reminds me of this 1920 editorial from the NYT:

A Severe Strain on the Credulity

As a method of sending a missile to the higher, and even to the highest parts of the earth's atmospheric envelope, Professor Goddard's rocket is a practicable and therefore promising device. It is when one considers the multiple-charge rocket as a traveler to the moon that one begins to doubt ... for after the rocket quits our air and really starts on its journey, its flight would be neither accelerated nor maintained by the explosion of the charges it then might have left. Professor Goddard, with his "chair" in Clark College and countenancing of the Smithsonian Institution, does not know the relation of action to re-action, and of the need to have something better than a vacuum against which to react ... Of course he only seems to lack the knowledge ladled out daily in high schools.