Eric Hodel (drbrain) wrote,

Referrer Spam

I caught somebody spamming my Referrer logs from China, and doing a rather poor job of it. Their robot requested lowercased versions of case-sensitive URLs, and used a UA that doesn't exist. Strangely enough, they were spamming for A Virtual Tour of Saint Thomas Church.

I was kind enough to send them an email explaining what was going on. I really doubt a church wants people promoting their website in this fashion.

--- begin email text ---

I caught whoever is promoting or advertising your site spamming my website, trying to fill in my Referrer logs. I've attached a full log of these indiscretions, but I'll give you just a summary of them, and an explanation of why a real web browser user couldn't have made these requests. (Note that I doubt anybody in your church did this; read on for more information.)

First off, when a web browser user goes from page to page following links, their browser sends the web server the address of the last page they were on. If I were to follow a link from http://example.com/index.html to http://example.com/widgets/index.html, example.com's web server will be told that I came from http://example.com/index.html.

The people who create web sites use these Referer logs (yes, somebody misspelled "referrer" in the HTTP header name, and it is now too late to fix it) to determine where their traffic comes from. Typically, they get turned into a nice web page like this one: http://segment7.net/usage/ so that the person who runs the website can get a nice overview of how their site is being used.

Occasionally, these usage pages are left open to search engine spiders (mine is not). When they are, spamming the referrer logs with references to another site gives that site an artificial boost in the search engine rankings.

Here is what I found in my referrer logs:

host: 211.152.14.98 got page: http://segment7.net/projects/ruby/drb/drbssl/ from page: http://saintthomaschurch.org/tour00.html using the web browser: "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"

This means that a web browser user from China requested a page that doesn't exist on my site, claimed to have come from your tour page, and used a web browser that doesn't exist.

Let me break this down a little better.

I have a page on my site at http://segment7.net/projects/ruby/drb/DRbSSL/, but not at /projects/ruby/drb/drbssl/ (the capitalization of DRbSSL is important).

They came from the start of the virtual tour of your church. But there is no link from that page to anywhere in my site, so this is obviously faked.

The web browser they are claiming to use doesn't actually exist. The real Internet Explorer 5 identifies itself with matching parentheses; theirs is missing the closing one.

And finally, the requests all come from a section of the internet located in China. While it is entirely possible (and likely) that there are people in China interested both in your church and in DRb over SSL, the lack of a link from your page to mine indicates that this is a faked request.

I would talk with the company promoting or publishing your web site, and ask them about these logs.

--- end email text ---
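
The two mechanical checks from the email (wrong capitalization and unbalanced parentheses) can be sketched in a few lines of Python. This is only an illustration under assumptions: KNOWN_PATHS is a made-up stand-in for a list of pages that really exist on the site, and the function names are hypothetical, not part of any log tool.

```python
# Hypothetical set of paths that really exist on the site (illustrative only).
KNOWN_PATHS = {
    "/projects/ruby/drb/DRbSSL/",
}


def parens_balanced(text):
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def wrong_case_match(path, known_paths=KNOWN_PATHS):
    """Return the real path if `path` differs from it only by letter case."""
    if path in known_paths:
        return None
    for real in known_paths:
        if real.lower() == path.lower():
            return real
    return None


def suspicious_reasons(path, user_agent):
    """Collect the reasons a request looks machine-generated."""
    reasons = []
    real = wrong_case_match(path)
    if real is not None:
        reasons.append("path differs only in case from " + real)
    if not parens_balanced(user_agent):
        reasons.append("user agent has unbalanced parentheses")
    return reasons


print(suspicious_reasons(
    "/projects/ruby/drb/drbssl/",
    "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98",
))
```

Neither check proves anything on its own, since a person can mistype a URL, but the two together on every request from one host are hard to explain any other way.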

By the way, here are three entries from the logs (45 entries total) that are incorrectly capitalized. (And I should have mentioned that they all happened within a few minutes.)

211.152.14.98 - - [28/Apr/2004:02:36:17 -0700] "GET /projects/ruby/drb/drbssl/ HTTP/1.1" 404 1015 "http://saintthomaschurch.org/tour00.html" "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"

211.152.14.98 - - [28/Apr/2004:02:36:31 -0700] "GET /projects/freebsd/kerberos.html HTTP/1.1" 404 1015 "http://saintthomaschurch.org/tour00.html" "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"

211.152.14.98 - - [28/Apr/2004:02:36:58 -0700] "GET /sitemap.html HTTP/1.1" 404 1015 "http://saintthomaschurch.org/tour00.html" "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"
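
Lines like these can be pulled apart with a regular expression to count how often one host sends 404 requests carrying the same referrer. The sketch below assumes the entries follow the Apache-style "combined" layout that the three lines above appear to use; the names LOG_RE, parse_line and count_404_referers are made up for this example.

```python
import re
from collections import Counter

# Assumed layout: host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def count_404_referers(lines):
    """Count 404 responses per (host, referer) pair, skipping empty referers."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"] == "404" and entry["referer"] not in ("", "-"):
            counts[(entry["host"], entry["referer"])] += 1
    return counts


# The three entries quoted above.
SAMPLE = [
    '211.152.14.98 - - [28/Apr/2004:02:36:17 -0700] "GET /projects/ruby/drb/drbssl/ HTTP/1.1" 404 1015 "http://saintthomaschurch.org/tour00.html" "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"',
    '211.152.14.98 - - [28/Apr/2004:02:36:31 -0700] "GET /projects/freebsd/kerberos.html HTTP/1.1" 404 1015 "http://saintthomaschurch.org/tour00.html" "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"',
    '211.152.14.98 - - [28/Apr/2004:02:36:58 -0700] "GET /sitemap.html HTTP/1.1" 404 1015 "http://saintthomaschurch.org/tour00.html" "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98"',
]

for (host, referer), n in count_404_referers(SAMPLE).items():
    print(host, referer, n)
```

A pile of 404s from one address, all naming the same outside page as the referrer, is the pattern to look for.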
