Slurp's Gone Freakin' NUTS


SEFL
08-03-2009, 12:45/12:45PM
I'm not sure if anyone else has noticed it, but across almost all of the sites I work on I've seen a number of strange requests from 74.6.18.224, which by every check I can reasonably make is Slurp.
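
For anyone wanting to repeat that kind of check, one common approach is a reverse DNS lookup on the IP followed by a forward lookup on the returned hostname to confirm it points back to the same address. A minimal Python sketch, assuming Slurp hosts reverse-resolve to names ending in crawl.yahoo.net (that suffix and the function name is_probably_slurp are assumptions for illustration, so verify the suffix against Yahoo's own documentation):

```python
import socket

# Assumed hostname suffix for Slurp crawlers; confirm against Yahoo's documentation.
SLURP_SUFFIX = ".crawl.yahoo.net"


def is_probably_slurp(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the hostname suffix, then forward-confirm it.

    The resolver functions are parameters so the check can be exercised
    without touching the network.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:
        # No PTR record or lookup failure: cannot confirm anything.
        return False
    if not hostname.lower().endswith(SLURP_SUFFIX):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The forward lookup must list the original IP, otherwise the PTR could be spoofed.
    return ip in addresses
```

Calling is_probably_slurp("74.6.18.224") with the default resolvers does live DNS lookups, so the answer depends on what the records say at the time.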

First of all, I've seen a lot of requests (over 1000 by my estimate in the last 24 hours alone) along the lines of (site root)/processinquiry or (site root)/processcomments. processinquiry and processcomments are actually two values that I assign to querystring fields, so they do not represent, and have never represented, pages on any website that I work on.

Second, I've noticed an inordinate number of requests for nonexistent pages (all returning 404) that really don't make much sense. Here are some of them:

SlurpConfirm404/ncosgray/r89c01wb.htm
SlurpConfirm404/hmtanaka/fallchur/irc.htm
SlurpConfirm404/fromTheDean.htm
SlurpConfirm404/korfinfo/Show.htm

What is all this supposed to accomplish, exactly? (Besides adding unnecessary strain to servers.)
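
To put numbers on requests like these, a rough tally from the access log helps. A minimal Python sketch, assuming the server writes Apache common or combined log format; the regular expression, the function name tally_requests, and the sample lines are all made up for illustration:

```python
import re
from collections import Counter

# Start of an Apache common/combined log line:
# client IP, two ignored fields, [timestamp], "METHOD path PROTOCOL", status code.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)[^"]*" (\d{3})')


def tally_requests(lines, ip):
    """Count requests from ip, keyed by (first path segment, status code)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group(1) != ip:
            continue
        path, status = match.group(2), match.group(3)
        # Keep only the first path segment and drop any query string.
        first_segment = path.lstrip("/").split("/", 1)[0].split("?", 1)[0]
        counts[(first_segment, status)] += 1
    return counts


# Hypothetical sample lines, not taken from a real log.
sample = [
    '74.6.18.224 - - [08/Mar/2009:12:00:01 -0500] "GET /SlurpConfirm404/fromTheDean.htm HTTP/1.0" 404 512',
    '74.6.18.224 - - [08/Mar/2009:12:00:29 -0500] "GET /SlurpConfirm404/korfinfo/Show.htm HTTP/1.0" 404 512',
    '74.6.18.224 - - [08/Mar/2009:12:00:55 -0500] "GET /processinquiry HTTP/1.0" 404 512',
    '10.0.0.5 - - [08/Mar/2009:12:01:02 -0500] "GET /index.htm HTTP/1.1" 200 2048',
]
slurp_counts = tally_requests(sample, "74.6.18.224")
```

Grouping by first path segment lumps every SlurpConfirm404/... request into one bucket, which makes it easy to see how much of the traffic from that IP is this kind of probe.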

BobBobby
08-03-2009, 22:31/10:31PM
SE bots work in strange ways. Best to let the big 3 do their own stuff and not worry.

g1smd
09-03-2009, 16:48/04:48PM
I never let them 'do their own stuff', especially when they are requesting stuff they should not, or are almost causing a Denial of Service on the server.

SEFL
09-03-2009, 17:41/05:41PM
Neither do I, particularly when we're talking about dead page requests at intervals of less than 30 seconds on average for things that make no sense at all. That's a lot of bandwidth that doesn't need to be wasted.

I could understand one or two such requests to determine how a 404 is handled, but after the 1000th or 2000th or 5000th request, they should have an idea of what the 404 page looks like...especially since the page returns a "404 Not Found" status.

Or am I the only one seeing this?

BobBobby
09-03-2009, 21:50/09:50PM
If it worries you both that much, block them. In 15 years of being online I've found that the big 3 eventually sort things out OK. While things don't always make sense to laymen, the big 3 have their reasons.

Bandwidth is cheap as chips and you will most likely end up on a witch hunt. You can set your crawl rate for Googlebot; not sure about Slurp.
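
On the Slurp side, Yahoo has documented support for a Crawl-delay line in robots.txt, and blocking specific paths works the usual way with Disallow. A minimal sketch, assuming that support still holds; the 30-second delay and the rule set are hypothetical values, and Python's standard robots.txt parser is used only to show how the rules would be read:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: slow Slurp down and keep it off the two non-page paths.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
Disallow: /processinquiry
Disallow: /processcomments
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# What a parser that honours these rules would see for the Slurp user agent.
slurp_delay = parser.crawl_delay("Slurp")
inquiry_allowed = parser.can_fetch("Slurp", "/processinquiry")
home_allowed = parser.can_fetch("Slurp", "/index.htm")
```

This only shows how the file parses. Whether a given crawler actually respects Crawl-delay is up to the crawler.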