A few weeks ago, I wrote up a series of steps for Fighting Bots Via Their Bad Requests. After watching my logs since then, I’ve noticed I made an incredibly stupid mistake. Bad bots do not follow 301 redirects! What does that mean?
If a regular browser asks for a web page, image, or document and the web server answers that the item has moved (via a 301 response code), the browser then asks for the item at its new location. That is the redirect. But it is up to the browser to make that second request! Bad bots don’t care whether the item has moved; they’re just looking for vulnerabilities to exploit, so they never make it.
So I have removed the R=301 flag from my rewrite rules. Without it, the server rewrites the request internally instead of sending a redirect back to the client. Now if a bad bot asks for something like http://www.planetmike.com/2007/02//include/scripts/export_batch.inc.php?DIR= it is not redirected to my robot-trap; it is served the robot-trap directly! Poof! Instantly it is blocked. I was making things more complicated than they needed to be.
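
To make the difference concrete, here is a toy Python model of the two setups. It is only a sketch, not my actual server configuration: the trap path, the bad-request pattern, and all of the class and function names are made up for illustration, and the "server" is just an object that remembers which clients have reached the trap.

```python
# Toy model only: TRAP_PATH, BAD_PATTERN, and every name below are
# hypothetical stand-ins, not taken from the real site configuration.
TRAP_PATH = "/robot-trap"
BAD_PATTERN = "export_batch.inc.php"


class Server:
    def __init__(self, use_external_redirect):
        # True models a rule with R=301; False models the same rule without it.
        self.use_external_redirect = use_external_redirect
        self.blocked = set()  # clients that have reached the trap

    def handle(self, client, path):
        """Handle one request and return (status, location)."""
        if client in self.blocked:
            return 403, None
        if path == TRAP_PATH:
            # Reaching the trap is what gets a client blocked.
            self.blocked.add(client)
            return 200, None
        if BAD_PATTERN in path:
            if self.use_external_redirect:
                # Only tell the client where to go; it must ask again itself.
                return 301, TRAP_PATH
            # Internal rewrite: serve the trap as part of this same request.
            return self.handle(client, TRAP_PATH)
        return 200, None


def browser(server, client, path):
    """A well-behaved client: follows a 301 with a second request."""
    status, location = server.handle(client, path)
    if status == 301:
        status, _ = server.handle(client, location)
    return status


def bad_bot(server, client, path):
    """A bad bot: makes one request and ignores any redirect."""
    status, _ = server.handle(client, path)
    return status
```

In this model, calling `bad_bot` against `Server(use_external_redirect=True)` leaves the bot out of `blocked`, because it never makes the second request. Against `Server(use_external_redirect=False)` the same single request lands the bot in `blocked` immediately.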