Just wanted to put a quick post out there. I host about 50 Trac environments on an EC2 server; several of the larger ones are public and fairly high traffic. Yesterday the Trac server started having issues, and most of the logged errors had a seemingly random Python traceback, followed by:
IOError: failed to write data
I dug around online for a bit but did not find much help for this issue. Then I noticed that nearly all the errors came from clients whose addresses started with 65.55.X.X. Flipping through other logs, it appeared to be an MSN bot agent hitting Trac pages, and in some cases even getting stuck in infinite recursion (somehow following links to /report2/report2/report2/report2/……). The quick and dirty fix was to block the entire subnet from accessing my vhosts, which I did with this little gem inside each vhost's <Directory> block in Apache:
Order Allow,Deny
Allow from all
Deny from 22.214.171.124/16
Granted, this is not a perfect solution, but keeping the site up for real users mattered more at the moment than letting a rampant recursive spider keep slamming the site as soon as I brought it back up (which it has done for nearly 2 days straight now).
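For anyone who wants to spot this kind of offender faster, here is a minimal sketch of the log check described above: it tallies requests per /16 prefix so a single noisy subnet stands out. It assumes access log lines that begin with the client IPv4 address (as in Apache's common and combined formats). The function name and the sample lines are made up for illustration and are not taken from my real logs.

```python
import re
from collections import Counter

# Matches a leading IPv4 address, as found at the start of common/combined log lines.
_IP_RE = re.compile(r"^(\d{1,3})\.(\d{1,3})\.\d{1,3}\.\d{1,3}\s")


def count_by_slash16(lines):
    """Return a Counter mapping each client /16 prefix to its request count."""
    counts = Counter()
    for line in lines:
        match = _IP_RE.match(line)
        if match:
            counts[f"{match.group(1)}.{match.group(2)}.0.0/16"] += 1
    return counts


# Hypothetical sample lines, for illustration only.
SAMPLE_LINES = [
    '65.55.1.2 - - [01/Jan/2009:00:00:01 +0000] "GET /report2/report2 HTTP/1.1" 500 0',
    '65.55.9.8 - - [01/Jan/2009:00:00:02 +0000] "GET /report2 HTTP/1.1" 500 0',
    '192.0.2.7 - - [01/Jan/2009:00:00:03 +0000] "GET /wiki HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    # Print the busiest prefixes first.
    for prefix, hits in count_by_slash16(SAMPLE_LINES).most_common(5):
        print(prefix, hits)
```

To run it against a real log, pass an open file instead of the sample list, for example count_by_slash16(open("access.log")), where the file name is whatever your vhost actually logs to.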
Comments on: "MSN bot was smashing Trac server, here is how I blocked it" (8)
Wouldn’t it be better to prevent the bot from getting ahold of your Apache server at all by blocking the IP range?
iptables -A INPUT -s 126.96.36.199/24 -j DROP
I think something like that would work, and probably provide improved performance. (If you were feeling particularly mean you could even TARPIT them, which would be fun. ^_^ )
Er, by blocking the IP range with iptables, I mean.
Oh, and you’d want to use /16 – as you’ve used in your example. (Sorry for the spam. This is making me think I should add an ‘edit’ button to my blog, heh.)
Ragona – thanks for the comment!
I would have done it with iptables, but this is Windows (I know, everyone groan at me at once!). I was also thinking of blocking it with Windows Firewall, but this server is an Amazon EC2 instance, and I had this worry in the back of my head telling me that I may lose any and all access to my server if I enabled that service and it had RDP blocked by default =)
I checked out the security config with Amazon as well, but I did not see any ‘deny’ type rule that I could configure. I believe the security is basically ‘deny all’ by default, and add entries for accepting only what I want, which would make it rough to allow all IP ranges *except* this one.
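If someone does want to try the allow-only route, the "everything except this one range" list can be computed rather than written by hand. This is only a rough sketch using Python's standard ipaddress module. The 65.55.0.0/16 range is my assumption based on the 65.55.X.X addresses in the logs, and the function name is made up for illustration. It says nothing about how the resulting ranges would actually be entered into the Amazon security config.

```python
import ipaddress


def allow_list_excluding(blocked_cidr):
    """Return the IPv4 networks that cover everything except blocked_cidr."""
    everything = ipaddress.ip_network("0.0.0.0/0")
    blocked = ipaddress.ip_network(blocked_cidr)
    # address_exclude yields the networks left over once `blocked` is carved out.
    return sorted(everything.address_exclude(blocked))


if __name__ == "__main__":
    # Assumed range, taken from the 65.55.X.X addresses seen in the logs.
    for network in allow_list_excluding("65.55.0.0/16"):
        print(network)
```

Carving one /16 out of the whole IPv4 space leaves 16 separate ranges, so it is doable but clearly clumsier than a single deny rule.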
Thanks for the ideas, keep 'em coming!
Hehe. Whatever works, I think your solution totally makes sense given the setup, and I actually didn’t know you could deny IP ranges from Apache, so that’s good to know anyway. 🙂
I have an older instance of Apache that I cannot restart gracefully, so restarting during a blitz like this is impossible. As a quick fix I place a line in the Application.cf* to block the specific IP or range –
Same here, msnbot regularly behaves strangely, hitting my server with around 3 requests/second, sometimes more.
Each time this happens on a new server I just iptables -A INPUT -s 188.8.131.52/24 -j DROP on the whole server . . .
I'm not surprised msnbot/bing is full of bugs like windows and other microsh1t products . . .