[SAC] Poorly Behaved Spiders, and a Dangerous Trap

Frank Warmerdam warmerdam at pobox.com
Tue Aug 28 21:35:44 EDT 2007


Folks,

We have had serious problems in recent days with load on www.osgeo.org which
I believe relates to our old friend - spiders pulling *huge* subversion
changesets out through Trac.  This is already forbidden by the /robots.txt
so only poorly behaved spiders are doing this.

Per the suggestions at:
   http://www.leekillough.com/robots.html

I have put a "spider trap" into place that should capture the IPs of
spiders ignoring the robots.txt and then use those IPs to forbid further
access to the trac.osgeo.org domain.  Details are in the bug report at:

   http://trac.osgeo.org/osgeo/ticket/140

The IPs are recorded in:

   /var/www/trac/forbidden_ips.txt

Should trac.osgeo.org suddenly stop working for anyone, we should take a
peek in there to see if that is why.
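
For illustration only, the trap idea can be sketched roughly as follows in
Python.  This is not the actual setup on trac.osgeo.org (the ticket has
those details); the function names are hypothetical, and the sketch assumes
the list file holds one IP per line and that something on the web server
calls record_trap_hit() whenever a client requests the trap URL that
robots.txt disallows.

```python
import os

# Assumed default for this sketch; the real list is the file named above.
FORBIDDEN_FILE = "forbidden_ips.txt"


def load_forbidden(path=FORBIDDEN_FILE):
    """Return the set of IPs currently recorded in the list file."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        # Assumption: one IP per line; blank lines are ignored.
        return {line.strip() for line in f if line.strip()}


def record_trap_hit(ip, path=FORBIDDEN_FILE):
    """Record a client that requested the trap URL (hypothetical hook)."""
    if ip not in load_forbidden(path):
        with open(path, "a") as f:
            f.write(ip + "\n")


def is_forbidden(ip, path=FORBIDDEN_FILE):
    """True if this IP has previously hit the trap."""
    return ip in load_forbidden(path)
```

Checking whether a given address was caught then amounts to calling
is_forbidden() with that address, or simply grepping the file for it.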

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org


