[SAC] Motion: robots.txt, Drupal and performance

Howard Butler hobu at iastate.edu
Thu Mar 29 10:22:45 EDT 2007


We're buried again.  Hope this email gets to you :)

>It is my opinion that Drupal is not our problem, svn is.  Next time we are
>"up against the wall" I think we should just rename /var/www/svn and see if
>that doesn't bring things back under control (and verify my claim that svn is
>the real pig).

Whoever re-enabled robots.txt has proven what I have suspected... 
our Drupal performance is very poor, CPU intensive, and completely 
unable to keep up with the load.  In the 20 minutes I've watched, we 
haven't had any svn activity, and we're still averaging a load of 
1.5-2.0.  Throw in an FDO checkout and the load jumps to 6+ quickly.

>  o I could pull one drupal web page at about 60 pages per second
>    (http://www.osgeo.org)
>  o I found pulling a nonexistent page ran about the same speed.
>  o I can pull http://trac.osgeo.org/gdal at about 2400 pages/second.

Trac's performance is 40 times better than Drupal's in our respective 
configurations (2400 vs. 60 pages/second), and it doesn't have to 
respond to just about every request on the OSGeo.org domain.  
Additionally, Trac doesn't cache *anything*; it generates everything 
dynamically.  Admittedly, Drupal is doing a lot more, but 40 times the 
load on who knows how many times the traffic adds up to an 
unsustainable situation.
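
As a back-of-the-envelope check on the quoted figures, here is a minimal Python sketch. The variable names are illustrative, and the per-page times assume the benchmark issued requests one at a time, which the quoted results do not state.

```python
# Throughput figures quoted earlier in this thread (pages per second).
drupal_pps = 60      # http://www.osgeo.org
trac_pps = 2400      # http://trac.osgeo.org/gdal

# How many times more pages per second Trac served than Drupal.
ratio = trac_pps / drupal_pps

# Approximate time per page in milliseconds.
# ASSUMPTION: requests were served serially during the benchmark.
drupal_ms = 1000 / drupal_pps
trac_ms = 1000 / trac_pps

print(f"Trac/Drupal throughput ratio: {ratio:.0f}x")
print(f"Drupal: ~{drupal_ms:.1f} ms/page, Trac: ~{trac_ms:.2f} ms/page")
```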

Motion:
I move that we disable crawlers on osgeo.org until we can do 500+ 
pages/sec with Drupal.  In our current configuration, even if we 
moved svn and trac off to the other server, we would not be able to 
keep up with the bots.
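
For reference, a sketch of what "disable crawlers" could look like as a robots.txt body, checked with Python's standard `urllib.robotparser`. The helper name `crawler_allowed` and the "Googlebot" user agent are illustrative assumptions, not the actual osgeo.org configuration.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt body asking every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # With the body above, a compliant crawler is refused the front page.
    print(crawler_allowed(ROBOTS_TXT, "Googlebot", "http://www.osgeo.org/"))
```

Note that robots.txt is advisory: it only reduces load from crawlers that choose to honor it.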

I completely agree that a website not on the search engines is 
equivalent to practically not existing.  I would rather not exist and 
have the machine stand up.  In the last week, load issues have caused 
us to burn through *a lot* of our volunteer admin juice.  Someone 
needs to step forward to aggressively marshal the Drupal performance 
stuff and get us where we need to be.  Otherwise we can just leave it 
alone and not be on the indexes.

Howard




