[SAC] Motion: robots.txt, Drupal and performance
hobu at iastate.edu
Thu Mar 29 10:22:45 EDT 2007
We're buried again. Hope this email gets to you :)
>It is my opinion that Drupal is not our problem, svn is. Next time we are
>"up against the wall" I think we should just rename /var/www/svn and see if
>that doesn't bring things back under control (and verify my claim that svn is
>the real pig).
Whoever re-enabled robots.txt has proven what I have suspected...
our Drupal performance is very poor, CPU intensive, and completely
unable to keep up with the load. In the 20 minutes I've watched, we
haven't had any svn activity, and we're still averaging a load of
1.5-2.0. Throw in an FDO checkout and the load jumps to 6+ quickly.
> o I could pull one Drupal web page at about 60 pages per second.
> o I found pulling a non-existent page ran at about the same speed.
> o I can pull http://trac.osgeo.org/gdal at about 2400 pages/second.
Trac's performance is 40 times better than Drupal's in our respective
configurations, and unlike Drupal it doesn't have to answer just about
every request on the OSGeo.org domain. Additionally, Trac doesn't
cache *anything;* it dynamically generates every page. Admittedly,
Drupal is doing a lot more, but 40 times the per-page load multiplied
by who knows how many times the traffic adds up to an unsustainable
situation.
I move that we disable crawlers on osgeo.org until we can do 500+
pages/sec with Drupal. In our current configuration, even if we
moved svn and trac off to the other server, we would not be able to
keep up with the bots.
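For the record, the motion amounts to a blanket disallow. A minimal sketch of what such a robots.txt could look like (the exact file and paths served on osgeo.org are assumptions here, not what is currently deployed):

```
# Tell all well-behaved crawlers to skip the entire site
User-agent: *
Disallow: /
```

Served at the web root as /robots.txt, this keeps compliant bots off everything; once Drupal can keep up, it could be relaxed to per-path Disallow lines instead of a full block.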
I completely agree that a website that is not on the search engines
practically does not exist. Even so, I would rather not exist and
have the machine stay up. In the last week, load issues have caused
us to burn through *a lot* of our volunteer admin juice. Someone
needs to step forward to aggressively marshal the Drupal performance
work and get us where we need to be. Otherwise we can just leave it
alone and stay off the indexes.