[postgis-users] strange benchmark results

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Sun Dec 27 10:14:28 PST 2009


Andreas wrote:

> Hi folks,
> 
> I'm in the middle of my bachelor thesis in computer science which is all about spatial databases. The main part is benchmarking most of the spatial functions described by OGC simple features on different spatial databases (postgis, ms sql server 2008, ibm db2 gse).
> 
> The last days I ran the benchmark for PostGIS and had a first look on the results. Some test cases show a strange behaviour which I like to share here hoping for some feedback.
> 
> The benchmark consists of 30 queries which are kept simple and mainly testing one OGC function at a time. The main aspect is measuring the query time via JDBC and via the output of "explain analyze". To achieve a better quality the whole benchmark is run multiple times ("rounds").
> 
> The benchmark was run on Windows XP (due to having ms sql server in the benchmark), 2 GB RAM, core 2 duo with default postgres configuration (I overlooked to tune, but will catch up) and having a GIST index on all geometry columns. PostgreSQL 8.4.1 and PostGIS 1.4.0.
> 
> So there is a query like 
> 
> SELECT ST_Union(the_geom) FROM us_county 
> 
> which runs an aggregate union on single polygons of TIGER data.
> 
> The odd result is, that over time (in the course of 100 rounds) the query time gets "shaked up", ranging from 100 seconds in the beginning up to 800 seconds in the end. Please see the first plot [1] for details. x-coordinate shows round n (1-100), y-coordinate the measured query time using EXPLAIN ANALYZE.
> 
> There are other functions which show a similar behaviour (i.e. ST_Within, ST_Touches, ST_Buffer). These results are also shown on [1]. 
> 
> I appreciate any hint and feedback on these results. My (heavily guessed) interpretation of this one is that there may exist kind of leak and something like a garbage collection cleanup up in between.
> 
> I'm quite sure that there is no other influence on the system the benchmark ran on.
> 
> Running the benchmark on a ubuntu 64-bit does not show this behaviour (most results do have a small deviation). This leads me to a more general question: Is it in general advisable to run spatial databases on 64-bit OS? 
> 
> Another general question: Are there more specific tuning advices for postgis (regarding config) than those which apply to Postgres in general?
> 
> If I may provide more information, please let me know. 
> 
> Thanks a lot!
> 
> Best Regards,
> Andreas

Hi Andreas,

You should find that the results should be the same across any platform; 
however you've missed some very important information from your tests. 
The first questions I would like to ask are:

i) Do you re-open a connection for every SELECT command you issue to the 
database, or do you use a new connection? If you don't, does it make a 
difference if you do?

ii) Which versions of PostgreSQL/PostGIS are you using on both Windows 
and Ubuntu? The output of 'SELECT version(), postgis_full_version()' 
from both setups would be useful. In particular, this will confirm which 
versions of GEOS are being used in each setup.


ATB,

Mark.

-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs



More information about the postgis-users mailing list