[postgis-users] strange benchmark results

Andreas ml at 3.141592654.de
Fri Dec 25 07:54:15 PST 2009


Hi folks,

I'm in the middle of my bachelor thesis in computer science which is all about spatial databases. The main part is benchmarking most of the spatial functions described by OGC simple features on different spatial databases (postgis, ms sql server 2008, ibm db2 gse).

The last days I ran the benchmark for PostGIS and had a first look on the results. Some test cases show a strange behaviour which I like to share here hoping for some feedback.

The benchmark consists of 30 queries which are kept simple and mainly testing one OGC function at a time. The main aspect is measuring the query time via JDBC and via the output of "explain analyze". To achieve a better quality the whole benchmark is run multiple times ("rounds").

The benchmark was run on Windows XP (due to having ms sql server in the benchmark), 2 GB RAM, core 2 duo with default postgres configuration (I overlooked to tune, but will catch up) and having a GIST index on all geometry columns. PostgreSQL 8.4.1 and PostGIS 1.4.0.

So there is a query like 

SELECT ST_Union(the_geom) FROM us_county 

which runs an aggregate union on single polygons of TIGER data.

The odd result is, that over time (in the course of 100 rounds) the query time gets "shaked up", ranging from 100 seconds in the beginning up to 800 seconds in the end. Please see the first plot [1] for details. x-coordinate shows round n (1-100), y-coordinate the measured query time using EXPLAIN ANALYZE.

There are other functions which show a similar behaviour (i.e. ST_Within, ST_Touches, ST_Buffer). These results are also shown on [1]. 

I appreciate any hint and feedback on these results. My (heavily guessed) interpretation of this one is that there may exist kind of leak and something like a garbage collection cleanup up in between.

I'm quite sure that there is no other influence on the system the benchmark ran on.

Running the benchmark on a ubuntu 64-bit does not show this behaviour (most results do have a small deviation). This leads me to a more general question: Is it in general advisable to run spatial databases on 64-bit OS? 

Another general question: Are there more specific tuning advices for postgis (regarding config) than those which apply to Postgres in general?

If I may provide more information, please let me know. 

Thanks a lot!

Best Regards,
Andreas

[1] http://students.fim.uni-passau.de/~brandlan/ba/benchmark/postgis-01/timeline_postgres_index.html



More information about the postgis-users mailing list