Performance of Scanning a Quadtree Index
Frank Warmerdam
warmerdam at POBOX.COM
Wed Aug 29 09:15:25 PDT 2007
Brock Anderson wrote:
> Hi List,
>
> I ran into a curious situation with a quadtree (.qix) index on a
> shapefile. Basically the issue is that performance of scanning the
> index to fetch features is not as good as I expect. Some details:
>
> I have a shapefile data set with 4 million polygons fairly normally
> distributed around British Columbia, Canada. I used 'shptree' to create
> a spatial index on that data. I have a very simple layer in my mapfile
> pointing at the data. Minimal styling, no labels, etc.
>
> I then make a simple WMS request to fetch exactly 1000 features (limited
> by a bbox) from the layer. Mapserver take about 500ms. Seems a bit high.
> Geoserver can draw the exact same data, using the *same .qix* index in
> about 150ms. Naturally I made every effort to keep the comparison
> fair. No reprojection in either case, nearly identical styling, etc.
>
> As a further comparison I noticed that Mapserver and Geoserver are
> nearly equal for fetching/drawing 1000 features from a smaller data set
> of just 10,000. Response time there is more like 120ms.
> So on large shapefile data sets Geoserver's index scanning seems to be
> substantially faster. Are there any map file options that might improve
> performance? Could it be that Geoserver simply has a faster
> implementation for traversing the quadtree?
Brock,
I haven't dug into this, but I think MapServer pulls the features from
the shapefile in the order they are identified traversing the spatial
index. This *could* mean relatively random fetching depending on the
spatial coherence (or lack thereof) of the records in the shapefile.
If (and I haven't actually done any investigation) GeoServer uses the
spatial index to build a list of feature ids, sorts that and then
fetches them in the order they appear in the shapefile then GeoServer
might pull the data in a much more "in order" form from disk which
can give much better read performance.
I stress this is not something I did any investigation into. It is
just a wild theory that might explain it.
BTW, OGR also uses .qix files, and does build one bit list which it
sorts exactly to avoid random fetching.
Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | President OSGeo, http://osgeo.org
More information about the MapServer-users
mailing list