[postgis-devel] Caching Double-based Boxes

Paul Ramsey pramsey at opengeo.org
Fri Nov 25 21:02:45 PST 2011


Brian,
Thanks so much for taking a run at this... instead of running an
st_intersects() test, which might get bogged by the CPU-intensive
exact calculations could you run a simple bounding box query, with the
&& operator? A few different result set sizes (10, 100, 1000)? Then
I'll feel confident we're going OK.
P.

On Fri, Nov 25, 2011 at 7:50 PM,  <maplabs at light42.com> wrote:
> Super Double-Box Search-Off     25Nov11 -dbb
> ====================================================================
>
> "A quick estimate of performance of PostGIS 2 with
>   geometry bounding boxes as doubles vs as floats"
>
> --
> All tests performed on Linux 2.6.38 x86_64
> PostgreSQL 9.0.5
>   shared_buffers = 2400MB
>   temp_buffers = 64MB
>   work_mem = 512MB
>   maintenance_work_mem = 200MB
>
> --
> The Test  -  TIGER 2010 California
>
> TIGER provides many possibilities for tests.. The one I chose here
> uses PLACE (a large'ish MULTIPOLY) and EDGE (MULTILINE)
>
> A random sampling of 100 EDGES in California shows the
> following characteristics:
> --------------------------
> select st_npoints(the_geom) from ca_edges order by random() limit 100;
>
> min,median,mean,max    stdev
> 2, 5.15, 16.333, 212   34.23
>
> sorted, the first dozen entries are 2, the last are the following
> [ ... 36,37,43,44,51,57,58,68,68,79,206,206,212]
>
> I reasoned that although many EDGES are trivially short, there
> exist substantial numbers of non-trivial EDGES, so they would be ok for a
> test.
> Next, pick some large PLACE geometries.. From the Dozen largest (by area)
> PLACE rows, I picked 4 well-known cities
>
> now, the test (with indexes in place I trust)
>
> ----------
> select count(*) from ca_edges where st_intersects( the_geom,
> (select st_envelope(the_geom) from ca_place where name = 'City_Name') );
> ----------
>
> Results:
>
> place_name,trunk_cnt,trunk_time,dblbox_cnt,dblbox_time
> Bakersfield,39691,24.230,39691,25.275
> Fresno,36527,24.217,36527,25.194
> Oakland,44404,24.395,44404,25.680
> Lancaster,16566,23.983,16566,25.194
>
> -----------------------------------------
> preliminary conclusions
>
> the search times were almost the same for each of the 4 cases
> and I did not detect change with multiple runs of the same query
> (though I could have tested that more, too)
>
> search times were fast enough such that I believe indexes are being used
> (I did not check all indexes carefully, but the database setup was the same
> for both test contexts, so if they missed something, both tests likely
> missed
> in the same way)
>
> the result counts for both trunk and dblbox match, as a sanity check
>
> performance hit is around 4% in each case
>
> =======
>
> caveat: I could have misjudged and chosen a poor test case for some
> obscure-to-me reason.. hopefully, this is useful and informative
>
> best regards from San Francisco, California
>
> Brian Hamlin
> GeoCal
> OSGeo California Chapter
> maplabs at light42.com
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>



More information about the postgis-devel mailing list