[postgis-users] Spatial Join Performance

Carl Anderson carl.anderson at vadose.org
Fri Nov 17 17:20:43 PST 2006


Eric,

Your query(as explained) is using the spatial indexes.

Spatial predicates (intersects, contains, within, ...) are all greatly 
affected by the number of vertices in the geometries being compared.

Try
Select avg(npoints(the_geom)), stddev(npoints(the_geom)) from taxlot
and again for zoning.

More than a couple hundred vertices per geometry on either or both sides 
would definitely slow you down.

Strategies that I have successfully used:

(If stray positives or stray negatives are acceptable)
Build a separate table for taxlots and for zoning each based on 
simplify(the_geom, X)
where X is tuned to your tolerance for false positives and negatives.

(If an exact answer is desired)
Slice each the_geom from into a  gridded table for each layer.
Each the_geom turns into many
    rectangular elements in its interior
and
   mostly rectangular elements when a grid contains a boundary for the_geom
You query would have to be rewritten to account for
    the muliple returns for each original geometries
    mixed results
          your original query assumed only one zoning per taxlot that 
may not really be true.

Tuning only one side does not improve things very much.

let me know if you want scripts to build gridded tables.

C.


Eric Shuman wrote:
> Paul, Brent, etc...
>
> So I have been trying the various work arounds for the TOAST issue, but
> can't seem to force the query to use the spatial index on the table in
> question.  None of the work arounds seem to change anything.  Below are some
> of my stats and the query with query plan.
>
>
> SELECT version();
> "PostgreSQL 8.0.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.5
> (Debian 1:3.3.5-13)"
>
> SELECT postgis_version();
> "1.0 USE_GEOS=1 USE_PROJ=0 USE_STATS=1"
>
>
> SET enable_seqscan=OFF;
> SET random_page_cost=1;
> SET effective_cache_size=1000; -- Have tried up to 10000000 as well
>
> EXPLAIN
>
> SELECT taxlot,zone_
> FROM  taxlot,zoning
> WHERE
>   taxlot.the_geom && zoning.the_geom
> AND
>   contains(zoning.the_geom,taxlot.the_point_geom)
>
> ---------  Query Plan --------------------------
> "Nested Loop  (cost=100000000.00..100003612.20 rows=62135 width=23)"
> "  Join Filter: contains("outer".the_geom, "inner".the_point_geom)"
> "  ->  Seq Scan on zoning  (cost=100000000.00..100000132.49 rows=1149
> width=1967)"
> "  ->  Index Scan using taxlot_geom_index on taxlot  (cost=0.00..3.01 rows=1
> width=492)"
> "        Index Cond: (taxlot.the_geom && "outer".the_geom)"
> ------------------------------------------------
>
> Some of the things I have also tried:
> -- SELECT AddGeometryColumn('','taxlot','the_point_geom','-1','POINT',2);
> -- update taxlot set the_point_geom = centroid(taxlot.the_geom);
> -- create index point_index on taxlot USING GIST (the_point_geom);
> -- vacuum analyze taxlot;
> -- vacuum analyze zoning;
>
> -- SELECT addGeometryColumn('','zoning','bbox','-1','GEOMETRY','2');
> -- UPDATE zoning set bbox = Envelope(Force_2d(the_geom));
>
>
> Would the fact that I don't have a srid set on the tables matter?
>
>
>
> Thanks again for all the help!
>
> ---------------
> Eric Shuman
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>   




More information about the postgis-users mailing list