[postgis-users] Query runtime comparison for two datasets using '&&' geometric operator

Ted Spradley tspradley at snoogems.com
Tue Sep 29 14:51:27 PDT 2009


Thank you Paul!

 Total runtime: 4673.221 ms for Query_2 using st_intersects().

I'll work on the upgrades.  

Ted


Paul Ramsey-3 wrote:
> 
> Your rows in the second query do look a lot wider.
> 
> (a) if you can, upgrade to the latest pgsql 8.4 and postgis 1.4,
> there's big speed benefits to both
> (b) if you have even a moderately recent postgis, try using
> 
> st_intersects(a.the_geom, b.the_geom)
> 
> instead of && and distance() = 0
> 
> again, there's more power to be had there.
> 
> Paul
> 
> On Tue, Sep 29, 2009 at 1:35 PM, Ted Spradley <tspradley at snoogems.com>
> wrote:
>>
>> Hi All!
>>
>> An elementary question from a new user.  Please be patient with my
>> understanding!
>>
>> I am running the well known query to find adjacent polygons that first
>> compares
>> polygon bounding boxes, then uses the distance function to (I think)
>> compare
>> the
>> actual polygon vertices of the result of the bounding box portion of the
>> query.
>>
>> Query_1 and Query_2 below are, I believe, identical queries. Query_1 has
>> a
>> runtime
>> of 71.423 ms, and Query_2 a runtime of 36236.674 ms.
>>
>> What is the difference between the two queries?  Is the resolution of
>> Query_1.data,
>> 0.000278, a resolution producing roughly twice as many points as the
>> resolution of
>> Query_2.data, 0.000458 or do I have that backward?
>>
>> I have found references in the archives to the width of a row.  Is this
>> likely referring
>> to the width of the geometry column, meaning number of coordinate pairs?
>> How does
>> one do a count query on a column to find the number of pairs that a row
>> contains?
>>
>> Thank you!
>> Ted S.
>>
>> Thank you
>>
>>
>> ##################      Query 1   ####################################
>>
>> Data:
>> Spatial_Reference_Information:
>>  Horizontal_Coordinate_System_Definition:
>>    Geographic:
>>      Latitude_Resolution: 0.000278
>>      Longitude_Resolution: 0.000278
>>      Geographic_Coordinate_Units: Decimal degrees
>>    Geodetic_Model:
>>      Horizontal_Datum_Name: North American Datum of 1983
>>      Ellipsoid_Name: GRS1980
>>      Semi-major_Axis: 6378137
>>      Denominator_of_Flattening_Ratio: 298.257222
>>
>> Query:
>> project1=# SELECT DISTINCT a.county, a.state
>> FROM countyp020 AS a, countyp020 AS b
>> WHERE b.county='Bexar County'
>> AND b.state='TX'
>> AND a.wkb_geometry && b.wkb_geometry
>> AND distance(a.wkb_geometry, b.wkb_geometry) = 0;
>>                       county                       | state
>> ----------------------------------------------------+-------
>>  Comal County                                       | TX
>>  Bandera County                                     | TX
>>  Guadalupe County                                   | TX
>>  Atascosa County                                    | TX
>>  Wilson County                                      | TX
>>  Kendall County                                     | TX
>>  Medina County                                      | TX
>>  Bexar County                                       | TX
>> (8 rows)
>>
>> Query Analyse Explain:
>> project1=# EXPLAIN ANALYSE SELECT DISTINCT a.county, a.state
>> FROM countyp020 AS a, countyp020 AS b
>> WHERE b.county='Bexar County'
>> AND b.state='TX'
>> AND a.wkb_geometry && b.wkb_geometry
>> AND distance(a.wkb_geometry, b.wkb_geometry) = 0;
>>                                                                    QUERY
>> PLAN
>> ---------------------------------------------------------------------------------------------------------------------------------------------------
>>  HashAggregate  (cost=20.23..20.24 rows=1 width=45) (actual
>> time=70.894..70.902 rows=8 loops=1)
>>   ->  Nested Loop  (cost=4.27..20.22 rows=1 width=45) (actual
>> time=17.118..70.697 rows=8 loops=1)
>>         Join Filter: (distance(a.wkb_geometry, b.wkb_geometry) =
>> 0::double
>> precision)
>>         ->  Bitmap Heap Scan on countyp020 b  (cost=4.27..11.94 rows=1
>> width=8285) (actual time=0.080..0.083 rows=1 loops=1)
>>               Recheck Cond: (county = 'Bexar County'::bpchar)
>>               Filter: (state = 'TX'::bpchar)
>>               ->  Bitmap Index Scan on idx_countyp020_county
>> (cost=0.00..4.27 rows=2 width=0) (actual time=0.062..0.062 rows=1
>> loops=1)
>>                     Index Cond: (county = 'Bexar County'::bpchar)
>>         ->  Index Scan using countyp020_geom_idx on countyp020 a
>> (cost=0.00..8.27 rows=1 width=8330) (actual time=9.529..10.248 rows=10
>> loops=1)
>>               Index Cond: (a.wkb_geometry && b.wkb_geometry)
>>  Total runtime: 71.423 ms
>> (11 rows)
>> ##################      End Query 1   #################################
>>
>> ##################      Query 2   ####################################
>>
>> Data:
>>        <spref>
>>                <horizsys>
>>                        <geograph>
>>                                <latres>0.000458</latres>
>>                                <longres>0.000458</longres>
>>                                <geogunit>Decimal degrees</geogunit>
>>                        </geograph>
>>                <geodetic>
>>                                <horizdn>North American Datum of 1983 in
>> the 48 contiguous states, the
>> District of Columbia, Alaska, Hawaii, Puerto Rico, the Virgin Islands of
>> the
>> United States, and the Pacific Island Areas.</horizdn>
>>                                <ellips>Geodetic Reference System
>> 80</ellips>
>>                                <semiaxis>6378137</semiaxis>
>>                                <denflat>298257</denflat>
>>                        </geodetic>
>>                </horizsys>
>>        </spref>
>>
>> Query:
>> tiger=# SELECT DISTINCT a.name, a.statefp
>> FROM tl_2008_us_county AS a, tl_2008_us_county AS b
>> WHERE b.name='Bexar'
>> AND b.statefp='48'
>> AND a.the_geom && b.the_geom
>> AND distance(a.the_geom, b.the_geom) = 0;
>>   name    | statefp
>> -----------+---------
>>  Bandera   | 48
>>  Guadalupe | 48
>>  Wilson    | 48
>>  Atascosa  | 48
>>  Kendall   | 48
>>  Bexar     | 48
>>  Comal     | 48
>>  Medina    | 48
>> (8 rows)
>>
>> Query Explain Analyse:
>> tiger=# EXPLAIN ANALYSE SELECT DISTINCT a.name, a.statefp
>> FROM tl_2008_us_county AS a, tl_2008_us_county AS b
>> WHERE b.name='Bexar'
>> AND b.statefp='48'
>> AND a.the_geom && b.the_geom
>> AND distance(a.the_geom, b.the_geom) = 0;
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>>  HashAggregate  (cost=16.56..16.57 rows=1 width=11) (actual
>> time=36236.168..36236.176 rows=8 loops=1)
>>   ->  Nested Loop  (cost=0.00..16.56 rows=1 width=11) (actual
>> time=3.990..36236.075 rows=8 loops=1)
>>         Join Filter: (distance(a.the_geom, b.the_geom) = 0::double
>> precision)
>>         ->  Index Scan using indx_tl_county_name on tl_2008_us_county b
>> (cost=0.00..8.27 rows=1 width=154180) (actual time=0.060..0.064 rows=1
>> loops=1)
>>               Index Cond: ((name)::text = 'Bexar'::text)
>>               Filter: ((statefp)::text = '48'::text)
>>         ->  Index Scan using indx_tl_county_geo on tl_2008_us_county a
>> (cost=0.00..8.27 rows=1 width=154191) (actual time=3.259..5.932 rows=10
>> loops=1)
>>               Index Cond: (a.the_geom && b.the_geom)
>>  Total runtime: 36236.674 ms
>> (9 rows)
>>
>> tiger=# SET enable_seqscan TO off;              ... didn't help
>>
>> ##################      End Query 2   #################################
>>
>> -----
>> PGSQL 8.2.4 / (PostGIS Version 1.3 USE_GEOS=1 USE_PROJ=1 USE_STATS=1 )
>> Proj4 -4.6.1 / gdal 1.6.1 / GD 2.0.34
>> Linux/Apache/2.2.8 (Fedora)
>> --
>> View this message in context:
>> http://www.nabble.com/Query-runtime-comparison-for-two-datasets-using-%27--%27-geometric-operator-tp25670401p25670401.html
>> Sent from the PostGIS - User mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> 


-----
PGSQL 8.2.4 / (PostGIS Version 1.3 USE_GEOS=1 USE_PROJ=1 USE_STATS=1 )
Proj4 -4.6.1 / gdal 1.6.1 / GD 2.0.34
Linux/Apache/2.2.8 (Fedora) 
-- 
View this message in context: http://www.nabble.com/Query-runtime-comparison-for-two-datasets-using-%27--%27-geometric-operator-tp25670401p25671569.html
Sent from the PostGIS - User mailing list archive at Nabble.com.




More information about the postgis-users mailing list