[postgis-users] Query runtime comparison for two datasets using '&&' geometric operator

Ted Spradley tspradley at snoogems.com
Tue Sep 29 13:35:43 PDT 2009


Hi All!

An elementary question from a new user.  Please be patient with my
understanding!

I am running the well known query to find adjacent polygons that first
compares
polygon bounding boxes, then uses the distance function to (I think) compare
the 
actual polygon vertices of the result of the bounding box portion of the
query.

Query_1 and Query_2 below are, I believe, identical queries. Query_1 has a
runtime 
of 71.423 ms, and Query_2 a runtime of 36236.674 ms.  

What is the difference between the two queries?  Is the resolution of
Query_1.data,
0.000278, a resolution producing roughly twice as many points as the
resolution of 
Query_2.data, 0.000458 or do I have that backward?

I have found references in the archives to the width of a row.  Is this
likely referring
to the width of the geometry column, meaning number of coordinate pairs? 
How does
one do a count query on a column to find the number of pairs that a row
contains?

Thank you!
Ted S.

Thank you


##################	Query 1   ####################################

Data:
Spatial_Reference_Information:
  Horizontal_Coordinate_System_Definition:
    Geographic:
      Latitude_Resolution: 0.000278
      Longitude_Resolution: 0.000278
      Geographic_Coordinate_Units: Decimal degrees
    Geodetic_Model:
      Horizontal_Datum_Name: North American Datum of 1983
      Ellipsoid_Name: GRS1980
      Semi-major_Axis: 6378137
      Denominator_of_Flattening_Ratio: 298.257222

Query:
project1=# SELECT DISTINCT a.county, a.state
FROM countyp020 AS a, countyp020 AS b
WHERE b.county='Bexar County'
AND b.state='TX'
AND a.wkb_geometry && b.wkb_geometry
AND distance(a.wkb_geometry, b.wkb_geometry) = 0;
                       county                       | state
----------------------------------------------------+-------
 Comal County                                       | TX
 Bandera County                                     | TX
 Guadalupe County                                   | TX
 Atascosa County                                    | TX
 Wilson County                                      | TX
 Kendall County                                     | TX
 Medina County                                      | TX
 Bexar County                                       | TX
(8 rows)

Query Analyse Explain:
project1=# EXPLAIN ANALYSE SELECT DISTINCT a.county, a.state
FROM countyp020 AS a, countyp020 AS b
WHERE b.county='Bexar County'
AND b.state='TX'
AND a.wkb_geometry && b.wkb_geometry
AND distance(a.wkb_geometry, b.wkb_geometry) = 0;
                                                                    QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=20.23..20.24 rows=1 width=45) (actual
time=70.894..70.902 rows=8 loops=1)
   ->  Nested Loop  (cost=4.27..20.22 rows=1 width=45) (actual
time=17.118..70.697 rows=8 loops=1)
         Join Filter: (distance(a.wkb_geometry, b.wkb_geometry) = 0::double
precision)
         ->  Bitmap Heap Scan on countyp020 b  (cost=4.27..11.94 rows=1
width=8285) (actual time=0.080..0.083 rows=1 loops=1)
               Recheck Cond: (county = 'Bexar County'::bpchar)
               Filter: (state = 'TX'::bpchar)
               ->  Bitmap Index Scan on idx_countyp020_county 
(cost=0.00..4.27 rows=2 width=0) (actual time=0.062..0.062 rows=1 loops=1)
                     Index Cond: (county = 'Bexar County'::bpchar)
         ->  Index Scan using countyp020_geom_idx on countyp020 a 
(cost=0.00..8.27 rows=1 width=8330) (actual time=9.529..10.248 rows=10
loops=1)
               Index Cond: (a.wkb_geometry && b.wkb_geometry)
 Total runtime: 71.423 ms
(11 rows)
##################	End Query 1   #################################

##################	Query 2   ####################################

Data:
	<spref>
		<horizsys>
			<geograph>
				<latres>0.000458</latres>
				<longres>0.000458</longres>
				<geogunit>Decimal degrees</geogunit>
			</geograph>
		<geodetic>
				<horizdn>North American Datum of 1983 in the 48 contiguous states, the
District of Columbia, Alaska, Hawaii, Puerto Rico, the Virgin Islands of the
United States, and the Pacific Island Areas.</horizdn>
				<ellips>Geodetic Reference System 80</ellips>
				<semiaxis>6378137</semiaxis>
				<denflat>298257</denflat>
			</geodetic>
		</horizsys>
	</spref>

Query:
tiger=# SELECT DISTINCT a.name, a.statefp
FROM tl_2008_us_county AS a, tl_2008_us_county AS b
WHERE b.name='Bexar'
AND b.statefp='48'
AND a.the_geom && b.the_geom
AND distance(a.the_geom, b.the_geom) = 0;
   name    | statefp
-----------+---------
 Bandera   | 48
 Guadalupe | 48
 Wilson    | 48
 Atascosa  | 48
 Kendall   | 48
 Bexar     | 48
 Comal     | 48
 Medina    | 48
(8 rows)

Query Explain Analyse:
tiger=# EXPLAIN ANALYSE SELECT DISTINCT a.name, a.statefp
FROM tl_2008_us_county AS a, tl_2008_us_county AS b
WHERE b.name='Bexar'
AND b.statefp='48'
AND a.the_geom && b.the_geom
AND distance(a.the_geom, b.the_geom) = 0;
                                                                       
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=16.56..16.57 rows=1 width=11) (actual
time=36236.168..36236.176 rows=8 loops=1)
   ->  Nested Loop  (cost=0.00..16.56 rows=1 width=11) (actual
time=3.990..36236.075 rows=8 loops=1)
         Join Filter: (distance(a.the_geom, b.the_geom) = 0::double
precision)
         ->  Index Scan using indx_tl_county_name on tl_2008_us_county b 
(cost=0.00..8.27 rows=1 width=154180) (actual time=0.060..0.064 rows=1
loops=1)
               Index Cond: ((name)::text = 'Bexar'::text)
               Filter: ((statefp)::text = '48'::text)
         ->  Index Scan using indx_tl_county_geo on tl_2008_us_county a 
(cost=0.00..8.27 rows=1 width=154191) (actual time=3.259..5.932 rows=10
loops=1)
               Index Cond: (a.the_geom && b.the_geom)
 Total runtime: 36236.674 ms
(9 rows)

tiger=# SET enable_seqscan TO off;		... didn't help

##################	End Query 2   #################################

-----
PGSQL 8.2.4 / (PostGIS Version 1.3 USE_GEOS=1 USE_PROJ=1 USE_STATS=1 )
Proj4 -4.6.1 / gdal 1.6.1 / GD 2.0.34
Linux/Apache/2.2.8 (Fedora) 
-- 
View this message in context: http://www.nabble.com/Query-runtime-comparison-for-two-datasets-using-%27--%27-geometric-operator-tp25670401p25670401.html
Sent from the PostGIS - User mailing list archive at Nabble.com.




More information about the postgis-users mailing list