[postgis-users] GIST index speed

Steve Kondik shade at chemlab.org
Fri Jun 6 10:53:24 PDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Paul Ramsey wrote:
> Steve,
> 
> Any chance you could share your data with the developers? this is a
> very strange result.
> 
> P
> 

Sure, I exported the table with pgsql2shp.

http://n0rp.chemlab.org/geography.tbz2

There's nothing spectacular about the data, it's just
state/county/zipcode from the US census.


> On Fri, Jun 6, 2008 at 9:37 AM, Steve Kondik <shade at chemlab.org> wrote:
> Steve Kondik wrote:
>>>> Mark Cave-Ayland wrote:
>>>>> Steve Kondik wrote:
>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>> Hash: SHA1
>>>>>>
>>>>>> I'm seeing this same kind of slowness.  I am in the process of migrating
>>>>>> an application from Oracle to Postgres.  One query that seems to take
>>>>>> longer than it should is a very simple point-in-bbox query which takes 3
>>>>>> seconds.  There are 32000 rows in the table, and the query is as simple
>>>>>> as it can be:
>>>>>>
>>>>>> select count(*) from geography where type='Z' and centroid && (select
>>>>>> geometry from geography where id=69495);
>>>>>>
>>>>>> The second geography is the USA, so this query gets all zipcodes inside
>>>>>> the bbox, using precalculated centroids.  The query plan looks fine as
>>>>>> well:
>>>>>>
>>>>>>  Aggregate  (cost=16.56..16.57 rows=1 width=0) (actual
>>>>>> time=2407.123..2407.125 rows=1 loops=1)
>>>>>>    InitPlan
>>>>>>      ->  Index Scan using geography_pkey on geography  (cost=0.00..8.28
>>>>>> rows=1 width=4528) (actual time=0.027..0.030 rows=1 loops=1)
>>>>>>            Index Cond: (id = 69495)
>>>>>>    ->  Index Scan using idx_geography_centroid_z on geography
>>>>>> (cost=0.00..8.28 rows=1 width=0) (actual time=1.417..2355.049 rows=29687
>>>>>> loops=1)
>>>>>>          Index Cond: (centroid && $0)
>>>>>>          Filter: (centroid && $0)
>>>>>>  Total runtime: 2407.194 ms
>>>>>> (8 rows)
>>>>>>
>>>>>>
>>>>>> Any thoughts?  This query takes 200ms on Oracle.  It seems like
>>>>>> something is wrong for it to take 3 seconds on PostGIS.  It's faster for
>>>>>> smaller geometries (like a state), but I don't understand why since its
>>>>>> just point-inside-bbox.
>>>>> Hi Steve,
>>>>> Well from the above you can see fairly easily that the time is
>>>>> disappearing into the second index scan. The above output doesn't look
>>>>> quite right somehow (it's missing at least one node indicator) - can you
>>>>> try cutting and pasting the exact output from EXPLAIN into an email
>>>>> again? Also we'll need the output of "\d geography" from within psql.
>>>>
>>>>> ATB,
>>>>> Mark.
>>>>
>>>>
>>>> Sure thing.  It's taking quite a bit longer now because we're doing some
>>>> bulk loading.  When nothing else is running, the query takes about 4
>>>> seconds.
>>>>
>>>>
>>>> health_central_22=> explain analyze select count(*) from geography where
>>>> type='Z' and centroid && (select geometry from geography where id=69495);
>>>>
>>>> QUERY PLAN
>>>>
>>>> -
>>>> ---------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Aggregate  (cost=16.56..16.57 rows=1 width=0) (actual
>>>> time=16534.739..16534.741 rows=1 loops=1)
>>>>    InitPlan
>>>>      ->  Index Scan using geography_pkey on geography  (cost=0.00..8.28
>>>> rows=1 width=4528) (actual time=0.076..0.080 rows=1 loops=1)
>>>>            Index Cond: (id = 69495)
>>>>    ->  Index Scan using idx_geography_centroid_z on geography
>>>> (cost=0.00..8.28 rows=1 width=0) (actual time=25.453..16192.427
>>>> rows=29687 loops=1)
>>>>          Index Cond: (centroid && $0)
>>>>          Filter: (centroid && $0)
>>>>  Total runtime: 16534.809 ms
>>>> (8 rows)
>>>>
>>>> health_central_22=> \d geography
>>>>              Table "public.geography"
>>>>     Column    |          Type          | Modifiers
>>>> --------------+------------------------+-----------
>>>>  type         | character(1)           | not null
>>>>  id           | bigint                 | not null
>>>>  centroid     | geometry               |
>>>>  geometry     | geometry               |
>>>>  name         | character varying(32)  | not null
>>>>  population   | bigint                 |
>>>>  abbreviation | character varying(2)   |
>>>>  po_name      | character varying(100) |
>>>>  id_geo_state | bigint                 |
>>>> Indexes:
>>>>     "geography_pkey" PRIMARY KEY, btree (id)
>>>>     "idx_geography_2" btree (name)
>>>>     "idx_geography_3" btree (abbreviation)
>>>>     "idx_geography_4" btree (id_geo_state)
>>>>     "idx_geography_centroid" gist (centroid)
>>>>     "idx_geography_centroid_c" gist (centroid) WHERE type = 'C'::bpchar
>>>>     "idx_geography_centroid_s" gist (centroid) WHERE type = 'S'::bpchar
>>>>     "idx_geography_centroid_z" gist (centroid) WHERE type = 'Z'::bpchar
>>>>     "idx_geography_geometry" gist (geometry) CLUSTER
>>>>     "idx_geography_geometry_c" gist (geometry) WHERE type = 'C'::bpchar
>>>>     "idx_geography_geometry_r" gist (geometry) WHERE type = 'R'::bpchar
>>>>     "idx_geography_geometry_s" gist (geometry) WHERE type = 'S'::bpchar
>>>>     "idx_geography_geometry_z" gist (geometry) WHERE type = 'Z'::bpchar
>>>> Foreign-key constraints:
>>>>     "fk_geography_1" FOREIGN KEY (id_geo_state) REFERENCES geography(id)
>>>>
> Adding force_2d speeds this up immensely.  Not sure I understand why
> this is necessary, but I'm still getting used to PostGIS.
> 
> 
> health_central_22=> explain analyze select id,name from geography where
> type='Z' and centroid && (select force_2d(geometry) from geography where
> id=69495);
>                                                                QUERY
> PLAN
> -
> ------------------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using idx_geography_centroid on geography  (cost=8.29..16.57
> rows=1 width=14) (actual time=0.263..125.338 rows=29687 loops=1)
>   Index Cond: (centroid && $0)
>   Filter: ((centroid && $0) AND (type = 'Z'::bpchar))
>   InitPlan
>     ->  Index Scan using geography_pkey on geography  (cost=0.00..8.29
> rows=1 width=4441) (actual time=0.158..0.161 rows=1 loops=1)
>           Index Cond: (id = 69495)
>  Total runtime: 171.229 ms
> (7 rows)
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFISXmUMrBfzfMVwMcRAk0/AJsHZwP3p8AUVgSvWpPHFoGW/KCnTQCfY1QN
ZpmBnxOzQ3KQX2TG+03yVxM=
=z/t9
-----END PGP SIGNATURE-----



More information about the postgis-users mailing list