[postgis-users] GIST index speed
Paul Ramsey
pramsey at cleverelephant.ca
Fri Jun 6 10:24:38 PDT 2008
Steve,
Any chance you could share your data with the developers? this is a
very strange result.
P
On Fri, Jun 6, 2008 at 9:37 AM, Steve Kondik <shade at chemlab.org> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Steve Kondik wrote:
>> Mark Cave-Ayland wrote:
>>> Steve Kondik wrote:
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>> I'm seeing this same kind of slowness. I am in the process of migrating
>>>> an application from Oracle to Postgres. One query that seems to take
>>>> longer than it should is a very simple point-in-bbox query which takes 3
>>>> seconds. There are 32000 rows in the table, and the query is as simple
>>>> as it can be:
>>>>
>>>> select count(*) from geography where type='Z' and centroid && (select
>>>> geometry from geography where id=69495);
>>>>
>>>> The second geography is the USA, so this query gets all zipcodes inside
>>>> the bbox, using precalculated centroids. The query plan looks fine as
>>>> well:
>>>>
>>>> Aggregate (cost=16.56..16.57 rows=1 width=0) (actual
>>>> time=2407.123..2407.125 rows=1 loops=1)
>>>> InitPlan
>>>> -> Index Scan using geography_pkey on geography (cost=0.00..8.28
>>>> rows=1 width=4528) (actual time=0.027..0.030 rows=1 loops=1)
>>>> Index Cond: (id = 69495)
>>>> -> Index Scan using idx_geography_centroid_z on geography
>>>> (cost=0.00..8.28 rows=1 width=0) (actual time=1.417..2355.049 rows=29687
>>>> loops=1)
>>>> Index Cond: (centroid && $0)
>>>> Filter: (centroid && $0)
>>>> Total runtime: 2407.194 ms
>>>> (8 rows)
>>>>
>>>>
>>>> Any thoughts? This query takes 200ms on Oracle. It seems like
>>>> something is wrong for it to take 3 seconds on PostGIS. It's faster for
>>>> smaller geometries (like a state), but I don't understand why since its
>>>> just point-inside-bbox.
>>
>>> Hi Steve,
>>
>>> Well from the above you can see fairly easily that the time is
>>> disappearing into the second index scan. The above output doesn't look
>>> quite right somehow (it's missing at least one node indicator) - can you
>>> try cutting and pasting the exact output from EXPLAIN into an email
>>> again? Also we'll need the output of "\d geography" from within psql.
>>
>>
>>> ATB,
>>
>>> Mark.
>>
>>
>>
>> Sure thing. It's taking quite a bit longer now because we're doing some
>> bulk loading. When nothing else is running, the query takes about 4
>> seconds.
>>
>>
>> health_central_22=> explain analyze select count(*) from geography where
>> type='Z' and centroid && (select geometry from geography where id=69495);
>>
>> QUERY PLAN
>>
>> -
>> ---------------------------------------------------------------------------------------------------------------------------------------------------
>> Aggregate (cost=16.56..16.57 rows=1 width=0) (actual
>> time=16534.739..16534.741 rows=1 loops=1)
>> InitPlan
>> -> Index Scan using geography_pkey on geography (cost=0.00..8.28
>> rows=1 width=4528) (actual time=0.076..0.080 rows=1 loops=1)
>> Index Cond: (id = 69495)
>> -> Index Scan using idx_geography_centroid_z on geography
>> (cost=0.00..8.28 rows=1 width=0) (actual time=25.453..16192.427
>> rows=29687 loops=1)
>> Index Cond: (centroid && $0)
>> Filter: (centroid && $0)
>> Total runtime: 16534.809 ms
>> (8 rows)
>>
>> health_central_22=> \d geography
>> Table "public.geography"
>> Column | Type | Modifiers
>> --------------+------------------------+-----------
>> type | character(1) | not null
>> id | bigint | not null
>> centroid | geometry |
>> geometry | geometry |
>> name | character varying(32) | not null
>> population | bigint |
>> abbreviation | character varying(2) |
>> po_name | character varying(100) |
>> id_geo_state | bigint |
>> Indexes:
>> "geography_pkey" PRIMARY KEY, btree (id)
>> "idx_geography_2" btree (name)
>> "idx_geography_3" btree (abbreviation)
>> "idx_geography_4" btree (id_geo_state)
>> "idx_geography_centroid" gist (centroid)
>> "idx_geography_centroid_c" gist (centroid) WHERE type = 'C'::bpchar
>> "idx_geography_centroid_s" gist (centroid) WHERE type = 'S'::bpchar
>> "idx_geography_centroid_z" gist (centroid) WHERE type = 'Z'::bpchar
>> "idx_geography_geometry" gist (geometry) CLUSTER
>> "idx_geography_geometry_c" gist (geometry) WHERE type = 'C'::bpchar
>> "idx_geography_geometry_r" gist (geometry) WHERE type = 'R'::bpchar
>> "idx_geography_geometry_s" gist (geometry) WHERE type = 'S'::bpchar
>> "idx_geography_geometry_z" gist (geometry) WHERE type = 'Z'::bpchar
>> Foreign-key constraints:
>> "fk_geography_1" FOREIGN KEY (id_geo_state) REFERENCES geography(id)
>>
>
> Adding force_2d speeds this up immensely. Not sure I understand why
> this is necessary, but I'm still getting used to PostGIS.
>
>
> health_central_22=> explain analyze select id,name from geography where
> type='Z' and centroid && (select force_2d(geometry) from geography where
> id=69495);
> QUERY
> PLAN
> -
> ------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using idx_geography_centroid on geography (cost=8.29..16.57
> rows=1 width=14) (actual time=0.263..125.338 rows=29687 loops=1)
> Index Cond: (centroid && $0)
> Filter: ((centroid && $0) AND (type = 'Z'::bpchar))
> InitPlan
> -> Index Scan using geography_pkey on geography (cost=0.00..8.29
> rows=1 width=4441) (actual time=0.158..0.161 rows=1 loops=1)
> Index Cond: (id = 69495)
> Total runtime: 171.229 ms
> (7 rows)
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFISWe8MrBfzfMVwMcRAkUBAJ9uLeK7kLotz6+CUuKd2Tg5pNj2XwCeLVXV
> RlUY5e2Dwyx1EWa3ln0Ky6M=
> =B0/W
> -----END PGP SIGNATURE-----
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
More information about the postgis-users
mailing list