[postgis-users] Substantial different index use between 9.5 and 9.6 using Postgis

Bill Measday bill at measday.com
Thu Dec 1 22:44:47 PST 2016


2 things

First, unfortunately the full query doesn't work as in some cases the 
subquery returns more than one row.  Not sure why, as the geom should be 
a rectangle (more or less given the projection, but was created from 4 
points), but maybe right near the boundary there are issues (on 
reflection I guess the north/south edges are not quite parallel).

Second - I forgot to complete part of my response.  Should have said

    Second, you lost me a little with your second suggestion.  As I
    gather _st_contains explicitly avoids the index, so the explain
    still uses the sequential scan.  I assume I'm misunderstanding your
    suggestion.

Rgds

Bill
On 2/12/2016 5:29 PM, Bill Measday wrote:
> Thanks Regina,
>
> First, your first suggestion didn't seem to achieve anything - even 
> the costs in the explain remained the same (1e10).
>
> Second, you lost me a little with your second suggestion.  As I gather 
> _st_contains explicitly avoids the index.
>
> Tried
>
> explain SELECT address_default_geocode_pid,
>     (SELECT elevation FROM m_elevations e WHERE e.geom && 
> ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)),
>     ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
>      FROM address_default_geocode
>      WHERE latitude = -33.87718472 AND longitude = 151.27544336;
>
> (ie changed WHERE ST_Contains(e.geom, 
> ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)) to WHERE e.geom 
> && ST_SetSRID(ST_MakePoint(longitude, latitude), 4326))
>
> and that seems to have done the trick.  Tee explain now uses the index 
> and the actual query seems more or less on par with the version run 
> under 9.5.
>
> Odd though, but thank you very much for the help/solution.  Now I'll 
> try the full insert query.
>
> Regards
>
> Bill
>
>
> On 2/12/2016 3:48 PM, Regina Obe wrote:
>> I have seen this on rare occasions in past but not recently where the 
>> planner doesn't break apart the ST_Contains or ST_Intersects into the 
>> component steps.
>> We made some changes to costing in 2.3.0, but ST_Contains (and 
>> _ST_Contains) wasn't one of them so that doesn't quite explain this.
>> Couple of things to try
>>
>> 1) On the _ST_Contains function in set the cost to something higher 
>> like 1000:
>>
>> CREATE OR REPLACE FUNCTION _st_contains(geom1 geometry, geom2 
>> geometry) RETURNS boolean AS
>> '$libdir/postgis-2.3', 'contains'
>> LANGUAGE c IMMUTABLE STRICT
>> COST 1000;
>>
>> And then try your query again/
>>
>>
>> 2) If Item 1 doesn't work what happens if you explicitly put in && 
>> and _ST_Contains
>>
>> That should force the index to kick in.
>>
>>
>> -----Original Message-----
>> From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On 
>> Behalf Of Bill Measday
>> Sent: Thursday, December 01, 2016 7:47 PM
>> To: postgis-users at lists.osgeo.org
>> Subject: [postgis-users] Substantial different index use between 9.5 
>> and 9.6 using Postgis
>>
>> This is a post I put on the potgres performance list.  They suggested 
>> I take it up on the PostGis list
>>
>> Postgres versions 9.5.4 and 9.6.0 running on Windows Server 2012.
>> Installed using EnterpriseDB. Both instances are on the same server, 
>> postgresql.conf for both are the same except max_locks_per_transaction =
>> 200 in 9.6 (caused insertion errors otherwise).
>>
>> On 9.5:
>> geo=# SELECT PostGIS_full_version();
>> postgis_full_version
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>>    POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4090" PROJ="Rel. 
>> 4.9.1,
>> 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.7.8"
>> LIBJSON="0.12" RASTER
>> (1 row)
>>
>>    On 9.6:
>>
>> postgis_full_version
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>>    POSTGIS="2.3.0 r15146" GEOS="3.5.0-CAPI-1.9.0 r4090" PROJ="Rel. 
>> 4.9.1,
>> 04 March 2015" GDAL="GDAL 2.1.1, released 2016/07/07" LIBXML="2.7.8"
>> LIBJSON="0.12" RASTER
>> (1 row)
>>
>> Databases on the 2 instances are the same (as far as I can tell).
>>
>> I have 2 relevant tables (created using same script in both 
>> instances, except I added an auto increment RID to the 9.6 version as 
>> primary key to the larger of the 2 tables - other already had one). 
>> One contains a geometry column (geom geometry(1107464) - a polygon) 
>> with  gist index.
>> This table has around 10 billion records. The disks these databases 
>> on aren't particularly fast, and indexing took about a week.
>> Second table has latitude (numeric(10, 8)), and longitude (numeric(11,
>> 8)) and about 10 million records.
>>
>> The query I'm running is (a part of an insertion into a new table I 
>> was trying to run)
>>       SELECT address_default_geocode_pid,
>>       (SELECT elevation FROM m_elevations e WHERE ST_Contains(e.geom, 
>> ST_SetSRID(ST_MakePoint(longitude, latitude), 4326))),
>>       ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
>>        FROM address_default_geocode;
>>
>> Under 9.5 the insertion takes about 11 hours. I gave up on 9.6.
>>
>> I thought I'd try just one record, so:
>>
>> SELECT address_default_geocode_pid,
>>       (SELECT elevation FROM m_elevations e WHERE ST_Contains(e.geom, 
>> ST_SetSRID(ST_MakePoint(longitude, latitude), 4326))),
>>       ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
>>        FROM address_default_geocode
>>        WHERE latitude = -33.87718472 AND longitude = 151.27544336;
>>
>> This returns 3 rows (which is more than the average I'd expect BTW). On
>> 9.5 takes a few seconds (3-5) and again I gave up on 9.6
>>
>> Looking just at the query shown above, I note a difference in 
>> explained behaviour.  Here is the output from 9.5:
>>
>>       QUERY PLAN
>>
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>>    Seq Scan on address_default_geocode  (cost=0.00..37760293.94 rows=1
>> width=25)
>>      Filter: ((latitude = '-33.87718472'::numeric) AND (longitude =
>> 151.27544336))
>>      SubPlan 1
>>        ->  Bitmap Heap Scan on m_elevations e
>> (cost=282802.21..37401439.43 rows=3512160 width=8)
>>              Recheck Cond: (geom ~
>> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
>> precision, (address_default_geocode.latitude)::double precision), 4326))
>>              Filter: _st_contains(geom,
>> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
>> precision, (address_default_geocode.latitude)::double precision), 4326))
>>              ->  Bitmap Index Scan on m_elevations_geom_idx
>> (cost=0.00..281924.17 rows=10536480 width=0)
>>                    Index Cond: (geom ~
>> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
>> precision, (address_default_geocode.latitude)::double precision), 4326))
>> (8 rows)
>>
>>   From 9.6
>>
>>       QUERY PLAN
>>
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>>    Seq Scan on address_default_geocode
>> (cost=10000000000.00..23297309357.08 rows=1 width=49)
>>      Filter: ((latitude = '-33.87718472'::numeric) AND (longitude =
>> 151.27544336))
>>      SubPlan 1
>>        ->  Seq Scan on m_elevations e
>> (cost=10000000000.00..13296950520.12 rows=3512159563 width=8)
>>              Filter: st_contains(geom,
>> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
>> precision, (address_default_geocode.latitude)::double precision), 4326))
>> (5 rows)
>>
>> Interestingly (change is hard coding of coordinates in second line):
>>
>> explain SELECT address_default_geocode_pid,
>>       (SELECT elevation FROM m_elevations e WHERE ST_Contains(e.geom,
>> ST_SetSRID(ST_MakePoint(151.27544336, -33.87718472), 4326))),
>>       ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
>>        FROM address_default_geocode
>>        WHERE latitude = -33.87718472 AND longitude = 151.27544336;
>>
>> Gives (in 9.6)
>>
>>                                                  QUERY PLAN
>>
>> -------------------------------------------------------------------------------------------------------- 
>>
>>    Seq Scan on address_default_geocode
>> (cost=10037428497.36..10037787334.33 rows=1 width=49)
>>      Filter: ((latitude = '-33.87718472'::numeric) AND (longitude =
>> 151.27544336))
>>      InitPlan 1 (returns $0)
>>        ->  Bitmap Heap Scan on m_elevations e
>> (cost=272194.20..37428497.36 rows=3512160 width=8)
>>              Recheck Cond: (geom ~
>> '0101000020E610000036E3976ED0E86240B879C29647F040C0'::geometry)
>>              Filter: _st_contains(geom,
>> '0101000020E610000036E3976ED0E86240B879C29647F040C0'::geometry)
>>              ->  Bitmap Index Scan on m_elevations_geom_idx
>> (cost=0.00..271316.16 rows=10536480 width=0)
>>                    Index Cond: (geom ~
>> '0101000020E610000036E3976ED0E86240B879C29647F040C0'::geometry)
>> (8 rows)
>>
>> Which looks better.
>>
>> So for some reason, 9.6 planner decides not to use the index for a small
>> number of records returned from address_default_geocode.
>> I have vacuum analysed both tables.
>> Clearly a sequential scan on 10 billion records is pretty slow (to say
>> the least).
>>
>> Has anyone seen anything like this/got any thoughts?
>>
>> I tried "set enable_seqscan=false" but didn't seem to have any effect.
>>
>> Regards
>>
>> Bill
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/postgis-users
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20161202/517b98d9/attachment.html>


More information about the postgis-users mailing list