[postgis-users] Substantial different index use between 9.5 and 9.6 using Postgis
Bill Measday
bill at measday.com
Thu Dec 1 22:29:47 PST 2016
Thanks Regina,
First, your first suggestion didn't seem to achieve anything - even the
costs in the explain remained the same (1e10).
Second, you lost me a little with your second suggestion. As I gather
_st_contains explicitly avoids the index.
Tried
explain SELECT address_default_geocode_pid,
(SELECT elevation FROM m_elevations e WHERE e.geom &&
ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)),
ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
FROM address_default_geocode
WHERE latitude = -33.87718472 AND longitude = 151.27544336;
(ie changed WHERE ST_Contains(e.geom, ST_SetSRID(ST_MakePoint(longitude,
latitude), 4326)) to WHERE e.geom && ST_SetSRID(ST_MakePoint(longitude,
latitude), 4326))
and that seems to have done the trick. Tee explain now uses the index
and the actual query seems more or less on par with the version run
under 9.5.
Odd though, but thank you very much for the help/solution. Now I'll try
the full insert query.
Regards
Bill
On 2/12/2016 3:48 PM, Regina Obe wrote:
> I have seen this on rare occasions in past but not recently where the planner doesn't break apart the ST_Contains or ST_Intersects into the component steps.
> We made some changes to costing in 2.3.0, but ST_Contains (and _ST_Contains) wasn't one of them so that doesn't quite explain this.
> Couple of things to try
>
> 1) On the _ST_Contains function in set the cost to something higher like 1000:
>
> CREATE OR REPLACE FUNCTION _st_contains(geom1 geometry, geom2 geometry) RETURNS boolean AS
> '$libdir/postgis-2.3', 'contains'
> LANGUAGE c IMMUTABLE STRICT
> COST 1000;
>
> And then try your query again/
>
>
> 2) If Item 1 doesn't work what happens if you explicitly put in && and _ST_Contains
>
> That should force the index to kick in.
>
>
> -----Original Message-----
> From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Bill Measday
> Sent: Thursday, December 01, 2016 7:47 PM
> To: postgis-users at lists.osgeo.org
> Subject: [postgis-users] Substantial different index use between 9.5 and 9.6 using Postgis
>
> This is a post I put on the potgres performance list. They suggested I take it up on the PostGis list
>
> Postgres versions 9.5.4 and 9.6.0 running on Windows Server 2012.
> Installed using EnterpriseDB. Both instances are on the same server, postgresql.conf for both are the same except max_locks_per_transaction =
> 200 in 9.6 (caused insertion errors otherwise).
>
> On 9.5:
> geo=# SELECT PostGIS_full_version();
> postgis_full_version
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
> POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4090" PROJ="Rel. 4.9.1,
> 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.7.8"
> LIBJSON="0.12" RASTER
> (1 row)
>
> On 9.6:
>
> postgis_full_version
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
> POSTGIS="2.3.0 r15146" GEOS="3.5.0-CAPI-1.9.0 r4090" PROJ="Rel. 4.9.1,
> 04 March 2015" GDAL="GDAL 2.1.1, released 2016/07/07" LIBXML="2.7.8"
> LIBJSON="0.12" RASTER
> (1 row)
>
> Databases on the 2 instances are the same (as far as I can tell).
>
> I have 2 relevant tables (created using same script in both instances, except I added an auto increment RID to the 9.6 version as primary key to the larger of the 2 tables - other already had one). One contains a geometry column (geom geometry(1107464) - a polygon) with gist index.
> This table has around 10 billion records. The disks these databases on aren't particularly fast, and indexing took about a week.
> Second table has latitude (numeric(10, 8)), and longitude (numeric(11,
> 8)) and about 10 million records.
>
> The query I'm running is (a part of an insertion into a new table I was trying to run)
> SELECT address_default_geocode_pid,
> (SELECT elevation FROM m_elevations e WHERE ST_Contains(e.geom, ST_SetSRID(ST_MakePoint(longitude, latitude), 4326))),
> ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
> FROM address_default_geocode;
>
> Under 9.5 the insertion takes about 11 hours. I gave up on 9.6.
>
> I thought I'd try just one record, so:
>
> SELECT address_default_geocode_pid,
> (SELECT elevation FROM m_elevations e WHERE ST_Contains(e.geom, ST_SetSRID(ST_MakePoint(longitude, latitude), 4326))),
> ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
> FROM address_default_geocode
> WHERE latitude = -33.87718472 AND longitude = 151.27544336;
>
> This returns 3 rows (which is more than the average I'd expect BTW). On
> 9.5 takes a few seconds (3-5) and again I gave up on 9.6
>
> Looking just at the query shown above, I note a difference in explained behaviour. Here is the output from 9.5:
>
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on address_default_geocode (cost=0.00..37760293.94 rows=1
> width=25)
> Filter: ((latitude = '-33.87718472'::numeric) AND (longitude =
> 151.27544336))
> SubPlan 1
> -> Bitmap Heap Scan on m_elevations e
> (cost=282802.21..37401439.43 rows=3512160 width=8)
> Recheck Cond: (geom ~
> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
> precision, (address_default_geocode.latitude)::double precision), 4326))
> Filter: _st_contains(geom,
> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
> precision, (address_default_geocode.latitude)::double precision), 4326))
> -> Bitmap Index Scan on m_elevations_geom_idx
> (cost=0.00..281924.17 rows=10536480 width=0)
> Index Cond: (geom ~
> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
> precision, (address_default_geocode.latitude)::double precision), 4326))
> (8 rows)
>
> From 9.6
>
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on address_default_geocode
> (cost=10000000000.00..23297309357.08 rows=1 width=49)
> Filter: ((latitude = '-33.87718472'::numeric) AND (longitude =
> 151.27544336))
> SubPlan 1
> -> Seq Scan on m_elevations e
> (cost=10000000000.00..13296950520.12 rows=3512159563 width=8)
> Filter: st_contains(geom,
> st_setsrid(st_makepoint((address_default_geocode.longitude)::double
> precision, (address_default_geocode.latitude)::double precision), 4326))
> (5 rows)
>
> Interestingly (change is hard coding of coordinates in second line):
>
> explain SELECT address_default_geocode_pid,
> (SELECT elevation FROM m_elevations e WHERE ST_Contains(e.geom,
> ST_SetSRID(ST_MakePoint(151.27544336, -33.87718472), 4326))),
> ST_SetSRID(ST_MakePoint(latitude, longitude), 4283)
> FROM address_default_geocode
> WHERE latitude = -33.87718472 AND longitude = 151.27544336;
>
> Gives (in 9.6)
>
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------
> Seq Scan on address_default_geocode
> (cost=10037428497.36..10037787334.33 rows=1 width=49)
> Filter: ((latitude = '-33.87718472'::numeric) AND (longitude =
> 151.27544336))
> InitPlan 1 (returns $0)
> -> Bitmap Heap Scan on m_elevations e
> (cost=272194.20..37428497.36 rows=3512160 width=8)
> Recheck Cond: (geom ~
> '0101000020E610000036E3976ED0E86240B879C29647F040C0'::geometry)
> Filter: _st_contains(geom,
> '0101000020E610000036E3976ED0E86240B879C29647F040C0'::geometry)
> -> Bitmap Index Scan on m_elevations_geom_idx
> (cost=0.00..271316.16 rows=10536480 width=0)
> Index Cond: (geom ~
> '0101000020E610000036E3976ED0E86240B879C29647F040C0'::geometry)
> (8 rows)
>
> Which looks better.
>
> So for some reason, 9.6 planner decides not to use the index for a small
> number of records returned from address_default_geocode.
> I have vacuum analysed both tables.
> Clearly a sequential scan on 10 billion records is pretty slow (to say
> the least).
>
> Has anyone seen anything like this/got any thoughts?
>
> I tried "set enable_seqscan=false" but didn't seem to have any effect.
>
> Regards
>
> Bill
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list