[postgis-users] Bad plan without && operator

Paul Ramsey pramsey at cleverelephant.ca
Thu Dec 7 06:48:05 PST 2017


Are you sure that's what SET STATISTICS means? I thought it was the
size of the sample drawn to build the statistics. The histogram size
is somewhat hardcoded,
https://github.com/postgis/postgis/blob/svn-trunk/postgis/gserialized_estimate.c#L1446-L1453

P.

On Thu, Dec 7, 2017 at 12:37 AM, Darafei "Komяpa" Praliaskouski
<me at komzpa.net> wrote:
> Hi Nicolas,
>
> I've seen that behavior for tables that have same, rather sparse but
> worldwide coverage for points and polygons.
> In my case it was due to statistic histogram having not enough resolution to
> properly estimate selectivity.
> When you add more && to request, you basically replace selectivity by
> selectivity^2 in planner.
>
> Another way to change this may be
>
> ALTER TABLE tmp.tp1 SET STATISTICS 10000;
> ALTER TABLE ref.batimet SET STATISTICS 10000;
> ANALYZE tmp.tp1;
> ANALYZE ref.batimet;
>
> By default table is estimated by histogram of 100 cells, basically grid
> 10x10. 10000 changes it to 100x100, allowing
>
> чт, 7 дек. 2017 г. в 11:24, Nicolas Ribot <nicolas.ribot at gmail.com>:
>>
>> Same with a logged table.
>>
>> It seems the && condition have to be duplicated for the right plan to be
>> chosen:
>>
>> EXPLAIN SELECT p.id, b.id
>> FROM tmp.tp1 p join ref.batiment b
>>     on b.geom && p.geom and p.geom && b.geom and _st_intersects(b.geom,
>> p.geom);
>>
>>
>> On 6 December 2017 at 22:36, Paul Ramsey <pramsey at cleverelephant.ca>
>> wrote:
>>>
>>> If you change the small table into a normal table (not unlogged) does
>>> the problem go away? Could be an interesting side effect of
>>> "unlogged"?
>>>
>>> On Wed, Dec 6, 2017 at 1:14 PM, Nicolas Ribot <nicolas.ribot at gmail.com>
>>> wrote:
>>> > Yes, tables are analyzed.
>>> >
>>> > I tried reverting argument, same plan generated:
>>> >
>>> > EXPLAIN SELECT p.id, b.id
>>> > FROM tmp.tp1 p join ref.batiment b on st_intersects(p.geom, b.geom);
>>> >
>>> > QUERY PLAN
>>> > Nested Loop  (cost=0.15..21819347.50 rows=50019 width=35)
>>> >   ->  Seq Scan on batiment b  (cost=0.00..2238826.04 rows=45794904
>>> > width=193)
>>> >   ->  Index Scan using tp1_geom_gist on tp1 p  (cost=0.15..0.42 rows=1
>>> > width=276)
>>> >         Index Cond: (geom && b.geom)
>>> >         Filter: _st_intersects(geom, b.geom)
>>> >
>>> > Nico
>>> >
>>> > On 6 December 2017 at 19:14, Paul Ramsey <pramsey at cleverelephant.ca>
>>> > wrote:
>>> >>
>>> >> Just invert the arguments in your call to ST_Intersects().
>>> >> I don't think we have any way to tell the planner anything about which
>>> >> side of the condition should drive the nested loop join. And yet it
>>> >> seems to get the "right" one most of the time. I assume you've already
>>> >> run 'ANALYZE' and the stats are all up-to-date.
>>> >> P.
>>> >>
>>> >> On Wed, Dec 6, 2017 at 9:57 AM, Nicolas Ribot
>>> >> <nicolas.ribot at gmail.com>
>>> >> wrote:
>>> >> > Hi,
>>> >> >
>>> >> > I have 2 tables, indexed and analyzed.
>>> >> > One big (batiment) with 45M records, one small (tmp.tp1), with 83k
>>> >> > records
>>> >> > (UNLOGGED table)
>>> >> >
>>> >> > When performing a spatial join, the planner chooses the smallest
>>> >> > table
>>> >> > for
>>> >> > seq scan only if I add a && condition with st_intersects:
>>> >> >
>>> >> > EXPLAIN SELECT p.id, b.id
>>> >> > FROM tmp.tp1 p join ref.batiment b on st_intersects(b.geom, p.geom);
>>> >> >
>>> >> > QUERY PLAN
>>> >> > Nested Loop  (cost=0.15..21819347.50 rows=50019 width=35)
>>> >> >   ->  Seq Scan on batiment b  (cost=0.00..2238826.04 rows=45794904
>>> >> > width=193)
>>> >> >   ->  Index Scan using tp1_geom_gist on tp1 p  (cost=0.15..0.42
>>> >> > rows=1
>>> >> > width=276)
>>> >> >         Index Cond: (b.geom && geom)
>>> >> >         Filter: _st_intersects(b.geom, geom)
>>> >> >
>>> >> > vs
>>> >> >
>>> >> > EXPLAIN SELECT p.id, b.id
>>> >> > FROM tmp.tp1 p join ref.batiment b on p.geom && b.geom and
>>> >> > st_intersects(b.geom, p.geom);
>>> >> >
>>> >> > QUERY PLAN
>>> >> > Nested Loop  (cost=0.42..118443.25 rows=1 width=35)
>>> >> >   ->  Seq Scan on tp1 p  (cost=0.00..842.75 rows=13675 width=276)
>>> >> >   ->  Index Scan using batiment_geom_gist on batiment b
>>> >> > (cost=0.42..8.59
>>> >> > rows=1 width=193)
>>> >> >         Index Cond: ((p.geom && geom) AND (geom && p.geom))
>>> >> >         Filter: _st_intersects(geom, p.geom)
>>> >> >
>>> >> > PG:
>>> >> > PostgreSQL 9.6.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
>>> >> > 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit
>>> >> >
>>> >> > Postgis:
>>> >> > POSTGIS="2.3.3 r15473" GEOS="3.5.1-CAPI-1.9.1 r4246" PROJ="Rel.
>>> >> > 4.9.2,
>>> >> > 08
>>> >> > September 2015" GDAL="GDAL 1.11.3, released 2015/09/16"
>>> >> > LIBXML="2.9.3"
>>> >> > LIBJSON="0.11.99" TOPOLOGY (topology procs from "2.3.2 r15302" need
>>> >> > upgrade)
>>> >> > RASTER
>>> >> >
>>> >> > Thanks for any hint.
>>> >> >
>>> >> > Nicolas
>>> >> >
>>> >> > _______________________________________________
>>> >> > postgis-users mailing list
>>> >> > postgis-users at lists.osgeo.org
>>> >> > https://lists.osgeo.org/mailman/listinfo/postgis-users
>>> >> _______________________________________________
>>> >> postgis-users mailing list
>>> >> postgis-users at lists.osgeo.org
>>> >> https://lists.osgeo.org/mailman/listinfo/postgis-users
>>> >
>>> >
>>> >
>>> > _______________________________________________
>>> > postgis-users mailing list
>>> > postgis-users at lists.osgeo.org
>>> > https://lists.osgeo.org/mailman/listinfo/postgis-users
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users


More information about the postgis-users mailing list