[postgis-users] Much Faster Point in Polygon Count using KEYWORD LATERAL
Mark Wynter
mark at dimensionaledge.com
Sat Mar 7 17:58:32 PST 2015
Hi All
To provide some context, I’m in the midst of refactoring my recursive quadgrid function which I’ll share with everyone in the next few days.
I’ve got a table with many millions of points and I so I’ve been playing with different query patterns to find the fastest way possible of counting points within a set of quadcells.
The way I’ve done this traditionally is to put a GIST index on the points table, and then use a CROSS JOIN query of the form:
SELECT l.the_geom, count(p.pid)
FROM
(SELECT DE_MakeRegularQuadCells(wkb_geometry) as the_geom FROM abs_aus11_tiles_32k WHERE tid IN (17865)) l,
tutorials.abs_mb11_points p
WHERE ST_Intersects(l.the_geom, p.wkb_geometry) AND l.the_geom && p.wkb_geometry GROUP BY 1;
“0103000...";688154
“0103000...";473202
“0103000...";84516
About 4.4 seconds for sum 1.2 million points (not startling). Plus there’s a grid cell with zero points, hence this geometry doesn’t get returned by this query pattern. Bummer. When I look at the query plan, the GROUP BY clause seems to be acting as a significant handbrake on the query performance.
I’ve also looked at a LEFT JOIN instead of a CROSS JOIN to solve the problem of the non-returning quad cell, but I understand GIST indexes are ineffectual on LEFT JOINS.
A faster way that I’ve found is to incorporate the KEYWORD ‘LATERAL’ into the join. This allows me to reference the left table in the right table:
SELECT l.the_geom, r.pcount
FROM
(SELECT DE_MakeRegularQuadCells(wkb_geometry) as the_geom FROM abs_aus11_tiles_32k WHERE tid IN (17865)) l,
LATERAL
(SELECT count(*) as pcount, l.the_geom FROM tutorials.abs_mb11_points WHERE ST_Intersects(l.the_geom, wkb_geometry) AND l.the_geom && wkb_geometry) r;
"0103000…"; 473202
"0103000…"; 84516
"0103000…”; 0
"0103000…"; 688154
This cuts the query time down to 1.7 seconds, mostly be avoiding the need to use GROUP BY, and all quadcells are returned even those with zero points in polygons. In effect the query appears to be mimicking a LEFT JOIN.
I was expecting that the LATERAL CROSS JOIN would have produced the cartesian product of the two tables, and that I would have to enforce a WHERE condition ST_Equals(l.the_geom, r.the_geom);. Alas not.
Just wondering if anyone else has encountered this when using LATERAL, and could shed some light on why this is the case?
More information about the postgis-users
mailing list