[postgis-users] improve query - summarize building count and area on each parcel
Regina Obe
lr at pcorp.us
Mon Mar 27 15:15:11 PDT 2023
Karsten,
So you had the centroids indexed before you switched to regular join and
that took days too?
I would expect the index on the centroid to make much more of a difference
than the LEFT / INNER JOIN change.
I think the LEFT / INNER JOIN performance would have a lot to do with number
of rows in each table and whether it resorts to a nested vs. hash.
There have been performance issues in PostgreSQL that cause LEFT to be much
slower than INNER, but I think most of those have been long resolved.
It might be some left over stuff with our poor spatial stats estimates.
What does
SELECT version(), postgis_full_version();
return
From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf
Of karsten
Sent: Monday, March 27, 2023 5:16 PM
To: 'PostGIS Users Discussion' <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] improve query - summarize building count and
area on each parcel
That worked perfectly:
Using just join and setting the index on the centroids geom as well
I had the query finish in less than a minute (compared to three day running
and not yet finished !)
Does anyone have an explanation why 'regular' join is so much faster than
'left join' ?
Cheers
Karsten
_____
From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf
Of Marcin Mionskowski
Sent: Sunday, March 26, 2023 23:34
To: PostGIS Users Discussion
Subject: Re: [postgis-users] improve query - summarize building count and
area on each parcel
In my "explain tests", made on much smaller tables, I get 500 times smaller
costs only by changing LEFT JOIN to "regular" JOIN.
pon., 27 mar 2023 o 01:07 Raj Talati <rajtalati at gmail.com
<mailto:rajtalati at gmail.com> > napisał(a):
I prefer doing insert instead of update if we doing all rows update :) so
create New table with new table and copy from old and later drop it you can
go without index and do index later after copy. Postgres update is delete
and insert internally and direct big update will bloat table.
Other thing you can do multi thread if you have 16 core with nothing else
can go 12-14 threads updates which can do portion of your big table easy if
no good criteria can be mod function on PK
Hope this helps .
Raj T
On Sun, Mar 26, 2023 at 6:54 PM karsten <karsten at terragis.net
<mailto:karsten at terragis.net> > wrote:
Hi All,
I have two tables, a parcel (polygon) table and a building (polygon) table
in a PostGIS db.
There are about 1 million parcels and over a million buildings in these
tables ...
I have set gist indexes on both and clustered each on a geohash, pkey is set
on objectid.
I would like to fill 2 additional columns for the parcels layer to have the
number of building on each parcel (building_count) and the summed up area of
all those building footprints (building_fp).
Below is an update query but it is extremely lengthy and inefficient (on a
beefy 16 core Ubuntu server with 128 GB tons of RAM) : I had to stop an
after running 3 days when it was not finished ...
I am wondering if anyone has an idea how to make it more efficient ?
This is the query:
UPDATE parcels p SET
building_count = subquery.num,
building_fp = subquery.sumarea
from
(select objectid, count(*) as num, sum(st_area(b.geom)) as sumarea FROM
parcels p
left join buildings b on st_intersects(p.geom,st_centroid(b.geom)) group by
objectid ) subquery
where p.objectid = subquery.objectid ;
Any suggestions ?
Cheers
Karsten Vennemann
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20230327/2d399ea6/attachment.htm>
More information about the postgis-users
mailing list