[postgis-users] Query uses too much memory
Paul Ramsey
pramsey at cleverelephant.ca
Wed Jan 21 15:29:06 PST 2009
Upgrade to 1.3.5, it looks like you are doing point-in-polygon
intersects, and there were some massive leaks there that I fixed in
later revisions.
P.
On Wed, Jan 21, 2009 at 3:15 PM, Reid Priedhorsky <reid at umn.edu> wrote:
> Dear all,
>
> I have a fairly simple (or so I thought) PostGIS query which eats my
> machine: it consumes more and more memory until the machine begins to thrash
> and becomes unresponsive. I have 1GB memory, and AFAICT Postgres is
> configured to use a fairly modest portion of that. The query consumes at
> least 1200GB. It is OK if the query takes a while, but I can't have it
> crashing my box. ;)
>
> Here is the query:
>
>> insert into wh_viewport_familiarity (username, viewport_id, score)
>> select
>> username,
>> wv.id as viewport_id,
>> sum(score) as score
>> from
>> wh_familiarity wf
>> join wh_viewport wv on ST_Intersects(wf.geometry, wv.geometry)
>> group by username, viewport_id;
>
> Note that wh_familiarity is a view, defined as:
>
>> SELECT pf.username, pf.score, p.geometry
>> FROM wh_point_familiarity pf
>> JOIN point p ON pf.point_id = p.id AND pf.point_version = p.version
>> UNION ALL SELECT bf.username, bf.score, bs.geometry
>> FROM wh_byway_familiarity bf
>> JOIN byway_segment bs ON bf.byway_id = bs.id AND bf.byway_version =
>> bs.version;
>
> Here is the EXPLAIN:
>
>> QUERY PLAN
>>
>> ----------------------------------------------------------------------------------------------------------------------
>> Subquery Scan "*SELECT*" (cost=969612.43..970964.38 rows=45065 width=44)
>> -> HashAggregate (cost=969612.43..970175.74 rows=45065 width=40)
>> -> Nested Loop (cost=81.60..969274.44 rows=45065 width=40)
>> Join Filter: _st_intersects(wf.geometry, wv.geometry)
>> -> Append (cost=81.60..397329.54 rows=1997133 width=68)
>> -> Hash Join (cost=81.60..5507.26 rows=71222
>> width=40)
>> Hash Cond: ((pf.point_id = p.id) AND
>> (pf.point_version = p.version))
>> -> Seq Scan on wh_point_familiarity pf
>> (cost=0.00..1197.24 rows=70324 width=23)
>> -> Hash (cost=53.04..53.04 rows=1904 width=33)
>> -> Seq Scan on point p (cost=0.00..53.04
>> rows=1904 width=33)
>> -> Merge Join (cost=335494.26..371850.95
>> rows=1925911 width=112)
>> Merge Cond: ((bs.version = bf.byway_version) AND
>> (bs.id = bf.byway_id))
>> -> Sort (cost=31806.55..32247.51 rows=176382
>> width=105)
>> Sort Key: bs.version, bs.id
>> -> Seq Scan on byway_segment bs
>> (cost=0.00..6184.82 rows=176382 width=105)
>> -> Sort (cost=303687.71..308950.44
>> rows=2105094 width=23)
>> Sort Key: bf.byway_version, bf.byway_id
>> -> Seq Scan on wh_byway_familiarity bf
>> (cost=0.00..35824.94 rows=2105094 width=23)
>> -> Index Scan using wh_viewport_gist on wh_viewport wv
>> (cost=0.00..0.27 rows=1 width=117)
>> Index Cond: (wf.geometry && wv.geometry)
>> Filter: (wf.geometry && wv.geometry)
>
> Adding "where wv.id % 10 = 0" barely affects the plan/predicted cost, and
> the query finishes in ~15 min with 468M memory consumed at peak.
>
> Full schema of these tables at http://www.cs.umn.edu/~reid/schema.txt
>
> Row counts:
>
> wh_viewport 13,539
> wh_point_familiarity 70,324
> point 1,904
> wh_byway_familiarity 2,105,094
> byway_segment 176,382
>
> We are on Postgres 8.2.7 on Ubuntu Hardy. postgis_full_version() is:
>
>> POSTGIS="1.3.3" GEOS="3.0.0-CAPI-1.4.1" PROJ="Rel. 4.6.0, 21 Dec 2007"
>> USE_STATS
>
> postgresql.conf is at http://www.cs.umn.edu/~reid/postgresql.conf
>
> Any ideas? Is this something upgrading to GEOS 3.0.3 might help with? (And
> where's the release notes for GEOS? I downloaded the 3.0.3 tarball and
> ChangeLog stops with 3.0.0.)
>
> Thanks so much for your help,
>
> Reid
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
More information about the postgis-users
mailing list