[postgis-users] Query uses too much memory

Paul Ramsey pramsey at cleverelephant.ca
Wed Jan 21 15:29:06 PST 2009


Upgrade to 1.3.5, it looks like you are doing point-in-polygon
intersects, and there were some massive leaks there that I fixed in
later revisions.

P.

On Wed, Jan 21, 2009 at 3:15 PM, Reid Priedhorsky <reid at umn.edu> wrote:
> Dear all,
>
> I have a fairly simple (or so I thought) PostGIS query which eats my
> machine: it consumes more and more memory until the machine begins to thrash
> and becomes unresponsive. I have 1GB memory, and AFAICT Postgres is
> configured to use a fairly modest portion of that. The query consumes at
> least 1200GB. It is OK if the query takes a while, but I can't have it
> crashing my box. ;)
>
> Here is the query:
>
>> insert into wh_viewport_familiarity (username, viewport_id, score)
>> select
>>   username,
>>   wv.id as viewport_id,
>>   sum(score) as score
>> from
>>   wh_familiarity wf
>>   join wh_viewport wv on ST_Intersects(wf.geometry, wv.geometry)
>> group by username, viewport_id;
>
> Note that wh_familiarity is a view, defined as:
>
>>  SELECT pf.username, pf.score, p.geometry
>>   FROM wh_point_familiarity pf
>>   JOIN point p ON pf.point_id = p.id AND pf.point_version = p.version
>> UNION ALL  SELECT bf.username, bf.score, bs.geometry
>>   FROM wh_byway_familiarity bf
>>   JOIN byway_segment bs ON bf.byway_id = bs.id AND bf.byway_version =
>> bs.version;
>
> Here is the EXPLAIN:
>
>>                                                      QUERY PLAN
>>
>>  ----------------------------------------------------------------------------------------------------------------------
>>  Subquery Scan "*SELECT*"  (cost=969612.43..970964.38 rows=45065 width=44)
>>   ->  HashAggregate  (cost=969612.43..970175.74 rows=45065 width=40)
>>         ->  Nested Loop  (cost=81.60..969274.44 rows=45065 width=40)
>>               Join Filter: _st_intersects(wf.geometry, wv.geometry)
>>               ->  Append  (cost=81.60..397329.54 rows=1997133 width=68)
>>                     ->  Hash Join  (cost=81.60..5507.26 rows=71222
>> width=40)
>>                           Hash Cond: ((pf.point_id = p.id) AND
>> (pf.point_version = p.version))
>>                           ->  Seq Scan on wh_point_familiarity pf
>>  (cost=0.00..1197.24 rows=70324 width=23)
>>                           ->  Hash  (cost=53.04..53.04 rows=1904 width=33)
>>                                 ->  Seq Scan on point p  (cost=0.00..53.04
>> rows=1904 width=33)
>>                     ->  Merge Join  (cost=335494.26..371850.95
>> rows=1925911 width=112)
>>                           Merge Cond: ((bs.version = bf.byway_version) AND
>> (bs.id = bf.byway_id))
>>                           ->  Sort  (cost=31806.55..32247.51 rows=176382
>> width=105)
>>                                 Sort Key: bs.version, bs.id
>>                                 ->  Seq Scan on byway_segment bs
>>  (cost=0.00..6184.82 rows=176382 width=105)
>>                           ->  Sort  (cost=303687.71..308950.44
>> rows=2105094 width=23)
>>                                 Sort Key: bf.byway_version, bf.byway_id
>>                                 ->  Seq Scan on wh_byway_familiarity bf
>>  (cost=0.00..35824.94 rows=2105094 width=23)
>>               ->  Index Scan using wh_viewport_gist on wh_viewport wv
>>  (cost=0.00..0.27 rows=1 width=117)
>>                     Index Cond: (wf.geometry && wv.geometry)
>>                     Filter: (wf.geometry && wv.geometry)
>
> Adding "where wv.id % 10 = 0" barely affects the plan/predicted cost, and
> the query finishes in ~15 min with 468M memory consumed at peak.
>
> Full schema of these tables at http://www.cs.umn.edu/~reid/schema.txt
>
> Row counts:
>
> wh_viewport              13,539
> wh_point_familiarity     70,324
> point                     1,904
> wh_byway_familiarity  2,105,094
> byway_segment           176,382
>
> We are on Postgres 8.2.7 on Ubuntu Hardy. postgis_full_version() is:
>
>>  POSTGIS="1.3.3" GEOS="3.0.0-CAPI-1.4.1" PROJ="Rel. 4.6.0, 21 Dec 2007"
>> USE_STATS
>
> postgresql.conf is at http://www.cs.umn.edu/~reid/postgresql.conf
>
> Any ideas? Is this something upgrading to GEOS 3.0.3 might help with? (And
> where's the release notes for GEOS? I downloaded the 3.0.3 tarball and
> ChangeLog stops with 3.0.0.)
>
> Thanks so much for your help,
>
> Reid
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>



More information about the postgis-users mailing list