<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7652.24">
<TITLE>RE: [postgis-devel] ST_Contains(POLYGON, POINT) memory leak</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2>Paul,<BR>
<BR>
Since you are sort of in that neck of the woods anyway, do you feel brave enough to<BR>
try to get the commented-out MULTIPOLYGON short-cut case working in the contains() function in lwgeom_geos_c.c?<BR>
<BR>
I think I tried uncommenting the code and got a compile error when<BR>
building, but admittedly I am compile-challenged, so it could just be something<BR>
stupid I am doing.<BR>
<BR>
I don't think the short-cut exists in the other functions, but if you can get it to work in one, I don't see why you wouldn't be able to in the others.<BR>
<BR>
Thanks,<BR>
Regina<BR>
<BR>
-----Original Message-----<BR>
From: postgis-devel-bounces@postgis.refractions.net on behalf of Paul Ramsey<BR>
Sent: Thu 9/25/2008 1:43 PM<BR>
To: Kevin Neufeld; Chris Hodgson<BR>
Cc: PostGIS Development Discussion<BR>
Subject: [postgis-devel] ST_Contains(POLYGON, POINT) memory leak<BR>
<BR>
Kevin,<BR>
<BR>
I've committed two changes to SVN now that make this case leak a *lot*<BR>
less. You'll see no particular performance gain on your test workload,<BR>
since your workload is pathological with respect to caching indexed<BR>
objects (you have only one test point per polygon, so the time spent<BR>
indexing is wasted), but the memory use is much better.<BR>
<BR>
It's still not optimal however... there's memory coming out, so I<BR>
might have to try the ugly hack to make the memory allocation visible<BR>
to valgrind.<BR>
<BR>
FYI, in terms of the speed of the shortcut code, on a non-pathological<BR>
case (about 100 points per test polygon), the shortcut version returns<BR>
in 3s, the standard code in 100s.<BR>
<BR>
Paul<BR>
<BR>
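As a rough illustration of the caching trade-off described above (a sketch, not the PostGIS implementation): the index is only worth building when it is reused for many points against the same polygon. The names EdgeIndex and IndexCache, the single-slot cache, and the sorted edge list standing in for the 1-d r-tree are all assumptions made for this example.<BR>
<PRE>
```python
from bisect import bisect_right


class EdgeIndex:
    """Hypothetical per-polygon index: ring edges sorted by their lower y."""

    def __init__(self, ring):
        # ring: list of (x, y) vertices; the first vertex is not repeated.
        edges = []
        n = len(ring)
        for i in range(n):
            (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
            if y1 != y2:  # horizontal edges never cross a horizontal ray
                if y1 > y2:
                    x1, y1, x2, y2 = x2, y2, x1, y1
                edges.append((y1, y2, x1, x2))
        edges.sort()
        self.edges = edges
        self.min_ys = [e[0] for e in edges]

    def contains(self, x, y):
        # Ray casting: count edges crossed by a ray going right from (x, y).
        # Only edges whose lower end is at or below y are candidates.
        hi = bisect_right(self.min_ys, y)
        crossings = 0
        for y1, y2, x1, x2 in self.edges[:hi]:
            if y2 > y:  # half-open span [y1, y2) avoids counting a vertex twice
                x_at_y = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x_at_y > x:
                    crossings += 1
        return crossings % 2 == 1


class IndexCache:
    """Hypothetical single-slot cache: rebuilds whenever the polygon changes."""

    def __init__(self):
        self.key = None
        self.index = None
        self.builds = 0

    def get(self, key, ring):
        if key != self.key:
            self.index = EdgeIndex(ring)
            self.key = key
            self.builds += 1
        return self.index


square = [(0, 0), (4, 0), (4, 4), (0, 4)]

# Many points against one polygon: the index is built once and reused.
reused = IndexCache()
for i in range(100):
    reused.get("poly-1", square).contains(2, 2)

# One point per polygon: every lookup misses, so every build is wasted work.
wasted = IndexCache()
for i in range(100):
    wasted.get("poly-%d" % i, square).contains(2, 2)
```
</PRE>
Here reused.builds ends at 1 and wasted.builds at 100. The second loop has the same shape as the one-point-per-polygon workload, where the time spent indexing is never recovered.<BR>
<BR>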
On Tue, Sep 23, 2008 at 10:21 AM, Kevin Neufeld<BR>
&lt;kneufeld@refractions.net&gt; wrote:<BR>
> I'm not sure. Is the PIP common to those functions? Unfortunately, I don't<BR>
> have time to work on this now, but Martin said I could send you the polygons<BR>
> I'm using (they're the really large wetlands in the peace region for CWB)<BR>
> so you can see if you can reproduce the problem ... it's possible it's just<BR>
> a bogus install I have here.<BR>
><BR>
> <A HREF="ftp://ftp.refractions.net/pub/refractions/ftp_postgis/test_pip.Fc.dmp">ftp://ftp.refractions.net/pub/refractions/ftp_postgis/test_pip.Fc.dmp</A><BR>
> use pg_restore to extract<BR>
> /opt/pgsql83/bin/pg_restore -Fc -O test_pip.Fc.dmp | psql postgis<BR>
><BR>
> I put both of my test tables in a tmp schema for this dmp file.<BR>
><BR>
> Cheers,<BR>
> -- Kevin<BR>
><BR>
> Paul Ramsey wrote:<BR>
>><BR>
>> Is it limited to P-i-P?<BR>
>><BR>
>> P<BR>
>><BR>
>> On Tue, Sep 23, 2008 at 8:56 AM, Kevin Neufeld &lt;kneufeld@refractions.net&gt;<BR>
>> wrote:<BR>
>>><BR>
>>> By the way, this is not limited to ST_Contains.<BR>
>>><BR>
>>> ST_Within : 1.7GB - 16sec<BR>
>>> ST_Intersects : 1.7GB - 16sec<BR>
>>> ST_CoveredBy : 1.7GB - 16sec<BR>
>>> ST_Distance = 0 : 26MB - 5sec<BR>
>>> ST_Relate : 32MB - 1:30sec<BR>
>>><BR>
>>> -- Kevin<BR>
>>><BR>
>>> Kevin Neufeld wrote:<BR>
>>>><BR>
>>>> I'm still able to reproduce the memory leak. It looks like you're right<BR>
>>>> though, it's nothing to do with GEOS, but rather the PIP in PostGIS.<BR>
>>>> I'm not sure what is going on yet, (I have yet to find a specific test<BR>
>>>> case) but here are my findings.<BR>
>>>><BR>
>>>> I made two little sh scripts to track memory usage (psql82.sh and<BR>
>>>> psql83.sh), each producing a log file. I happen to run the same GEOS<BR>
>>>> version in both instances, but running pgis 1.1.6 on 8.2 and 1.3.3 on<BR>
>>>> 8.3<BR>
>>>><BR>
>>>> I have a sample table of ~120000 polygons and, for testing, generated a<BR>
>>>> second table of points using pointonsurface. Running the same query on<BR>
>>>> both instances (using contains() on a join of the two tables, with 150<BR>
>>>> points):<BR>
>>>> - pgis 1.1.6 finishes in 1:30secs using 30MB of memory.<BR>
>>>> - pgis 1.3.3 finishes in 12sec using 1.7GB of memory.<BR>
>>>><BR>
>>>> If I query an extra 50 points, the 1.3.3 instance runs out of memory<BR>
>>>> with<BR>
>>>> the attached crash log.<BR>
>>>><BR>
>>>> Next, I'll try pgis SVN head on the 8.3 instance instead of 1.3.3 (but<BR>
>>>> not tonite :) ).<BR>
>>>><BR>
>>>> Thoughts?<BR>
>>>> -- Kevin<BR>
>>>><BR>
>>>><BR>
>>>><BR>
>>>> Paul Ramsey wrote:<BR>
>>>>><BR>
>>>>> OK, I ran Mark's P-i-P short circuit through valgrind and it doesn't<BR>
>>>>> leak anything at all. It also is sort of hard to get to run, since<BR>
>>>>> it's only strict POLYGON and POINT, MULTIPOLYGON need not apply.<BR>
>>>>> P-i-mP didn't leak either, so I'm not sure what to make of Kevin's<BR>
>>>>> result at this point.<BR>
>>>>><BR>
>>>>> Only difference right now is I'm running postgis-trunk and geos-trunk<BR>
>>>>> for my testing.<BR>
>>>>><BR>
>>>>> Paul<BR>
>>>>><BR>
>>>>> On Mon, Sep 22, 2008 at 12:53 PM, Paul Ramsey<BR>
>>>>> &lt;pramsey@cleverelephant.ca&gt;<BR>
>>>>> wrote:<BR>
>>>>><BR>
>>>>>> Yes, it suggests that the problem might not be in GEOS after all; it<BR>
>>>>>> might be in how the memory-context handling is happening.<BR>
>>>>>><BR>
>>>>>> In PostGIS 1.3.3, the Contains and Intersects tests have a P-i-P<BR>
>>>>>> short-circuit that uses a cached 1-d r-tree to do a version of<BR>
>>>>>> prepared geometry testing (Mark Leslie implemented this). Ben then<BR>
>>>>>> copied that code for his memory caching work that used the GEOS<BR>
>>>>>> prepared geometry.<BR>
>>>>>><BR>
>>>>>> Try going into the lwgeom_geos_c.c file and commenting out the<BR>
>>>>>> mem-caching code path for one of your tests and see if the memory<BR>
>>>>>> problem goes away,<BR>
>>>>>><BR>
>>>>>> *or*<BR>
>>>>>><BR>
>>>>>> Re-run your test, replacing the points with short two-point lines.<BR>
>>>>>><BR>
>>>>>> Since Mark's code was in turn copied from the proj4 projection object<BR>
>>>>>> caching routine (lwgeom_transform.c) that code might *also* leak,<BR>
>>>>>> although since the number of cache transitions is low, it might not be<BR>
>>>>>> so noticeable.<BR>
>>>>>><BR>
>>>>>> Paul<BR>
>>>>>><BR>
_______________________________________________<BR>
postgis-devel mailing list<BR>
postgis-devel@postgis.refractions.net<BR>
<A HREF="http://postgis.refractions.net/mailman/listinfo/postgis-devel">http://postgis.refractions.net/mailman/listinfo/postgis-devel</A><BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>