[postgis-devel] ST_Contains(POLYGON, POINT) memory leak
Paul Ramsey
pramsey at cleverelephant.ca
Thu Sep 25 10:43:43 PDT 2008
Kevin,
I've committed two changes to SVN now that make this case leak a *lot*
less. You'll see no particular performance gain on your test workload,
since your workload is pathological with respect to caching indexed
objects (you have only one test point per polygon, so the time spent
indexing is wasted), but the memory use is much better.
It's still not optimal, however: some memory is still leaking, so I
might have to try the ugly hack of making the memory allocation visible
to valgrind.
FYI, in terms of the speed of the shortcut code, on a non-pathological
case (about 100 points per test polygon), the shortcut version returns
in 3s, the standard code in 100s.
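
To illustrate why the one-point-per-polygon workload gets nothing from the cache while the 100-points-per-polygon one does, here is a minimal sketch in Python. It assumes a one-slot cache keyed by polygon; `IndexCache`, `build_index`, `count_builds` and the toy workloads are hypothetical names made up for this example and are not the PostGIS code.

```python
# Sketch only: a one-slot cache of a per-polygon index.
# All names here are hypothetical; this is not the PostGIS implementation.

def build_index(ring):
    """Stand-in for the expensive step: order the ring's edges by min y."""
    edges = list(zip(ring, ring[1:]))
    return sorted(edges, key=lambda e: min(e[0][1], e[1][1]))


class IndexCache:
    """Remembers the index of the most recently seen polygon only."""

    def __init__(self):
        self.key = None
        self.index = None
        self.builds = 0  # how many times the index had to be rebuilt

    def get(self, poly_id, ring):
        if poly_id != self.key:
            # Cache miss: pay the indexing cost again.
            self.index = build_index(ring)
            self.key = poly_id
            self.builds += 1
        return self.index


def count_builds(workload, rings):
    """Run (poly_id, point) tests through the cache and count index builds."""
    cache = IndexCache()
    for poly_id, _point in workload:
        cache.get(poly_id, rings[poly_id])
    return cache.builds


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
rings = {i: square for i in range(50)}

# One test point per polygon: every test misses the cache.
pathological = [(i, (1, 1)) for i in range(50)]
# 100 test points per polygon, grouped by polygon: one miss per polygon.
friendly = [(i, (1, 1)) for i in range(50) for _ in range(100)]

print(count_builds(pathological, rings), "builds for", len(pathological), "tests")
print(count_builds(friendly, rings), "builds for", len(friendly), "tests")
```

In the first workload every test pays for an index build, so the indexing time is pure overhead; in the second the build cost is spread over 100 tests per polygon.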
Paul
On Tue, Sep 23, 2008 at 10:21 AM, Kevin Neufeld
<kneufeld at refractions.net> wrote:
> I'm not sure. Is the PIP common to those functions? Unfortunately, I don't
> have time to work on this now, but Martin said I could send you the polygons
> I'm using (they're the really large wetlands in the peace region for CWB)
> so you can see if you can reproduce the problem ... it's possible it's just
> a bogus install I have here.
>
> ftp://ftp.refractions.net/pub/refractions/ftp_postgis/test_pip.Fc.dmp
> use pg_restore to extract
> /opt/pgsql83/bin/pg_restore -Fc -O test_pip.Fc.dmp | psql postgis
>
> I put both of my test tables in a tmp schema for this dmp file.
>
> Cheers,
> -- Kevin
>
> Paul Ramsey wrote:
>>
>> Is it limited to P-i-P?
>>
>> P
>>
>> On Tue, Sep 23, 2008 at 8:56 AM, Kevin Neufeld <kneufeld at refractions.net>
>> wrote:
>>>
>>> By the way, this is not limited to ST_Contains.
>>>
>>> ST_Within : 1.7GB - 16sec
>>> ST_Intersects : 1.7GB - 16sec
>>> ST_CoveredBy : 1.7GB - 16sec
>>> ST_Distance = 0 : 26MB - 5sec
>>> ST_Relate : 32MB - 1:30sec
>>>
>>> -- Kevin
>>>
>>> Kevin Neufeld wrote:
>>>>
>>>> I'm still able to reproduce the memory leak. It looks like you're right
>>>> though, it's nothing to do with GEOS, but rather the PIP in PostGIS.
>>>> I'm not sure what is going on yet, (I have yet to find a specific test
>>>> case) but here are my findings.
>>>>
>>>> I made two little sh scripts to track memory usage (psql82.sh and
>>>> psql83.sh), each producing a log file. I happen to run the same GEOS
>>>> version in both instances, but running pgis 1.1.6 on 8.2 and 1.3.3 on
>>>> 8.3.
>>>>
>>>> I have a sample table of ~120000 polygons and, for testing, generated a
>>>> second table of points using pointonsurface. Running the same query on
>>>> both instances (using contains() on a table join of the two tables
>>>> using 150 points):
>>>> - pgis 1.1.6 finishes in 1:30secs using 30MB of memory.
>>>> - pgis 1.3.3 finishes in 12sec using 1.7GB of memory.
>>>>
>>>> If I query an extra 50 points, the 1.3.3 instance runs out of memory
>>>> with the attached crash log.
>>>>
>>>> Next, I'll try pgis SVN head on the 8.3 instance instead of 1.3.3 (but
>>>> not tonite :) ).
>>>>
>>>> Thoughts?
>>>> -- Kevin
>>>>
>>>>
>>>>
>>>> Paul Ramsey wrote:
>>>>>
>>>>> OK, I ran Mark's P-i-P short circuit through valgrind and it doesn't
>>>>> leak anything at all. It also is sort of hard to get to run, since
>>>>> it's only strict POLYGON and POINT, MULTIPOLYGON need not apply.
>>>>> P-i-mP didn't leak either, so I'm not sure what to make of Kevin's
>>>>> result at this point.
>>>>>
>>>>> Only difference right now is I'm running postgis-trunk and geos-trunk
>>>>> for my testing.
>>>>>
>>>>> Paul
>>>>>
>>>>> On Mon, Sep 22, 2008 at 12:53 PM, Paul Ramsey
>>>>> <pramsey at cleverelephant.ca>
>>>>> wrote:
>>>>>
>>>>>> Yes, it suggests that the problem might not be in GEOS after all; it
>>>>>> might be in how the memory-context handling is happening.
>>>>>>
>>>>>> In PostGIS 1.3.3, the Contains and Intersects tests have a P-i-P
>>>>>> short-circuit that uses a cached 1-d r-tree to do a version of
>>>>>> prepared geometry testing (Mark Leslie implemented this). Ben then
>>>>>> copied that code for his memory caching work that used the GEOS
>>>>>> prepared geometry.
>>>>>>
>>>>>> Try going into the lwgeom_geos_c.c file and commenting out the
>>>>>> mem-caching code path for one of your tests and see if the memory
>>>>>> problem goes away,
>>>>>>
>>>>>> *or*
>>>>>>
>>>>>> Re-run your test, replacing the points with short two-point lines.
>>>>>>
>>>>>> Since Mark's code was in turn copied from the proj4 projection object
>>>>>> caching routine (lwgeom_transform.c) that code might *also* leak,
>>>>>> although since the number of cache transitions is low, it might not be
>>>>>> so noticeable.
>>>>>>
>>>>>> Paul
>>>>>>