[postgis-devel] GeomUnion memory

strk at refractions.net strk at refractions.net
Sun Jun 26 08:45:37 PDT 2005


Bill, others, I've run some tests on memory usage,
implementing the "chunked" buffer() for the GeomUnion aggregate.

This means that ALL the input is still collected first in
the form of a geometry[] (geometry array).

Chunking the gathering of input would require a change in
the scripts (definitions), which would require a dump/reload,
so I'd like to avoid it for the moment.

A test with an input of 4000 simple polygons shows the size of
the collected array to be about 32Mb.

Now, if we send the whole set to GEOS at once, memory use goes
up to 413Mb. If we split those 4000 polys into chunks of 100
and feed the chunks to GEOS, memory use is about 56Mb.
Note that the 32Mb of input are kept until the end
of operations, so the GEOS part takes 24Mb in the chunked
case and 381Mb in the full case.
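
To make the chunking idea concrete, here is a minimal sketch in Python.
It is not the PostGIS/GEOS code: chunked_union, union_sets and the
max_geoms parameter are hypothetical names, and frozensets stand in for
geometries so the example runs without any GIS library. It assumes the
result of each chunk is carried into the next chunk as one more input.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

G = TypeVar("G")


def chunked_union(
    geoms: Sequence[G],
    union_all: Callable[[List[G]], G],
    max_geoms: int = 100,
) -> Optional[G]:
    """Union geoms in chunks of at most max_geoms inputs.

    The running result is prepended to the next chunk, so the union
    routine never sees more than max_geoms + 1 inputs at once.
    Returns None for an empty input.
    """
    if max_geoms < 1:
        raise ValueError("max_geoms must be >= 1")
    acc: Optional[G] = None
    for start in range(0, len(geoms), max_geoms):
        chunk = list(geoms[start:start + max_geoms])
        if acc is not None:
            chunk.insert(0, acc)  # carry the partial result forward
        acc = union_all(chunk)
    return acc


def union_sets(parts: List[frozenset]) -> frozenset:
    """Stand-in for the real union call: plain set union."""
    return frozenset().union(*parts)


if __name__ == "__main__":
    polys = [frozenset({i, i + 1}) for i in range(10)]
    chunked = chunked_union(polys, union_sets, max_geoms=3)
    full = chunked_union(polys, union_sets, max_geoms=10000)
    print(chunked == full)
```

With max_geoms larger than the input (the "never hit" case in the data
below) the loop runs once and this degenerates to the full, unchunked
union.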

Timings are comparable.

Note that ORDERING in the chunked case would probably have
an influence on both speed and memory usage.

Also note that CHUNK size is currently defined in terms of
number of geometries (I've been thinking of using memory
size there).
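
A possible shape for a memory-based chunk limit, sketched in Python.
This is only an illustration: chunks_by_size and max_bytes are
hypothetical names, and it assumes the serialized size of each geometry
is known up front.

```python
from typing import Iterator, List, Sequence


def chunks_by_size(sizes: Sequence[int], max_bytes: int) -> Iterator[List[int]]:
    """Yield lists of input indices whose summed size fits in max_bytes.

    A single geometry larger than max_bytes still gets a chunk of its
    own, so no input is ever dropped.
    """
    current: List[int] = []
    total = 0
    for index, size in enumerate(sizes):
        # Close the current chunk if this geometry would overflow it.
        if current and total + size > max_bytes:
            yield current
            current, total = [], 0
        current.append(index)
        total += size
    if current:
        yield current


if __name__ == "__main__":
    for chunk in chunks_by_size([40, 40, 40, 200, 10], max_bytes=100):
        print(chunk)
```

Compared with a fixed geometry count, this keeps chunks of many small
polygons large and chunks of few big polygons small.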

Data follows

------------------------------------------------------------------------
1000 Polygons test:
        out_size = "183397"
        sum(in_size) = "1546256"
------------------------------------------------------------------------

        MAXGEOMS 100
                real    0m27.796s
                user    0m21.615s
                sys     0m4.930s
                MEMMAX: 24Mb

        MAXGEOMS 10000 (never hit)
                real    0m25.819s
                user    0m19.727s
                sys     0m5.059s
                MEMMAX: 98Mb

------------------------------------------------------------------------
2000 Polygons test:
        out_size = "197343"
        sum(in_size) = "2924256"
------------------------------------------------------------------------

        MAXGEOMS 100
                real    1m14.388s
                user    0m53.170s
                sys     0m18.376s
                RES: 32Mb

        MAXGEOMS 10000 (never hit)
                real    1m13.873s
                user    0m52.125s
                sys     0m18.823s
                RES: 186Mb

------------------------------------------------------------------------
4000 Polygons test:
        out_size = "244300"
        sum(in_size) = "6352537"
------------------------------------------------------------------------

        MAXGEOMS 100
                real    4m32.049s
                user    2m37.242s
                sys     1m22.942s
                RES: 56Mb (32Mb is the size once the array is collected)

        MAXGEOMS 10000 (never hit)
                real    4m19.959s
                user    2m50.001s
                sys     1m15.982s
                RES: 413Mb
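
The figures above can be reduced to ratios with a few lines of Python.
This is just a sketch over the numbers reported in this message; the
runs table and the summarize helper are illustrative names, and the
wall-clock ("real") times have been converted to seconds by hand.

```python
# (real seconds, peak memory in Mb) for each run, copied from the data above.
runs = {
    1000: {"chunked": (27.796, 24), "full": (25.819, 98)},
    2000: {"chunked": (74.388, 32), "full": (73.873, 186)},
    4000: {"chunked": (272.049, 56), "full": (259.959, 413)},
}


def summarize(table):
    """Return {polygons: (memory ratio full/chunked, time ratio chunked/full)}."""
    summary = {}
    for polygons, run in table.items():
        time_chunked, mem_chunked = run["chunked"]
        time_full, mem_full = run["full"]
        summary[polygons] = (mem_full / mem_chunked, time_chunked / time_full)
    return summary


if __name__ == "__main__":
    for polygons, (mem_ratio, time_ratio) in summarize(runs).items():
        print(f"{polygons} polygons: memory x{mem_ratio:.2f}, time x{time_ratio:.3f}")
```

The memory ratio grows with the input size while the time ratio stays
close to 1, which is what "timings are comparable" refers to.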



