[postgis-devel] What's the differencebetweengeom_accum andbuiltin postgresql array_append

Obe, Regina robe.dnd at cityofboston.gov
Wed Sep 3 04:49:08 PDT 2008


Slightly embarassing flaw in my testing.  I just realized my windows test was on a test database where I had already altered the behavior of st_geom_accum to use array_append.  So it was testing something I ignorantly falsely took for truth and should have tested before.
 
The unadulterated comparison is - the below -  so array_append still appears to perform better but by much less.
--42,781 ms 
SELECT ST_Numgeometries(ST_Collect(the_geom))
FROM usstatebounds;

-- 36,359 ms
SELECT ST_Numgeometries(ST_Collect2(the_geom))
FROM usstatebounds;
 
Getting back to the embarrasment. I guess this means my assumption that simple SQL  functions are inlined in aggregate functions just as it is in any other use is wrong so can't implement st_geom_accum with an sql wrapper without penalty.  I'll try to find some other sets of data to test with and cross compare between 8.2 and 8.3 installs.
 

Thanks,
Regina

________________________________

From: postgis-devel-bounces at postgis.refractions.net on behalf of Obe, Regina
Sent: Wed 9/3/2008 12:12 AM
To: PostGIS Development Discussion
Subject: RE: [postgis-devel] What's the differencebetweengeom_accum andbuiltin postgresql array_append



Mark,

This is interesting.  I've been doing all my tests on Union and since the bottleneck has always been ST_Union I wasn't seeing much of a difference in timings, but to just test the raw speed of array_append vs. st_geom_accum,  I decided doing a collect compare would be better since collect pretty much just appends and does a simple collect. 

st_geom_accum actually performs worse most of the time for largish geometries.  Note I also put in the ST_NumGeometries test to limit the latency of network transfer.

My ST_Collect2 looks like this
CREATE AGGREGATE st_collect2(geometry) (
  SFUNC=array_append,
  STYPE=geometry[],
  FINALFUNC=st_collect_garray
);

On my standard usstatebounds sample set -
POSTGIS="1.3.3" GEOS="3.0.0-CAPI-1.4.1" PROJ="Rel. 4.6.0, 21 Dec 2007" (Windows 2003)
"PostgreSQL 8.2.5 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special)"

--55,781 ms (result: 2895 )
SELECT ST_Numgeometries(ST_Collect(the_geom))
FROM usstatebounds

-- 36,359 ms (result: 2895)
SELECT ST_Numgeometries(ST_Collect2(the_geom))
FROM usstatebounds

-- Max number of points of any single geometry is 70,238 points, avg - 140.5
SELECT Max(ST_NPoints(the_geom)), Avg(ST_NPoints(the_geom))
FROM
usstatebounds

(unfortunately I don't have a similar data set on my Linux box) - testing with a smaller number of records with largish geoms which I can do a comparable test). 
Tests on linux  ("PostgreSQL 8.3.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)")
This test on linux with a 8.3 gives same timings POSTGIS="1.3.3" GEOS="3.0.0-CAPI-1.4.1" PROJ="Rel. 4.5.0, 22 Oct 2006" USE_STATS"
--812 ms (result - 56)
SELECT ST_Numgeometries(ST_Collect(the_geom))
FROM  fe_2007_us_state

-- 812 ms (result - 56)
SELECT ST_Numgeometries(ST_Collect2(the_geom))
FROM fe_2007_us_state

-- max 45,056, avg 12,130
SELECT Max(ST_NPoints(the_geom)), Avg(ST_NPoints(the_geom))
FROM
fe_2007_us_state


--437 ms (result - 209)
SELECT ST_Numgeometries(ST_Collect(the_geom))
FROM country_boundaries

-- 312 ms (result - 209)
SELECT ST_Numgeometries(ST_Collect2(the_geom))
FROM country_boundaries

-- max 12,125, avg 455
SELECT Max(ST_NPoints(the_geom)), Avg(ST_NPoints(the_geom))
FROM
country_boundaries

Thanks,
Regina
-----Original Message-----
From: postgis-devel-bounces at postgis.refractions.net on behalf of Mark Cave-Ayland
Sent: Tue 9/2/2008 6:23 PM
To: PostGIS Development Discussion
Subject: Re: [postgis-devel] What's the difference betweengeom_accum    andbuiltin postgresql array_append

Obe, Regina wrote:
>
> Getting back to this topic.  Recall I said there was a situation where
> LWGEOM_accum worked where array_append didn't.  I can't replicate it so
> I suspect it was an earlier error in my indexing efforts that caused
> this issue.  I am also not seeing this LWGEOM_accum function used
> directly in any of the postgis code base.
>
> So here is my half-assed suggestion since no one seems to know why we
> should maintain the LWGEOM_accum function instead of just using built-in
> PostgreSQL array_append function.
>
> Simply get rid of LWGEOM_accum from our code base (or deprecate it) --
> change
>
> /**CREATE OR REPLACE FUNCTION st_geom_accum(geometry[], geometry)
>   RETURNS geometry[] AS
> '$libdir/liblwgeom.dll', 'LWGEOM_accum'
>   LANGUAGE 'c' IMMUTABLE;
> ***/
>
> To:
> CREATE OR REPLACE FUNCTION st_geom_accum(geometry[], geometry)
>   RETURNS geometry[] AS
> $$ SELECT array_append($1,$2);  $$
>   LANGUAGE 'sql' IMMUTABLE;
>
>
> And that's it.  In tests I've done doing above, I don't see a definitive
> speed difference and all aggs we have seem to have this function for
> collecting geometries.
>
> I would have suggested replacing our use in aggregates directly with
> array_append, but thought it would be better to keep this buffer for 3
> reasons
> 1) We don't have a CREATE OR REPLACE AGGREGATE which makes changing aggs
> without a drop reload tricky. Changing this function will auto-magically
> correct the aggs. Granted 1.4 people should be doing a full reload anyway.
>
> 2) Don't know if anyone is relying on the st_geom_accum function so we
> probably should keep it around anyway.
>
> 3) In case we are wrong.  It won't be a major undertaking to correct the
> mistake.
>
> Any thoughts, concerns?
>
> Thanks,
> Regina


Hi Regina,

At a quick glance, I'd say that the LWGEOM_accum function forcibly
de-TOASTs the input geometries when building the array so it may be that
you will only see a time difference working very large geometries
(rather than large numbers of smaller geometries). Does this make any
difference in your tests?


ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk <http://www.siriusit.co.uk/> 
T: +44 870 608 0063
_______________________________________________
postgis-devel mailing list
postgis-devel at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-devel






________________________________

The substance of this message, including any attachments, may be confidential, legally privileged and/or exempt from disclosure pursuant to Massachusetts law. It is intended solely for the addressee. If you received this in error, please contact the sender and delete the material from any computer. 

________________________________

Help make the earth a greener place. If at all possible resist printing this email and join us in saving paper. 



-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20080903/e1c72e14/attachment.html>


More information about the postgis-devel mailing list