[geos-devel] GEOSGetCoordinates() and performance (postgis bug fix)

Mon Oct 27 10:40:19 EST 2003

I'm working on the small-but-fast interface.
Can Chris check out serialization of either geos geometry or
some other intermediate class used for Overlay Operations 
(like PlanarGraph or something) ? Martin, any help on this ?

--strk;

pramsey wrote:
> I was talking about this with Chris a week ago, and we came to similar 
> conclusions, with one difference: we think we need to have both methods 
> available.  A CPU intensive but memory light version for large-but-slow 
> queries and an optimized but memory heavy version for small-but-fast 
> interactive queries.  Neither version can meet all the likely use cases 
> for a union aggregate on their own.
> P
> 
> On Monday, October 27, 2003, at 03:36 AM, strk wrote:
> 
> > Defining a new postgres type for geos-geom might give a better 
> > performance
> > but sounds (IMHO) like a slow development process. I've been thinking
> > about this thought. Here is what I propose:
> >
> > We now have, for each row in input relation:
> >
> > CASE 1:
> > 	ITERATE:
> > 	1) pgis->geos conversion (first arg)
> > 	2) pgis->geos conversion (second arg)
> > 	3) geosUnion  call
> > 	4) geos->pgis conversion (result)
> >
> > Note that step 1 will be for each iteration after the first EXACTLY the
> > output of step 3 from previous iteration. We could use a final func
> > for the whole job to be done and a state transaction function to just
> > gather informations about which geometries to work on.
> >
> > CASE 2:
> > 	ITERATE:
> > 	1) get geometry array (first arg)
> > 	2) get geometry (second arg)
> > 	3) append geometry to array
> > 	END:
> > 	1) UNION = array[0]->geos (conversion)
> > 	2) for each array item N > 0
> > 	3)	nextgeom = array[N]->geos (conversion)
> > 	4) 	geosUnion call (UNION, nextgeom)
> > 	5) end
> > 	6) geos->pgis conversion (UNION)
> >
> > With N input geometries we will have the following conversions:
> > 	CASE 1: 2*N pgis->geos + N geos->pgis (might be missing something)
> > 	CASE 2:   N pgis->geos + 1 geos->pgis
> >
> > Anyway, CASE 2 seems nicer with system resources.. but it might be
> > very memory expensive! If output geometry will be just a collection
> > of input geometries that big memory will be allocated anyway, thus
> > as long as we release each array element as soon as added to the
> > output geometry this will not be a big deal. On the other hand, if
> > resulting union will remove a lot of borders allocated memory will
> > be much more then the size required by step-by-step processing.
> >
> > What do you think about this solution ?
> >
> > --strk;
> >
> > chodgson wrote:
> >> The way postgres handles user-defined aggregate functions is what is 
> >> limiting
> >> you. The problem is that aggregates are designed to use a 
> >> "one-at-a-time"
> >> approach to aggregating, which means that at each step of the process 
> >> you must
> >> have "geom UNION geom = geom".
> >>
> >> Clearly, if there was a postgres type which was a serialized GEOS 
> >> gemetry (call
> >> it type geos_geom) then we could write a geos_geom_union function 
> >> which was
> >> geos_geom UNION geos_geom = geos_geom, and as long as the 
> >> serialization can be
> >> done quickly (and is even possible?) then a faster union aggregate 
> >> could result
> >> from a query like:
> >>
> >> SELECT geos_geom_to_geom( geos_geom_union( geom_to_geos_geom( 
> >> the-geom ) ) )
> >> FROM the_table GROUP BY class_id;
> >>
> >> I don't know enough C/C++ to know if such a serialization of a GEOS 
> >> geometry is
> >> possible, or fast... but if so, this would probably do it.
> >>
> >> It could even be possible to re-write all of the geos functions to 
> >> accept
> >> geos_geom objects, and have standard geom object be auto-typecast into
> >> geos_geom objects (using the "geom_to_geos_geom()" function, of 
> >> course)... then
> >> it would be easy to cache the geos objects in between processing 
> >> stages - who
> >> knows how big they might be... again, if this is even possible.
> >>
> >> Chris
> >>
> >> Quoting strk <strk at keybit.net>:
> >>
> >>> dblasby wrote:
> >>>> Whenever I call the getCoordinates() function, I immediately make a 
> >>>> copy
> >>>> of the list by converting it from the GEOS coordinates to PostGIS 
> >>>> POINT3Ds.
> >>>>
> >>>> So, feel free to have all the Postgis GetCoordinates() access 
> >>>> read-only
> >>>> version.
> >>>
> >>> I've made a specialized POINT3D-array-from-Polygon function using 
> >>> read-only
> >>> direct access to internal Polygon's LinearRings. Speed up is to 
> >>> evaluate
> >>> but personally I can not really appreciate it...
> >>>
> >>>> UNIT probably spending a lot of time building Postgis and GEOS 
> >>>> geometries.
> >>>>
> >>>> 1. start with 2 input postgis geometrys
> >>>> 2. convert them to GEOS (O(n) where n=# of subcomponents)
> >>>> 3. run union
> >>>> 4. convert back to postgis (O(n) where n=# of subcomponents)
> >>>> 5. start again at step #1
> >>>>
> >>>> As you can see there possibly O(n^2) constructions (and certainly 
> >>>> O(n^2)
> >>>> point copies).
> >>>>
> >>>> If you take an input dataset of disjoint polygons, you'll see this 
> >>>> is
> >>>> O(n^2) in terms of geometry convertions and copying.
> >>>
> >>> Do you think defining an ad-hoc postgresql type will give it a burst 
> >>> ?
> >>>
> >>> --strk;
> >>>
> >>> _______________________________________________
> >>> geos-devel mailing list
> >>> geos-devel at geos.refractions.net
> >>> http://geos.refractions.net/mailman/listinfo/geos-devel
> >>>
> >>
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> geos-devel mailing list
> >> geos-devel at geos.refractions.net
> >> http://geos.refractions.net/mailman/listinfo/geos-devel
> >>
>       Paul Ramsey
>       Refractions Research
>       Email: pramsey at refractions.net
>       Phone: (250) 885-0632
> 
> 
> _______________________________________________
> geos-devel mailing list
> geos-devel at geos.refractions.net
> http://geos.refractions.net/mailman/listinfo/geos-devel