[postgis-devel] LWGEOM -- initial version ready for testing

strk strk at keybit.net
Tue May 4 16:08:02 PDT 2004


On Tue, May 04, 2004 at 10:34:39AM -0700, David Blasby wrote:
> Mark Cave-Ayland wrote:
> 
> > Firstly a big well done for the work you've put into the LWGEOM - I had
> > hoped to have taken a more active role in developing the code but
> > unfortunately my time has been needed on other things :)
> 
> Taking a look at it will help me a lot.
> 
> > Anyway, haven't seen many responses from the other guys on the list so
> 
> Ya - I wasn't feeling The Love...

Sorry Dave, I haven't had much time (GEOS debugging...).
You did a really great job !

The first attempt at building lwgeom was successful, apart from
missing prototypes in the parsers, but if I try to build it
now I get a lot of errors:

 wktparse.lex:22: error: `lwg_parse_yylval' undeclared (first use in this function)
 wktparse.lex:22: error: (Each undeclared identifier is reported only once
 wktparse.lex:22: error: for each function it appears in.)
 wktparse.lex:22: error: `VALUE' undeclared (first use in this function)
 wktparse.lex:25: error: `WKB' undeclared (first use in this function)
 wktparse.lex:28: error: `POINT' undeclared (first use in this function)
 wktparse.lex:29: error: `LINESTRING' undeclared (first use in this function)
 wktparse.lex:30: error: `POLYGON' undeclared (first use in this function)
 wktparse.lex:31: error: `MULTIPOINT' undeclared (first use in this function)
 wktparse.lex:32: error: `MULTILINESTRING' undeclared (first use in this function)
 wktparse.lex:33: error: `MULTIPOLYGON' undeclared (first use in this function)
 wktparse.lex:34: error: `GEOMETRYCOLLECTION' undeclared (first use in this function)
 wktparse.lex:35: error: `SRID' undeclared (first use in this function)
 wktparse.lex:36: error: `EMPTY' undeclared (first use in this function)
 wktparse.lex:38: error: `LPAREN' undeclared (first use in this function)
 wktparse.lex:39: error: `RPAREN' undeclared (first use in this function)
 wktparse.lex:40: error: `COMMA' undeclared (first use in this function)
 wktparse.lex:41: error: `EQUALS' undeclared (first use in this function)
 wktparse.lex:42: error: `SEMICOLON' undeclared (first use in this function)

Does it have to do with bison/yacc differences ?

--strk;

> 
> > 2. At this stage I think it would be useful to include some sort of
> >    debugging function to dump the LWGEOM format into a human-readable
> >    form, for example:
> > 
> > 	> SELECT raw_lwgeom(geom) FROM geomtable;
> > 	
> > 	| size | has_srid | has_bbox | dimensionality | wkb_type | wkt_geom | wkb_geom |
> > 	+------+----------+----------+----------------+----------+----------+----------+
> > 
> > 
> >    I can see this being useful for helping users in situations where
> > they are not sure 
> >    exactly how they have configured the particular geometry in the
> > table. What do you
> >    think?
> 
> I'll see what I can do.  It's a bit tricky to have a single command 
> return multiple columns of data, but I can easily make 
> lwgeom_rawinfo_size()-type functions and a summary function that would 
> give a text version of all the above info.
> 
> > 3. Should we prevent users from adding bounding boxes to point columns?
> > (i.e. is the 
> >    single/double precision conversion fast enough to make this a waste
> > of disk space?)
> 
> I think we should make a decision on whether we're going to always have 
> bounding boxes automatically added or not.
> 
> For most people, not having bounding boxes is the "best" option - the 
> geometries are small, and simple queries aren't noticeably slower.
> 
> For queries like:
> SELECT * FROM <table> WHERE lwgeom && '<geom>';
> 
> You will not miss the bounding boxes inside the geometries because it 
> will be looking at the pre-generated bounding boxes in the index. 
> Unfortunately, because of the way GiST does its searching this is 
> actually faster:
> 
> SELECT * FROM <table> WHERE lwgeom && AddBBox('<geom>');
> 
> Because GiST will ask for the bounding box of the search geometry many 
> many times during the index scan (once for every level in the tree, then 
> once for each tuple in the index leaf [about 140]).  You'll probably not 
> notice a speed difference as it is usually just a few milliseconds.  I 
> tried to get GiST to pre-cache the bounding box of the search geometry, 
> but I haven't been able to do it - it's a bit silly.
> 
> 
> When you start cross-joining tables - a query that does a lot of 
> sub-index scans - adding the bounding box significantly improves 
> performance.  Crossing a 10,000 row table with itself takes about 2 
> seconds when there are bounding boxes but about 20 seconds when there are not.
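
To put rough numbers on the bounding-box lookups described above, here is a
back-of-the-envelope sketch in Python. Only the figure of about 140 tuples per
index leaf and the 10,000-row table come from the message; the tree depth of 3
and the idea that each scan visits a single leaf page are illustrative
assumptions, and the function names are made up for this example.

```python
def bbox_requests_per_scan(tree_levels, leaf_tuples=140):
    """Estimate how often GiST asks for the search geometry's bounding box
    in one index scan: once per tree level, then once per leaf tuple.
    Assumes (hypothetically) that the scan ends up in a single leaf page."""
    return tree_levels + leaf_tuples


def bbox_requests_for_self_join(rows, tree_levels, leaf_tuples=140):
    """A self cross-join runs one index scan per outer row."""
    return rows * bbox_requests_per_scan(tree_levels, leaf_tuples)


if __name__ == "__main__":
    levels = 3  # assumed tree depth, not a measured value
    print("per scan:", bbox_requests_per_scan(levels))
    print("10,000-row self join:", bbox_requests_for_self_join(10_000, levels))
```

Under these assumptions a single query repeats the box computation on the
order of a hundred times, while the self-join repeats it over a million times,
which fits the observation that the cached box only matters for joins.
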
> 
> > 4. I think it would be useful to include BBOXes in LWGEOM by default. My
> >    thinking here is that users knowledgeable enough to be concerned about
> >    the space saving will more than likely be knowledgeable enough to
> >    remove them, whereas new users may get confused when certain queries
> >    that used to perform well in PostGIS perform poorly using LWGEOM.
> 
> This is a good point.
> 
> > 5. As far as I can see, assuming a non-index scan, the LWGEOM operators
> >    call the box2d_* functions directly, which are defined using float4s.
> >    It looks like this is contrary to the OGC spec, since all coordinates
> >    (and therefore, I would guess, operators) are defined as doubles? :(.
> > 
> >    I guess that we would need to maintain a box2d type which uses doubles
> >    as well as floats and use this for all the LWGEOM operators/functions
> >    (the box2d float4 would still be used for the indexes). Here it would
> >    be compulsory to add RECHECK to the operator classes, since when
> >    expanding the box2d(double) to box2d(float) extra geometries may be
> >    returned by an overlap calculation. The RECHECK would ensure that
> >    these would be stripped out before the result set was returned.
> 
> None of the LWGEOM operators are defined by the OGC - they're there 
> because the GiST index needs them. When you do a "<geom1> && <geom2>", 
> you should actually be calling the GEOS "intersects(geom1,geom2)".
> 
> I must admit that the only operator I've actually ever used is the "&&". 
>   The way BOX2Ds are formed, you'll always get an 'appropriate' answer.
> 
> It's a bit more complex for the other operators, but you'll usually get 
> the correct answer.
> 
> If you want to do things in double-precision, you can create 
> double-precision bounding boxes (BOX3D) from lwgeoms with "box3d(lwgeom)".
> 
> I understand your point, but I think it's a lot of work (mostly 
> computation) to do things in double when the single-precision results 
> are "good enough".
> 
> If people feel strongly on this, it isn't difficult to make the change - 
> but it will have to compute the double-precision bounding box every time 
> since there's no way to pre-compute it.
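
The float4/double issue discussed in point 5 can be illustrated outside the
database. What follows is a minimal Python sketch, not the PostGIS code: the
helper names (floor_float32, ceil_float32, expand_box, overlaps) are made up
for illustration, and it assumes finite coordinates within float4 range. It
rounds a double-precision box outward to single precision so that the float4
box always contains the original, then shows a pair of boxes that are disjoint
in double precision but overlap once converted, which is the kind of false
positive a RECHECK would have to strip out.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def next_float32(x, up):
    """Return the float32 neighbour of the float32 value x,
    towards +infinity if up is true, otherwise towards -infinity."""
    if x == 0.0:
        bits = 0x00000001 if up else 0x80000001  # smallest subnormal
    else:
        bits = struct.unpack("<I", struct.pack("<f", x))[0]
        # Incrementing the bit pattern grows the magnitude,
        # decrementing it shrinks the magnitude.
        bits += 1 if (x > 0) == up else -1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def floor_float32(x):
    """Largest float32 value that is <= x."""
    f = to_float32(x)
    return f if f <= x else next_float32(f, up=False)


def ceil_float32(x):
    """Smallest float32 value that is >= x."""
    f = to_float32(x)
    return f if f >= x else next_float32(f, up=True)


def expand_box(xmin, ymin, xmax, ymax):
    """Convert a double box to a float32 box that still contains it."""
    return (floor_float32(xmin), floor_float32(ymin),
            ceil_float32(xmax), ceil_float32(ymax))


def overlaps(a, b):
    """Bounding-box overlap test on (xmin, ymin, xmax, ymax) tuples."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


if __name__ == "__main__":
    a = (0.0, 0.0, 1.0, 1.0)
    b = (1.0 + 1e-9, 0.0, 2.0, 1.0)  # just to the right of a
    print("double overlap:", overlaps(a, b))
    print("float4 overlap:", overlaps(expand_box(*a), expand_box(*b)))
```

Because the conversion only ever grows the box, a single-precision test can
report extra matches but should not miss a true one under these assumptions.
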
> 
> > 6. It looks like you can add bboxes and srids to a geometry but not
> > remove them? 
> 
> Ya - I need to add this ability.  I'll add it once we decide if we're 
> going to have bounding boxes by default or not.
> 
> >Also if I 
> >    follow the instructions in the README to add bounding boxes to a
> > column:
> > 
> >    DROP INDEX <lwgeom index name>;
> >    UPDATE <table> SET <lwgeom column> = AddBBOX(<lwgeom column>);
> >    CREATE INDEX <lwgeom index name> ON <table> USING GIST 
> >      (<lwgeom column> GIST_LWGEOM_OPS);
> >    VACUUM ANALYSE <table>;
> > 
> >    That would seem to work great. My question is now if I add more
> > geometries to the    
> >    table, would these new geometries not have a bounding box attached?
> > Would the only way   
> >    to add bounding boxes to these geometries be by running the update
> > query and reindexing 
> >    again?
> 
> Yes - this is a problem.  You'll have a mix of geometries with and 
> without bounding boxes.
> 
> > Anyway glad to see that this stuff is becoming a reality, and I hope
> > that this will get some discussion started on the list about your work
> > so far :) If I get a chance I will try and setup a test database with
> > some sample data and see how it performs....
> 
> You should find that it performs a wee bit slower than postgis, but it 
> takes *significantly* less space.
> 
> To find out how "big" things are:
> 
> 1) vacuum analyse; --- very important!
> 2) SELECT relname, relpages FROM pg_class ORDER BY relpages DESC; 	
> 		-- gives size in disk pages (probably 8k)
> You'll notice that there will often be a toast table associated with a 
> geometry table if you have "large" geometries.
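
As a small aid for reading the relpages figures, this Python sketch converts a
page count to bytes. It assumes the default 8192-byte PostgreSQL page size (the
"probably 8k" above); the function name is hypothetical and the sample page
count is invented, not taken from any real table.

```python
PAGE_SIZE_BYTES = 8192  # assumed default PostgreSQL block size


def relation_size_bytes(relpages, page_size=PAGE_SIZE_BYTES):
    """Approximate on-disk size of a relation from pg_class.relpages."""
    return relpages * page_size


if __name__ == "__main__":
    pages = 1000  # invented example value
    size = relation_size_bytes(pages)
    print(f"{pages} pages is about {size / (1024 * 1024):.2f} MiB")
```

To compare total space, the toast table's pages would need to be added to
those of the geometry table itself.
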
> 
> 
> Thanks for the comments, looks like there's a few things to change.
> 
> dave
> 
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel


