[postgis-devel] LWGEOM -- initial version ready for testing
strk
strk at keybit.net
Tue May 4 16:08:02 PDT 2004
On Tue, May 04, 2004 at 10:34:39AM -0700, David Blasby wrote:
> Mark Cave-Ayland wrote:
>
> > Firstly a big well done for the work you've put into the LWGEOM - I had
> > hoped to have taken a more active role in developing the code but
> > unfortunately my time has been needed on other things :)
>
> Taking a look at it will help me a lot.
>
> > Anyway, haven't seen many responses from the other guys on the list so
>
> Ya - I wasn't feeling The Love...
Sorry Dave, I haven't had much time (GEOS debugging...).
You did a really great job!
The first attempt at building lwgeom was successful, apart from
missing prototypes in the parsers, but if I try to build it
now I get a lot of errors:
wktparse.lex:22: error: `lwg_parse_yylval' undeclared (first use in this function)
wktparse.lex:22: error: (Each undeclared identifier is reported only once
wktparse.lex:22: error: for each function it appears in.)
wktparse.lex:22: error: `VALUE' undeclared (first use in this function)
wktparse.lex:25: error: `WKB' undeclared (first use in this function)
wktparse.lex:28: error: `POINT' undeclared (first use in this function)
wktparse.lex:29: error: `LINESTRING' undeclared (first use in this function)
wktparse.lex:30: error: `POLYGON' undeclared (first use in this function)
wktparse.lex:31: error: `MULTIPOINT' undeclared (first use in this function)
wktparse.lex:32: error: `MULTILINESTRING' undeclared (first use in this function)
wktparse.lex:33: error: `MULTIPOLYGON' undeclared (first use in this function)
wktparse.lex:34: error: `GEOMETRYCOLLECTION' undeclared (first use in this function)
wktparse.lex:35: error: `SRID' undeclared (first use in this function)
wktparse.lex:36: error: `EMPTY' undeclared (first use in this function)
wktparse.lex:38: error: `LPAREN' undeclared (first use in this function)
wktparse.lex:39: error: `RPAREN' undeclared (first use in this function)
wktparse.lex:40: error: `COMMA' undeclared (first use in this function)
wktparse.lex:41: error: `EQUALS' undeclared (first use in this function)
wktparse.lex:42: error: `SEMICOLON' undeclared (first use in this function)
Does it have to do with bison/yacc differences?
--strk;
>
> > 2. At this stage I think it would be useful to include some sort of
> > debugging function to dump the LWGEOM format into a human-readable
> > form, for example:
> >
> > > SELECT raw_lwgeom(geom) FROM geomtable;
> >
> > | size | has_srid | has_bbox | dimensionality | wkb_type | wkt_geom | wkb_geom |
> > +------+----------+----------+----------------+----------+----------+----------+
> >
> >
> > I can see this being useful for helping users in situations where
> > they are not sure exactly how they have configured the particular
> > geometry in the table. What do you think?
>
> I'll see what I can do. It's a bit tricky to have a single command
> return multiple columns of data, but I can easily make
> lwgeom_rawinfo_size()-type functions and a summary function that would
> give a text version of all the above info.
>
> > 3. Should we prevent users from adding bounding boxes to point columns?
> > (i.e. is the single/double precision conversion fast enough to make
> > this a waste of disk space?)
>
> I think we should make a decision on whether we're always going to have
> bounding boxes added automatically or not.
>
> For most people, not having bounding boxes is the "best" option - the
> geometries are small, and simple queries aren't noticeably slower.
>
> For queries like:
> SELECT * FROM <table> WHERE lwgeom && '<geom>';
>
> You will not miss the bounding boxes inside the geometries because it
> will be looking at the pre-generated bounding boxes in the index.
> Unfortunately, because of the way GiST does its searching this is
> actually faster:
>
> SELECT * FROM <table> WHERE lwgeom && AddBBox('<geom>');
>
> Because GiST will ask for the bounding box of the search geometry many
> many times during the index scan (once for every level in the tree, then
> once for each tuple in the index leaf [about 140]). You'll probably not
> notice a speed difference as it is usually just a few milliseconds. I
> tried to get GiST to pre-cache the bounding box of the search geometry,
> but I haven't been able to do it - it's a bit silly.
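
The effect described above can be pictured with a small Python sketch. It is only a toy model, not the GiST or LWGEOM code: the `Geometry` class, `add_bbox()` (a hypothetical stand-in for AddBBox) and the flat list of 140 index entries are all assumptions made for illustration. It counts how often the search geometry's bounding box has to be derived from its points during one scan.

```python
class Geometry:
    """Toy geometry: a list of (x, y) points plus an optional stored bbox."""

    def __init__(self, points, stored_bbox=None):
        self.points = points
        self.stored_bbox = stored_bbox
        self.bbox_computations = 0  # times the bbox was derived from the points

    def bbox(self):
        # A stored bbox is returned directly; otherwise it is recomputed.
        if self.stored_bbox is not None:
            return self.stored_bbox
        self.bbox_computations += 1
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return (min(xs), min(ys), max(xs), max(ys))


def add_bbox(geom):
    """Hypothetical stand-in for AddBBox(): same points, bbox stored up front."""
    return Geometry(geom.points, stored_bbox=Geometry(geom.points).bbox())


def overlaps(a, b):
    """True if boxes (xmin, ymin, xmax, ymax) a and b share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def scan(index_boxes, query):
    """Compare the query with every index entry, asking for its bbox each time."""
    return [i for i, box in enumerate(index_boxes) if overlaps(box, query.bbox())]


# 140 unit boxes in a row, standing in for the tuples of one index leaf.
index_boxes = [(i, 0.0, i + 1.0, 1.0) for i in range(140)]
plain = Geometry([(10.5, 0.5), (12.5, 0.5)])
boxed = add_bbox(plain)

hits_plain = scan(index_boxes, plain)
hits_boxed = scan(index_boxes, boxed)
print(plain.bbox_computations, boxed.bbox_computations)
```

Both scans return the same matches; only the number of bounding-box computations differs, which is the cost that storing the box ahead of time avoids in this model.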
>
>
> When you start cross-joining tables - a query that does a lot of
> sub-index scans - adding the bounding box significantly improves
> performance. Crossing a 10,000 row table with itself takes about 2
> seconds when there are bounding boxes but about 20 seconds when there are not.
>
> > 4. I think it would be useful to include BBOXes in LWGEOM by default. My
> > thinking here would be that those users who are knowledgeable enough
> > to be concerned about the space saving will more than likely be
> > knowledgeable enough to remove it, whereas new users may get confused
> > when certain queries that used to perform well in PostGIS perform
> > poorly using LWGEOM.
>
> This is a good point.
>
> > 5. As far as I can see, assuming a non-index scan, the LWGEOM operators
> > call the box2d_* functions directly, which are defined using float4s.
> > It looks like this is contrary to the OGC spec since all coordinates
> > (and therefore I would guess operators) are defined as doubles? :(
> >
> > I guess that we would need to maintain a box2d type which uses
> > doubles as well as floats and use this for all the LWGEOM
> > operators/functions (the box2d float4 would still be used for the
> > indexes). Here it would be compulsory to add RECHECK to the operator
> > classes since when expanding the box2d(double) to box2d(float) extra
> > geometries may be returned by an overlap calculation. The RECHECK
> > would ensure that these would be stripped out before the result set
> > was returned.
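
A minimal Python sketch of the float/double issue raised in point 5, under stated assumptions: boxes are plain `(xmin, ymin, xmax, ymax)` tuples, the helpers `f32_down`, `f32_up` and `widen_to_f32` are hypothetical names, and the outward rounding shown here is one way to build a single-precision box that still contains the double-precision one. It is not taken from the LWGEOM source. It shows two boxes that do not overlap in double precision but do once widened to float4, which is the case a recheck has to strip out.

```python
import struct


def _to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def _step(f, toward_positive):
    """Return the float32 value adjacent to f in the requested direction."""
    bits = struct.unpack("<I", struct.pack("<f", f))[0]
    negative = bool(bits & 0x80000000)
    # For negative values a larger bit pattern means a smaller number.
    bits = bits + 1 if negative != toward_positive else bits - 1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_down(x):
    """Largest float32 that is <= x (finite inputs within float32 range)."""
    f = _to_f32(x)
    return f if f <= x else _step(f, toward_positive=False)


def f32_up(x):
    """Smallest float32 that is >= x (finite inputs within float32 range)."""
    f = _to_f32(x)
    return f if f >= x else _step(f, toward_positive=True)


def widen_to_f32(box):
    """Single-precision box guaranteed to contain the double-precision box."""
    xmin, ymin, xmax, ymax = box
    return (f32_down(xmin), f32_down(ymin), f32_up(xmax), f32_up(ymax))


def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Two boxes separated by a gap far smaller than float32 resolution near 1.0.
a = (0.0, 0.0, 1.0000000001, 1.0)
b = (1.0000000002, 0.0, 2.0, 1.0)

coarse_hit = overlaps(widen_to_f32(a), widen_to_f32(b))  # float4 test
exact_hit = overlaps(a, b)                               # double recheck
print(coarse_hit, exact_hit)
```

The coarse test can only produce extra candidates, never miss a real overlap, because each widened box contains its original; the exact comparison afterwards plays the role of the recheck.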
>
> None of the LWGEOM operators are defined by the OGC - they're there
> because the GiST index needs them. When you do a "<geom1> && <geom2>",
> you should actually be calling the GEOS "intersects(geom1,geom2)".
>
> I must admit that the only operator I've actually ever used is the "&&".
> The way BOX2Ds are formed, you'll always get an 'appropriate' answer.
>
> It's a bit more complex for the other operators, but you'll usually get
> the correct answer.
>
> If you want to do things in double-precision, you can create
> double-precision bounding boxes (BOX3D) from lwgeoms "box3d(lwgeom)".
>
> I understand your point, but I think it's a lot of work (mostly
> computation) to do things in double when the single-precision results
> are "good enough".
>
> If people feel strongly on this, it isn't difficult to make the change -
> but it will have to compute the double-precision bounding box every time
> since there's no way to pre-compute it.
>
> > 6. It looks like you can add bboxes and srids to a geometry but not
> > remove them?
>
> Ya - I need to add this ability. I'll add it once we decide if we're
> going to have bounding boxes by default or not.
>
> > Also if I follow the instructions in the README to add bounding boxes
> > to a column:
> >
> > DROP INDEX <lwgeom index name>;
> > UPDATE <table> SET <lwgeom column> = AddBBOX(<lwgeom column>);
> > CREATE INDEX <lwgeom index name> ON <table> USING GIST
> > (<lwgeom column> GIST_LWGEOM_OPS);
> > VACUUM ANALYSE <table>;
> >
> > That would seem to work great. My question is now if I add more
> > geometries to the table, would these new geometries not have a
> > bounding box attached? Would the only way to add bounding boxes to
> > these geometries be by running the update query and reindexing again?
>
> Yes - this is a problem. You'll have a mix of geometries with and
> without bounding boxes.
>
> > Anyway glad to see that this stuff is becoming a reality, and I hope
> > that this will get some discussion started on the list about your work
> > so far :) If I get a chance I will try and setup a test database with
> > some sample data and see how it performs....
>
> You should find that it performs a wee bit slower than postgis, but it
> takes *significantly* less space.
>
> To find out how "big" things are:
>
> 1) vacuum analyse; --- very important!
> 2) SELECT relname, relpages FROM pg_class ORDER BY relpages DESC;
> -- gives size in disk pages (probably 8k)
> You'll notice that there will often be a toast table associated with a
> geometry table if you have "large" geometries.
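
To turn the relpages figures from the query above into sizes, a rough Python sketch follows. The 8192-byte page size is an assumption (the message itself only says "probably 8k"), and the table names and page counts are made-up sample values, not measurements.

```python
PAGE_SIZE_BYTES = 8192  # assumption: 8 kB disk pages, as guessed in the message


def relation_bytes(relpages, page_size=PAGE_SIZE_BYTES):
    """Approximate on-disk size of a relation from its relpages count."""
    return relpages * page_size


def report(rows, page_size=PAGE_SIZE_BYTES):
    """rows: (relname, relpages) pairs; returns (relname, bytes), largest first."""
    sized = [(name, relation_bytes(pages, page_size)) for name, pages in rows]
    return sorted(sized, key=lambda row: row[1], reverse=True)


# Made-up sample rows: a table, its index, and its toast table.
sample = [("roads", 1200), ("pg_toast_12345", 300), ("roads_idx", 450)]
for name, size in report(sample):
    print(f"{name}: {size / (1024 * 1024):.1f} MB")
```

Since a toast table appears as its own row, its pages need to be added to the main table's to get the full footprint of a geometry table.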
>
>
> Thanks for the comments, looks like there are a few things to change.
>
> dave