[fdo-internals] FDO RFC 16 - FDO Provider for SQLite

Kenneth, GEOGRAF A/S ks at geograf.dk
Wed Mar 26 07:24:44 EDT 2008


I see.

I did not know about the column count/index problem for SQLite.
I can see that in a purely MapGuide context, there would be great 
benefit from inital parsing of all geometry.
I was thinking in a more general context, where the database would 
open/close regularly.
In such a context, creating a search tree from scratch seems like a big 
deal.

Thank you for clearing that up.

Regards, Kenneth, GEOGRAF A/S



Traian Stanev skrev:
>
> >>I can't figure out how to extract/compute the BBOX from an SQLite 
> BLOB, without parsing the geometry.
>
>  
>
> You are right, there is no other way to get the bounding box.
>
>
> >>I belive that it is faster to read the BBOX values rather than a 
> potentially large BLOB field.
>
>  
>
> This is debatable. One thing about SQLite is that its read performance 
> is **VERY** sensitive to how many columns there are in the table. If 
> the BBOX columns are the last ones in a 50 column table, and the 
> geometry is the first column, you can probably compute the bounds 
> faster than you can offset all the way to the end of the row. On the 
> other hand, if you put the BBOX columns first, you handicap access to 
> all other columns by 4 columns. Also, in a huge segment of GIS data 
> (namely point data), computing the bounds from the BLOB would be 
> faster than getting it from the BBOX columns.
>
> >>I agree that it would use some space to add the columns, but adding 
> the columns would enable a fast filtering of unused features.
> >>As for the 4 times index, one could merely add index for the x 
> values, and then use sequential filtering for the y values.
>
>  
>
> Yes, such a scheme is possible and you would reduce the spatial search 
> to two B-Tree searches. It also means you end up with a linear worst 
> case search, if your search bounds is really wide but not very tall 
> for example. This sounds like a lousy deal for adding ~40 bytes' worth 
> of stuff to every row, not counting the index tables, of which you 
> need two (one for minx and one for maxx). Index tables would cost 
> something like 10 bytes per row at least (I am not sure exactly how 
> SQLite implements indexes, but you have to figure it has at least the 
> double value in there, plus a row offset for it).
>
>  
>
>
> >> Is your internal spatial index algorithm absolutely without benefit 
> from added BBOX columns?
>
>  
>
> It computes bounding boxes and the spatial index on the fly, during 
> the first table scan after the FDO connection is open. This will play 
> well with FDO clients like MapGuide which keep connections pooled. In 
> my benchmarks, computing bounding boxes has not been a big overhead. 
> The cost of the spatial index in its prototype form, per feature, is 
> roughly 20 bytes in RAM.
>
>  
>
> So to sum up, you can add BBOX columns to your SQLite databases if it 
> helps you with spatial searches. The provider does not need them, so 
> it will not require them. It will also not complain if they are there 
> -- it would treat them as regular columns.
>
>  
>
>  
>
> Traian
>
>  
>
>  
>
>
>
>  
>
>  
>
> *From:* fdo-internals-bounces at lists.osgeo.org 
> [mailto:fdo-internals-bounces at lists.osgeo.org] *On Behalf Of *Kenneth, 
> GEOGRAF A/S
> *Sent:* Tuesday, March 25, 2008 4:54 PM
> *To:* FDO Internals Mail List
> *Subject:* Re: [fdo-internals] FDO RFC 16 - FDO Provider for SQLite
>
>  
>
> I can't figure out how to extract/compute the BBOX from an SQLite 
> BLOB, without parsing the geometry.
> I belive that it is faster to read the BBOX values rather than a 
> potentially large BLOB field.
>
> I agree that it would use some space to add the columns, but adding 
> the columns would enable a fast filtering of unused features.
> As for the 4 times index, one could merely add index for the x values, 
> and then use sequential filtering for the y values.
>
> Is your internal spatial index algorithm absolutely without benefit 
> from added BBOX columns?
>
>
> Regards, Kenneth, GEOGRAF A/S
>
>
>
> Traian Stanev skrev:
>
>  
>
> It's a waste of space to store BBOX. It would cost at least 32 bytes 
> per feature, not counting the indexes.
>
>  
>
> Also, you can easily compute the bounding box and add those 4 columns 
> very quickly when you need them, given that SQLite is pretty fast to 
> read from. Indexing on the 4 BBOX columns would make spatial queries 
> have to compute the intersection of the results of 4 tree searches. 
> SQLite is fast, but it's not **that** fast.
>
>  
>
> Anyway, the nice thing about this format is that you can create the 
> BBOX columns anyway, and use them. The FDO provider will not be using 
> it though, for performance reasons.
>
>  
>
>  
>
> Traian
>
>  
>
>  
>
>  
>
> *From:* fdo-internals-bounces at lists.osgeo.org 
> <mailto:fdo-internals-bounces at lists.osgeo.org> 
> [mailto:fdo-internals-bounces at lists.osgeo.org] *On Behalf Of *Kenneth, 
> GEOGRAF A/S
> *Sent:* Tuesday, March 25, 2008 7:12 AM
> *To:* FDO Internals Mail List
> *Subject:* Re: [fdo-internals] FDO RFC 16 - FDO Provider for SQLite
>
>  
>
> I just re-read the RFC.
>
> What are your reasons for not storing BBOX releated data in the data 
> table?
>
>
>
> I would suggest adding four columns, like minx, maxx, miny, max 
> (perhaps name them via. the geometry_columns table?).
> With that approach, and index on the pairs would move the BBOX query 
> load/optimization into the DBMS.
> BBOX queries are simple, and does not require reading the actual 
> column data.
>
> Your extended spatial index algorithm might read these values to 
> produce the search tree.
>
> Another suggestion, would be to store the search tree in a blob column.
> That would enable a much faster load of data.
>
> To avoid the requirement that all data updaters know the algorithm, it 
> might be legal to just clear the BLOB data on updates.
>
> I wrote a short paper on a similar mechanism some time ago, with a friend:
> http://www.hexad.dk/opensource/spatialdbms.pdf
>
>
>
>
> Regards, Kenneth, GEOGRAF A/S
>
>
>
> Traian Stanev skrev:
>
> Hello,
>
>  
>
> I just posted FDO RFC #16. Fresh off the virtual presses (printed on 
> 100% recycled electrons, for you green-minded folks):
>
>  
>
> http://trac.osgeo.org/fdo/wiki/FDORfc16
>
>  
>
>  
>
> Thanks,
>
> Traian
>
>  
>  
> ------------------------------------------------------------------------
>
>
>   
>  
>  
>   
>  
> _______________________________________________
> fdo-internals mailing list
> fdo-internals at lists.osgeo.org <mailto:fdo-internals at lists.osgeo.org>
> http://lists.osgeo.org/mailman/listinfo/fdo-internals
>   
>  
> ------------------------------------------------------------------------
>
>
>   
>  
> _______________________________________________
> fdo-internals mailing list
> fdo-internals at lists.osgeo.org <mailto:fdo-internals at lists.osgeo.org>
> http://lists.osgeo.org/mailman/listinfo/fdo-internals
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> fdo-internals mailing list
> fdo-internals at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/fdo-internals
>   
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.osgeo.org/pipermail/fdo-internals/attachments/20080326/0483c116/attachment.html


More information about the fdo-internals mailing list