[postgis-users] SQLite and postGIS
P Kishor
punkish at eidesis.org
Tue Apr 8 11:56:13 PDT 2008
(the reply may look screwy because I am temporarily saddled with a
hobbled gmail interface on IE something or the other, and it seems to
have a mind of its own)
I unpacked the shapefiles via a module on CPAN and stored the
coordinates as text strings (well, everything is a string for SQLite)
readymade for a p-in-p algorithm that I adapted from the Wolf book
(Algorithms with Perl or some such title -- Hietanemi et al). IIRC, I
stored all the x-coords as a string in one col, all the y-coords as a
string in another col, and that allowed me to reconstruct the arrays I
needed to do the p-in-p.
Iterating over all the geoms was decidedly silly -- 250k * 7.5m
results in more than a trillion transactions, if my math is correct.
Instead, I stored the bounding box of each shape in separate cols, and
as a first pass, use SQL selects to narrow down the points per poly.
Then ran the p-in-p on each set.
Re. advantages -- well, I first tried working with ArcGIS/SDE. An
existing solution was taking about 8 days from start to finish (a lot
of pre and post work on either side of the overlay), and that is if
the process didn't explode prematurely. My process took about 2.5 to 3
days. I do recall building the SQLite db with a bigger page size (I am
writing all this from memory. I am very far from my computer right
now).
I have briefly looked at the Spatialite solution and that does look
like a very cool start. In fact, it gives me enough incentive to try
and learn C.
I do have one question for you all -- why on earth does Spatialite
(and also PostGIS as well as ArcGIS) store the geometry as a BLOB?
What are the advantages? Is it space? Is it speed? Both? I can't
really understand the reason for all that AsWKT and WKB and AsPoints
and whatever back-and-forth. Why not just store coords as numbers, or
a string of coords as, well, a string of coords? Especially in the
case of PostGIS, PostgreSQL has a col type of arrays, so why not just
use that? SQLite stores everything as strings anyway, so why bother
with the back and forth unless there is speed advantage. I don't care
about the space -- hard disk is cheap.
Many thanks for this conversation. This is very interesting for me. I
would love to see a truly lightweight but truly real geospatial db
solution realized, and I am a committed fan of SQLite.
On 4/8/08, Martin Davis <mbdavis at refractions.net> wrote:
>
>
> P Kishor wrote:
> >
> > I actually wrote a pretty useful point-in-polygon routine using Perl
> > DBD::SQLite unwrapping Shapefiles into a SQLite db and then using
> > SQLite for boundary matching. It was for a very large p-in-p (7.5
> > million points against 250k polys) that ArcGIS was choking over. Works
> > very sweet in Perl/SQLite in about 20% to 30% of the time taken.
> >
> >
> >
> I'm curious about your SQLite spatial solution.
>
> What approach did you use for storing geometry? How did you load the
> shapefiles?
>
> Since SQLite has no spatial index, did you just iterate over all
> geometries for you PIP test? What was your PIP test written in?
>
> What was the advantage of doing this over just using the raw shapefiles
> - was it performance, or ease of data management?
>
> The SQLite Spatial direction seems like an interesting one to pursue for
> just this kind of bulk processing use case.
> >
> >
> >
>
> --
> Martin Davis
> Senior Technical Architect
> Refractions Research, Inc.
> (250) 383-3022
>
>
--
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
More information about the postgis-users
mailing list