[postgis-users] Re: Speeding up SQLiteGIS

Stephen Woodbridge woodbri at swoodbridge.com
Sun Apr 13 21:28:44 PDT 2008


Martin Chapman wrote:
> Stephen,
> 
> 1.  You should use OGR in my opinion for a large list of reasons such as
> comprehensive spatial projection support, format conversion support, quad
> tree indexes and other features. 

Well I do not inherently have an objection to this, but I would prefer 
that my applications do not require GDAL/OGR as a dependency because 
that makes deploying them and order of magnitude more complicated with 
all the additional package dependencies.

> 2.  Yes there is a C and C++ API for OGR and GDAL.  www.gdal.org See their
> example page for where to start writing code.  I would write a simple app
> that uses OGR to access a SQLite database and read out all the geometries of
> a table.  You can also write the code to insert a shape file into SQLite.
> See the source code for ogr2ogr.cpp to see how to do that.  It comes with
> GDAL. 

OK, I can look at this, I was just getting used to accessing SQLite from 
C and was hoping that we would add geometries, which we seem to have 
done, and spatial indexes.

> 3.  Not sure what you mean by "extended SQLite with GIS"?  OGR is exactly
> that if I understand you correctly.  OGR can import\export shapefiles and
> other vectors formats in\out of SQLite.  In doing so it creates a geometry
> column, handles insertion and extraction of OGC WKT geometries and other
> issues when using geometry in a SQLite database.  Use the program ogr2ogr to
> insert a shape file into SQLite and then use a SQLite viewer to inspect the
> contents of the table made by ogr2ogr.  You will see what I mean.  Without
> OGR one would have to code all that stuff and it isn't trivial because you
> need code to handle all types of OGC WKT geometries, etc.    

I'm thinking along the same lines as postGIS but in SQLite, so I can 
access it via C api and SQL. I think having support for SQLite in OGR is 
great. I use ogr2ogr on occasion and it is great for transforming data. 
I just don't understand the logic of why I have to use ogr C-api to get 
access to SQLite. It makes sense to load data through it, but for the 
same reasons I don't use OGR when I'm working in PostGIS, I would think 
that that would also apply to SQlite.

> 4.  Wiki is a really good idea.  Maybe we can just use the GDAL wiki?
> Frank, Farris Bueller, anyone?
> 
> 5.  I think a good improvement to the SQLite OGR driver would be to add
> spatial reference support.  Currently, the OGR SQLite driver does not have
> that support which limits it's functionality as a GIS storage system.  Also,
> OGR is only vector.  I would imagine that storing geo-referenced imagery
> would be useful as well.  This would require adding a new driver for SQLite
> to GDAL as an image provider.  Do I hear "OSGEO summer of code" project?

I'm not sure what you mean by add suport to the SQLite OGR driver. Why 
would you not add support to SQLite directly, and then it would be 
available to OGR, but also every other consumer of SQLite. LIKE:

update spatialtable set the_geom = transform(the_geom, 26717);

Yes, I'm sure OGR needs to be aware of what SQLite supports and have the 
driver modified to support that functionality.

> What is it exactly that you guys want to create?  I am using SQLite and OGR
> for my viewer application but want to use it for a local geospatial catalog
> system as well.  I imagine a lot of what I do with it would be useful for
> other purposes as well.

I want to be able to build applications in C that use SQLite and are 
spatially enabled. I need to do queries like:

select * from geomtable where st_distance(makepoint(x,y), the_geom) < 0.013;

select * from geomtable where expand(makepoint(x,y), 0.013) && the_geom;

where these queries would use a spatial index on the the_geom column.


Best regards,
   -Stephen

> Best regards,
> Martin
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Stephen Woodbridge [mailto:woodbri at swoodbridge.com] 
> Sent: Sunday, April 13, 2008 9:02 PM
> To: Martin Chapman
> Cc: punkish at eidesis.org; 'PostGIS Users Discussion'; 'Richard Greenwood'
> Subject: Re: Speeding up SQLiteGIS
> 
> OK, this is cool and evolving very quickly. So if I just want to write a 
> C program using sqlite and GIS, where do I start.
> 
> Do I need to bring in GDAL/OGR now?
> Is there a C-api for that?
> 
> Does this mean that no one is pursuing just extended SQLite with GIS?
> 
> Sorry, I'm getting lost here.
> 
> Punkish, I think setting up a wiki somewhere and post some examples 
> would be very cool, and would be appreciated.
> 
> -Steve
> 
> Martin Chapman wrote:
>> Punkish,
>>
>> I shared similar code last year for postgresql and mysql but it never made
>> it into the branch.
>>
>> Here is the first change for the GetFeature() function of the sqllite
>> driver.  It is changed to use a new select statement (because no cursors
> in
>> sqlite).  I will pretty it up and send the final version later.
>>
>>
>> 1.  replace OGRSQLiteSelectLayer def in
>> /GDAL/ogr/ogrsf_frmts/sqlite/ogr_sqlite.h
>>
>> /************************************************************************/
>> /*                         OGRSQLiteSelectLayer                         */
>> /************************************************************************/
>>
>> class OGRSQLiteSelectLayer : public OGRSQLiteLayer
>> {
>> 	CPLString			m_sTableName;
>>   public:
>>                         OGRSQLiteSelectLayer( OGRSQLiteDataSource *, 
>>                                               sqlite3_stmt * );
>>                         ~OGRSQLiteSelectLayer();
>>
>>     virtual void        ResetReading();
>>     virtual int         GetFeatureCount( int );
>>
>>     virtual OGRFeature *GetFeature( long nFeatureId );
>>     
>>     virtual int         TestCapability( const char * );
>>
>>     virtual void	ClearStatement();
>> 	virtual void	SetTableName(const char* pszTableName);
>> };
>>
>>
>> 2.  Add the following block of code to
>> /GDAL/ogr/ogrsf_frmts/sqlite/ogrsqlitedatasource.cpp ExecuteSQL() method
> at
>> the end of the function just before the line " return poLayer;";
>>
>> 	std_string sSQLStatement = pszSQLCommand;
>> 	std_string sTableName;
>> 	transform(sSQLStatement.begin(), sSQLStatement.end(),
>> sSQLStatement.begin(), tolower);
>> 	int n = (int) sSQLStatement.find("from");
>> 	if (n != -1)
>> 	{
>> 		sTableName = sSQLStatement.substr(n, sSQLStatement.length()
>> - n);
>> 		replace(sTableName, std_string("from "), std_string(""));
>> 		trim(sTableName, std_string(" "));
>> 		trim(sTableName, std_string("'"));
>>
>> 		n = (int) sTableName.rfind("where");
>> 		if (n != -1)
>> 			sTableName = sTableName.substr(0, n);
>>
>> 		n = (int) sTableName.rfind("order by");
>> 		if (n != -1)
>> 			sTableName = sTableName.substr(0, n);
>>
>> 		n = (int) sTableName.rfind("'");
>> 		if (n != -1)
>> 			sTableName = sTableName.substr(0, n);
>> 		else
>> 		{
>> 			n = (int) sTableName.find(" ");
>> 			if (n != -1)
>> 				sTableName = sTableName.substr(0, n);
>> 		}
>>
>> 		trim(sTableName, string(" "));
>> 		poLayer->SetTableName(sTableName.c_str());
>> 	}
>>
>>
>> 3.  The following code replaces the GetFeature() method in
>> GDAL/ogr/ogrsf_frmts/sqlite/ogrsqliteselectlayer.cpp file.
>>
>> /************************************************************************/
>> /*                          GetFeature()                           */
>> /************************************************************************/
>>
>> OGRFeature *OGRSQLiteSelectLayer::GetFeature( long nFeatureId )
>> {
>> 	iNextShapeId = nFeatureId;
>>
>> 	CPLString osCommand;
>> 	osCommand.Printf("SELECT * FROM %s WHERE OGC_FID = '%d' Limit 1",
>> m_sTableName.c_str(), iNextShapeId);
>>
>> 	sqlite3_stmt* hStmt = NULL;
>> 	if (sqlite3_prepare(poDS->GetDB(), osCommand, (int)
>> osCommand.size(), &hStmt, NULL ) != SQLITE_OK)
>> 		return NULL;
>>
>> 	int rc = sqlite3_step( hStmt );
>> 	if( rc != SQLITE_ROW )
>>     {
>>         ClearStatement();
>>         return NULL;
>>     }
>> 	
>>     int iField;
>>     OGRFeature *poFeature = new OGRFeature( poFeatureDefn );
>>     if( pszFIDColumn != NULL )
>>     {
>>         int iFIDCol;
>>
>>         for( iFIDCol = 0; iFIDCol < sqlite3_column_count(hStmt); iFIDCol++
> )
>>         {
>>             if( EQUAL(sqlite3_column_name(hStmt,iFIDCol),
>>                       pszFIDColumn) )
>>                 break;
>>         }
>>
>>         if( iFIDCol == sqlite3_column_count(hStmt) )
>>         {
>>             CPLError( CE_Failure, CPLE_AppDefined, 
>>                       "Unable to find FID column '%s'.", 
>>                       pszFIDColumn );
>>             return NULL;
>>         }
>>         
>>         poFeature->SetFID( sqlite3_column_int( hStmt, iFIDCol ) );
>>     }
>>     else
>>         poFeature->SetFID( iNextShapeId );
>>
>>     iNextShapeId++;
>>     m_nFeaturesRead++;
>>
>>     if( pszGeomColumn != NULL )
>>     {
>>         int iGeomCol;
>>
>>         for( iGeomCol = 0; iGeomCol < sqlite3_column_count(hStmt);
>> iGeomCol++ )
>>         {
>>             if( EQUAL(sqlite3_column_name(hStmt,iGeomCol),
>>                       pszGeomColumn) )
>>                 break;
>>         }
>>
>>         if( iGeomCol == sqlite3_column_count(hStmt) )
>>         {
>>             CPLError( CE_Failure, CPLE_AppDefined, 
>>                       "Unable to find Geometry column '%s'.", 
>>                       pszGeomColumn );
>>             return NULL;
>>         }
>>
>>         char *pszWKTCopy, *pszWKT = NULL;
>>         OGRGeometry *poGeometry = NULL;
>>
>>         pszWKT = (char *) sqlite3_column_text( hStmt, iGeomCol );
>>         pszWKTCopy = pszWKT;
>>         if( OGRGeometryFactory::createFromWkt( &pszWKTCopy, NULL, 
>>                                                &poGeometry ) ==
> OGRERR_NONE
>> )
>>             poFeature->SetGeometryDirectly( poGeometry );
>>     }
>>
>>     for( iField = 0; iField < poFeatureDefn->GetFieldCount(); iField++ )
>>     {
>>         OGRFieldDefn *poFieldDefn = poFeatureDefn->GetFieldDefn( iField );
>>         int iRawField = panFieldOrdinals[iField] - 1;
>>
>>         if( sqlite3_column_type( hStmt, iRawField ) == SQLITE_NULL )
>>             continue;
>>
>>         switch( poFieldDefn->GetType() )
>>         {
>>         case OFTInteger:
>>             poFeature->SetField( iField, 
>>                 sqlite3_column_int( hStmt, iRawField ) );
>>             break;
>>
>>         case OFTReal:
>>             poFeature->SetField( iField, 
>>                 sqlite3_column_double( hStmt, iRawField ) );
>>             break;
>>
>>         case OFTBinary:
>>             {
>>                 const int nBytes = sqlite3_column_bytes( hStmt, iRawField
> );
>>                 poFeature->SetField( iField, nBytes,
>>                     (GByte*)sqlite3_column_blob( hStmt, iRawField ) );
>>             }
>>             break;
>>
>>         case OFTString:
>>             poFeature->SetField( iField, 
>>                 (const char *) 
>>                 sqlite3_column_text( hStmt, iRawField ) );
>>             break;
>>
>>         default:
>>             break;
>>         }
>>     }
>>
>> #ifdef notdef
>>     if( pszGeomColumn != NULL )
>>     {
>>         int iField = poStmt->GetColId( pszGeomColumn );
>>         const char *pszGeomText = poStmt->GetColData( iField );
>>         OGRGeometry *poGeom = NULL;
>>
>>         if( pszGeomText != NULL )
>>             OGRGeometryFactory::createFromWkt( (char **) &pszGeomText,
>>                                                NULL, &poGeom );
>>         
>>         if( poGeom != NULL )
>>             poFeature->SetGeometryDirectly( poGeom );
>>     }
>> #endif
>>
>> 	if (hStmt)
>> 		sqlite3_finalize(hStmt);
>>
>>     return poFeature;
>> }
>>
>>
>> 4.  Rebuild gdal.  Note that this change only optimizes the
> GetFeature(long
>> nFeatureId) method of a layer opened by a call to ExecuteSQL().  At least
>> random access calls are fast then.  I can also make the change in the
>> OGRSQLiteLayer and ogrsqlitetablelayer classes.  The function will always
>> work for random access calls because WHERE, ORDER BY and all other sql
>> clauses are irrelevant when accessing a row by ID.  Note that
>> GetNextFeature() will still be slow although we could speed up that
> function
>> as well.
>>
>>
>>
>> ALSO here is the changes for MYSQL and POSTGRESQL for optimized random
>> access.
>>
>> MYSQL:
>>
>> /************************************************************************/
>> /*                          GetFeature()                           */
>> /************************************************************************/
>>
>> OGRFeature *OGRMySQLResultLayer::GetFeature( long nFeatureId )
>> {
>> 	iNextShapeId = bHasFid == TRUE ? nFeatureId - 1 : nFeatureId;
>>
>> /* -------------------------------------------------------------------- */
>> /*      Do we need to establish an initial query?                       */
>> /* -------------------------------------------------------------------- */
>>     if( hResultSet == NULL )
>>     {
>>         CPLAssert( pszQueryStatement != NULL );
>>
>>         poDS->RequestLongResult( this );
>>
>>         if( mysql_query( poDS->GetConn(), pszQueryStatement ) )
>>         {
>>             poDS->ReportError( pszQueryStatement );
>>             return NULL;
>>         }
>>
>>         hResultSet = mysql_store_result( poDS->GetConn() );
>>         if( hResultSet == NULL )
>>         {
>>             poDS->ReportError( "mysql_use_result() failed on query." );
>>             return FALSE;
>>         }
>>     }
>>
>> /* -------------------------------------------------------------------- */
>> /*      Fetch absolute record.
>> */
>> /* -------------------------------------------------------------------- */
>>     char **papszRow = NULL;
>>     unsigned long *panLengths = NULL;
>> 	OGRFeature *poFeature = NULL;
>>
>>     if ( mysql_num_rows( hResultSet ) );
>> 	{
>> 		mysql_data_seek( hResultSet, iNextShapeId );
>>
>> 		papszRow = mysql_fetch_row( hResultSet );
>> 		if( !papszRow ) return NULL;
>>
>> 		panLengths = mysql_fetch_lengths( hResultSet );
>> 		if ( !panLengths ) return NULL;
>>
>> /* -------------------------------------------------------------------- */
>> /*      Process record.                                                 */
>> /* -------------------------------------------------------------------- */
>> 		poFeature = RecordToFeature( papszRow, panLengths );
>> 	}
>>
>>     return poFeature;
>> }
>>
>>
>> POSTGRESQL:
>>
>>
>> /************************************************************************/
>> /*                             GetFeature()                             */
>> /************************************************************************/
>>
>> OGRFeature *OGRPGResultLayer::GetFeature( long nFeatureId )
>>
>> {
>> 	if (bHasFid == TRUE && nFeatureId < 1) return NULL;
>> 	PGconn      *hPGConn = poDS->GetPGConn();
>>     CPLString   osCommand;
>> 	iNextShapeId = nFeatureId;
>>
>> 	if (hCursorResult != NULL)
>> 	{
>> 		PQclear( hCursorResult );
>> 		poDS->FlushSoftTransaction();
>> 	}
>>
>> 	poDS->SoftStartTransaction();
>>     osCommand.Printf( "DECLARE %s CURSOR for %s", pszCursorName,
>> pszQueryStatement );
>>     hCursorResult = PQexec(hPGConn, osCommand );
>>
>> 	PQclear( hCursorResult );
>>     osCommand.Printf( "FETCH ABSOLUTE %d in %s", nFeatureId, pszCursorName
>> );
>>     hCursorResult = PQexec(hPGConn, osCommand );
>>     OGRFeature *poFeature = RecordToFeature( 0 );
>>     return poFeature;
>> }
>>
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: P Kishor [mailto:punk.kish at gmail.com] 
>> Sent: Sunday, April 13, 2008 6:56 PM
>> To: Martin Chapman
>> Cc: PostGIS Users Discussion; Richard Greenwood; Stephen Woodbridge
>> Subject: Re: Speeding up SQLiteGIS
>>
>> On 4/13/08, Martin Chapman <mchapman at texelinc.com> wrote:
>>> I'm working on it right now and will send you the code when I finish.
>>>  Basically SQLite is fast and the OGR code that uses sqlite is slow
>> because
>>>  of the way they wrote it.  My changes make OGR use random access versus
>> what
>>>  they do today and scroll from 0 to n for each record requested.  That
>> means
>>>  that row one will return quick but for each sub-sequent request the code
>>>  loops until it finds the record requested.  Very inefficient.  It has
>> speed
>>>  it up by about 50x on the very small shape file I am using to test.
>> Fantastic! Please consider sharing the code with Frank. He is the
>> ultimate custodian for OGR, and we want OGR itself to become better to
>> the extent that it can.
>>
>> In the meantime, Frank, thanks yet again for making this possible (and
>> for help with the OGR syntax).
>>
>> la la la (me happy again).
>>
>>
>>>  Best regards,
>>>  Martin
>>>
>>>  -----Original Message-----
>>>
>>> From: P Kishor [mailto:punk.kish at gmail.com]
>>>  Sent: Sunday, April 13, 2008 6:41 PM
>>>  To: PostGIS Users Discussion
>>>
>>> Cc: Richard Greenwood; Stephen Woodbridge; Martin Chapman
>>>  Subject: Speeding up SQLiteGIS
>>>
>>>  (started a new thread)
>>>
>>>
>>>
>>>  On 4/13/08, Martin Chapman <mchapman at texelinc.com> wrote:
>>>  > I can connect to SQLite using OGR in my application as seen in the
>>>  >  attachment.  It's kind of slow compared to shape but I can optimize
>> the
>>>  OGR
>>>  >  code for SQLite like I did for PostgreSQL and MySQL in my local copy
>> of
>>>  the
>>>  >  gdal code. Then it's as fast as shape.  I really like SQLite as well.
>>>  What
>>>  >  is the address of your WMS server?
>>>
>>>  Martin,
>>>
>>>  I am right now working on my laptop, but will soon put it on a
>>>  publicly accessible server. I am fooling around with a few tricks, and
>>>  now I am getting the entire table being selected *and* rendered in
>>>  about 92 ms. I think that is *very* good (this is a table of all the
>>>  zip code boundaries in the State of Wisconsin, US -- 1121 rows and all
>>>  the river segments -- > 183K rows).
>>>
>>>  Now, could you kindly share the "optimization" of OGR code *for*
>>>  SQLite? Have you considered submitted it back to Frank for possible
>>>  inclusion in the main branch?
>>>
>>>  Also, we could request OSGeo to set up some space for us to share
>>>  SQLite code, tricks, etc. In the meantime, I have bagged sqlitegis.org
>>>  and sqlitegis.com, and would be happy to host a wiki/list or whatever
>>>  crap the dreamhost allows me to host with a single-click install.
>>>
>>>  Seriously, this the first time in years that I have been excited by
>>>  all this mapping business. SQLite is just such an incredibly powerful
>>>  tool, not for its power and performance (which is plenty fast for me)
>>>  but for its incredible simplicity. The fasted program for me is the
>>>  one that allows me to install, setup and program the fastest.
>>>
>>>
>>>  >
>>>  >  Best regards,
>>>  >  Martin
>>>  >
>>>  >
>>>  >  -----Original Message-----
>>>  >  From: postgis-users-bounces at postgis.refractions.net
>>>  >  [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of P
>>>  Kishor
>>>  >  Sent: Sunday, April 13, 2008 1:51 PM
>>>  >  To: PostGIS Users Discussion; Richard Greenwood
>>>  >  Subject: Re: [postgis-users] SQLite and postGIS
>>>  >
>>>  >
>>>  > On 4/13/08, Richard Greenwood <richard.greenwood at gmail.com> wrote:
>>>  >  > I should note that ogr2ogr creates a SQLite spatial table even more
>>>  >  >  easily that SpatiaLite:
>>>  >  >    ogr2ogr -f "SQLite" dest.db source.shp source
>>>  >
>>>  >  Thanks Rich. You made my day. I am drawing a layer with about 185K
>>>  >  rows, and, yes, there is a time lag, but this is out-of-the-box
>>>  >  performance.
>>>  >
>>>  >  One question -- the above ogr2ogr command, for me, does not seem to
>>>  >  cooperate when adding multiple shapefiles to the same db. If I do
> like
>>>  >  so
>>>  >
>>>  >  ogr2ogr -f "SQLite" dest.db source1.shp source2.shp source3.shp
>>>  >
>>>  >  I get a dest.db with 0 Kb. I got around that by creating dest1.db,
>>>  >  dest2.db and so on, and then creating dest.db, attaching the other
>>>  >  dbs, and moving the tables into dest.db via
>>>  >
>>>  >  CREATE TABLE ... AS SELECT * FROM....
>>>  >
>>>  >  This seems a bit goofy, so I am sure I am doing something wrong with
>>>  >  the ogr command. Suggestions?
>>>  >
>>>  >  Another note -- I wonder if this merits starting a separate SQLitegis
>>>  >  mailing list? I am happy here as long as the pg users don't mind.
>>>  >
>>>  >
>>>  >  >
>>>  >  >  Rich
>>>  >  >
>>>  >  >
>>>  >  >
>>>  >  >
>>>  >  >  On Sun, Apr 13, 2008 at 9:26 AM, Richard Greenwood
>>>  >  >  <richard.greenwood at gmail.com> wrote:
>>>  >  >  > On Fri, Apr 11, 2008 at 9:29 PM, Frank Warmerdam
>>>  <warmerdam at pobox.com>
>>>  >  wrote:
>>>  >  >  >  >  Puneet, Rich,
>>>  >  >  >  >
>>>  >  >  >  >  SQLite is already supported as a spatial database by OGR.
>> The
>>>  >  caveat
>>>  >  >  >  >  is that in GDAL 1.5 it is just using a text column with WKT
>>>  >  geometries so
>>>  >  >  >  >  the spatial performance is not great.
>>>  >  >  >  >
>>>  >  >  >  >  To use this with MapServer you would use CONNECTIONTYPE OGR
>> and
>>>  the
>>>  >  >  >  >  CONNECTION string would be the path for the sqlite database.
>>>  The
>>>  >  >  >  >  DATA statement should hold the table name be accessed.
>>>  >  >  >
>>>  >  >  >  Totally cool!
>>>  >  >  >
>>>  >  >  >  I used SpatiaLite (http://www.gaia-gis.it/spatialite/)
>>>  LoadShapefile()
>>>  >  >  >  function to import a shapefile into a SQLite db. The geometries
>> are
>>>  >  >  >  stored as BLOBs in a field named "geom" Then:
>>>  >  >  >    sqlite> alter table ownership add column WKT_GEOMETRY;
>>>  >  >  >    sqlite> update ownership set WKT_GEOMETRY=astext(geom);
>>>  >  >  >
>>>  >  >  >  Getting MapServer to use the SQLite table was very easy. Recent
>>>  >  >  >  versions of MS4W have SQLite support in GDAL. So simply adding
>>>  >  >  >    CONNECTIONTYPE OGR
>>>  >  >  >    CONNECTION "path/to/SQLite.db"
>>>  >  >  >  gets MapServer drawing geometries from SQLite.
>>>  >  >  >
>>>  >  >  >  I'm playing with a table containing about 15,000 polygons and
>>>  >  >  >  performance is fine.
>>>  >  >  >
>>>  >  >  >  Thanks Frank, for pointing me in the right direction.
>>>  >  >  >
>>>  >  >  >  Rich
>>>  >  >  >
>>>
>>>
>>>
>>>  --
>>>  Puneet Kishor http://punkish.eidesis.org/
>>>  Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
>>>  Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
>>>
>>>
> 
> 




More information about the postgis-users mailing list