[gdal-dev] Add a new performance hint for Spatialite

Even Rouault even.rouault at mines-paris.org
Fri Mar 28 03:10:26 PDT 2014


Ah, got it. I read your message too fast. I've added your suggestions with a
simplified version of your below example in
http://trac.osgeo.org/gdal/changeset/27104

Another solution for your use case would be to make a OGR VRT with a
OGRVRTUnionLayer that reference every GML file and convert the VRT in a single
step in spatialite. But we don't have yet a ogrbuildvrt utility to create that
VRT in a user friendly way.

> Hi,
>
> Sorry, I must have not emphasized enough that hint is only valid for
> appending data with subsequent ogr2ogr commands
>
> ogr2ogr -f sqlite -dsco spatialite=yes mtk_tos.sqlite -dim 2 -lco
> spatial_index=no /vsizip/e:\mtk_tos\etrs89\gml\K32.zip tieviiva  -gt 65536
> ogr2ogr -f sqlite -append mtk_tos.sqlite -dim 2
> /vsizip/e:\mtk_tos\etrs89\gml\K34.zip tieviiva -gt 65536
> ... repeat the latter command for 113 more GML files...
>
> I am not sure if this is a common use case everywhere. However, our National
> Land Survey delivers vector data divided by map sheets which means that each
> country wide dataset contains hundreds or thousands of GML files.
>
> Even Rouault wrote:
>
> > Jukka,
> >
> > I'm very surprised that you need to do that explicitely, since the driver
> should
> > already do that by default. This is something we discovered last year or
> the year
> > before when you were experimenting OSM -> Spatialite conversions. Since OGR
> > 1.10, the spatial index is created when the datasource is closed.
> >
> > Look at this debug trace:
> >
> > $ ogr2ogr -f sqlite /vsimem/out.sqlite ../autotest/ogr/data/poly.shp -dsco
> > spatialite=yes --debug on
> > OGR: OGROpen(../autotest/ogr/data/poly.shp/0x69a3d0) succeeded as ESRI
> > Shapefile.
> > SQLITE: SpatiaLite v4 DB found !
> > OGR_SQLITE: exec(CREATE TABLE 'poly' (   OGC_FID INTEGER PRIMARY KEY))
> > OGR_SQLITE: exec(DELETE FROM geometry_columns WHERE f_table_name =
> > 'poly')
> > OGR_SQLITE: exec(SELECT AddGeometryColumn('poly', 'GEOMETRY', 325834,
> > 'POLYGON',
> > 2))
> > OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'area' FLOAT)
> > OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'eas_id' FLOAT)
> > OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'prfedea' VARCHAR(16))
> > OGR_SQLITE: BEGIN Transaction
> > OGR_SQLITE: prepare(INSERT INTO 'poly'
> > ("GEOMETRY","area","eas_id","prfedea")
> > VALUES (?,?,?,?))
> > OGR_SQLITE: COMMIT Transaction
> > OGR2OGR: 10 features written in layer 'poly'
> > OGR_SQLITE: exec(SELECT CreateSpatialIndex('poly', 'GEOMETRY'))
> > SQLITE: Error no such table: layer_statistics
> > OGR: Unloading VirtualOGR module
> > Shape: 10 features read on layer 'poly'.
> >
> > Could try adding --debug on to your ogr2ogr command line and see when
> > CreateSpatialIndex() is created ?
> >
> > Even
> >
> > > Hi,
> > >
> > > I took timings about adding 115 GML files (548 MB together, 3.2
> > > million
> > > linestrings) into a Spatialite table. With default settings the table
> > > gets initialized with spatial index which makes following inserts
> > > slower. Another alternative is to create the table without spatial
> > > index, append all the data first and as a last step create spatial index
> for the
> > ready made table.
> > >
> > > With spatial index on:
> > > Append + index: 71 minutes
> > >
> > > With spatial index off:
> > > Append: 9 minutes
> > > Create spatial index: 6 minutes
> > > Total: 15 minutes
> > >
> > > I was rather happy with the initial conversion speed until I made this
> > > test which revealed that creating spatial index as a final step made
> > > the whole process more than four times faster! This way both data
> > > table and spatial index are probably in contiguous chunks in the
> > > SQLite datafile and there is no need for post process VACUUM.
> > > Vacuuming in SQLite is rather slow and for this 1.3 GB database it takes
> more
> > than 4 minutes to run.
> > >
> > > Suggestion: Add a new performance hint on page
> > > http://www.gdal.org/ogr/drv_sqlite.html
> > >
> > > "If many source files will be collected into the same Spatialite table
> > > it can be much faster to initialize the table without a spatial index
> > > by using -lco SPATIAL_INDEX=NO and to create spatial index with a
> > > separate command after all the data are appended. Spatial index can be
> > > created with ogrinfo command ogrinfo db.sqlite -sql "SELECT
> > > CreateSpatialIndex('table_name','geometry_column_name')"
> > >
> > > Perhaps it could also be mentioned as a performance hint that VACUUUM
> > > can also be done from orginfo as ogrinfo db.sqlite -sql "VACUUM"
> > >
> > > -Jukka Rahkonen-
> > >
> > >
> > >
> > > _______________________________________________
> > > gdal-dev mailing list
> > > gdal-dev at lists.osgeo.org
> > > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> > >
> >
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>




More information about the gdal-dev mailing list