[gdal-dev] Re: Fastest vector format for combining shapefiles

Guillaume Sueur no-reply at neogeo-online.net
Wed Oct 7 03:05:58 EDT 2009


Hi,

Interesting topic!
The most efficient approach is the one that best fits your needs.
Forget the OneShape idea, but I think you can take either a database
approach (PostGIS or Oracle) or a file approach (your thousands
of shapefiles).
It depends on what you have to do with the data and how you will retrieve
it. If you plan to do attribute queries, classifications and filtering,
go for a database, since extracting data quickly is a database's job.
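
For the database route, here is a minimal sketch in Python that only
builds the ogr2ogr command lines for appending each shapefile to a single
PostGIS table. The connection string "PG:dbname=gis", the layer name and
the file names are assumptions for illustration, not values from this
thread; adjust them to your setup.

```python
def postgis_append_commands(shapefiles, layer_name, pg_conn="PG:dbname=gis"):
    """Build one ogr2ogr argument list per shapefile.

    Nothing is executed here. pg_conn and layer_name are hypothetical
    placeholders for your own connection string and target table.
    """
    return [
        [
            "ogr2ogr",
            "-f", "PostgreSQL",     # output driver
            "-update", "-append",   # add to the existing layer, do not recreate it
            "-nln", layer_name,     # same target layer for every map sheet
            pg_conn,                # destination datasource
            str(shp),               # source shapefile
        ]
        for shp in shapefiles
    ]


if __name__ == "__main__":
    # Print the commands; run each with subprocess.run(cmd, check=True)
    # once they look right for your data.
    for cmd in postgis_append_commands(["sheet_001.shp", "sheet_002.shp"], "roads"):
        print(" ".join(cmd))
```
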
If you display/draw/use all of your data at once, the file
approach will be the best. Note that you can optimize shapefiles by
creating a spatial index on each of them (shptree command) and a global
index of your set (with ogrtindex, see
http://mapserver.org/optimization/tileindex.html). The set is much
easier to handle with such a global file pointing to your various files.
You can even index that file with shptree too.
But it can be hard to manage when your data gets updated...
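
The indexing steps above can be sketched in Python as a list of commands
to run: shptree on every shapefile, ogrtindex to build the global index,
then shptree on the index itself. This is only a sketch. The index name
"tileindex.shp" and the batch size are assumptions, and the batching
relies on ogrtindex appending to an index that already exists, which is
worth checking against the documentation of your version.

```python
def tileindex_commands(shapefiles, index_path="tileindex.shp", batch_size=200):
    """Build the shptree / ogrtindex command lines for a set of shapefiles.

    Nothing is executed here. index_path and batch_size are hypothetical
    defaults chosen for illustration.
    """
    files = [str(p) for p in shapefiles]

    # 1. One spatial index (.qix) per shapefile.
    commands = [["shptree", shp] for shp in files]

    # 2. Global tile index, built in batches to keep each command line short.
    for start in range(0, len(files), batch_size):
        commands.append(["ogrtindex", index_path, *files[start:start + batch_size]])

    # 3. Spatial index on the tile index itself.
    commands.append(["shptree", index_path])
    return commands


if __name__ == "__main__":
    # Print the commands; run each with subprocess.run(cmd, check=True)
    # if shptree (MapServer) and ogrtindex (GDAL/OGR) are installed.
    for cmd in tileindex_commands(["sheet_001.shp", "sheet_002.shp"]):
        print(" ".join(cmd))
```

When a map sheet is updated, its own shptree index has to be rebuilt, and
the tile index too if the extent of the sheet changed.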

My 2 cents,

Guillaume

On Wednesday 07 October 2009 at 09:48 +0300, Rahkonen Jukka wrote:
> Jukka Rahkonen writes:
> 
> > 
> > Hi,
> > 
> > I am combining some GIS data where each layer is divided into around
> > a thousand separate shapefiles by map sheet. Now I would like to
> > store all 35000 shapefiles in something that is easier to handle. At
> > first, pushing each layer to its own Spatialite database felt
> > perfect, but I have problems with one layer which has rather a lot of
> > data. Appending shapefiles one by one to a Spatialite database gets
> > too slow after the database file has reached a size of around 6
> > gigabytes. Up to a file size of 3-4 gigabytes, appending data to
> > Spatialite is pretty fast, and because it is a database I guess I
> > will use it for the small layers. But what might be the fastest
> > vector format supported by OGR for collecting the big layer (a
> > thousand shapefiles with a total size of about 10 gigabytes)
> > together? I would prefer some file-based format because the data goes
> > to long-term storage, but I can use Oracle or PostGIS in between if
> > it is faster to do the conversion in two steps. What is recommended?
> > Shapefiles, MapInfo tab, Oracle, PostGIS or something else?
> 
> I can tell now that the shapefile format is not suitable at all. The
> .shp part evidently cannot go over the 2 GB limit, because after that
> ogr2ogr throws this error message:
> ERROR 1: Error in psSHP->sHooks.FSeek() or fwrite() writing object to
> .shp file.  
> 
> The .dbf seems not to have such a size limit, because it grew up to 32
> gigabytes.
> I will try MapInfo tab before I conclude that it is simply best to keep
> the 1000 shapefiles or upload them all to PostGIS.
> 
> -Jukka-
