[gdal-dev] Re: Fastest vector format for combining shapefiles

Rahkonen Jukka Jukka.Rahkonen at mmmtike.fi
Wed Oct 7 04:09:37 EDT 2009


Hi,

Maybe I need to clarify my aim a bit. This is a huge dataset with hundreds of layers, and we do not actively use all of them.  The layers we need we will load into an Oracle database, but that is a managed, hosted production system, and it is far too expensive to use as a backyard storage shed.  What I need is a handy storage from which I can easily take out the layers I need. The possibility to use a spatial window for the excerpt would be a nice benefit. Up to a file size of 4-6 gigabytes, Spatialite seems to be about the optimal solution: it is a real database that supports queries, but all the data is still stored in one transferable file. That is much more complicated with, say, Oracle or PostGIS; you can't just write the database on a CD or DVD and send it to your customer.
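As a sketch of that spatial-window excerpt, ogr2ogr can clip one layer out of a Spatialite file with its -spat bounding-box option; the file name, layer name and coordinates below are made-up placeholders:

```shell
# Skip the sketch when GDAL's command-line tools are not installed.
command -v ogr2ogr >/dev/null || exit 0
# No sample database ships with this sketch; skip if it is absent.
[ -f layers.sqlite ] || exit 0

# Copy the "roads" layer into a shapefile, keeping only features that
# intersect the bounding box (xmin ymin xmax ymax).
ogr2ogr -f "ESRI Shapefile" excerpt.shp layers.sqlite roads \
        -spat 380000 6670000 400000 6690000
```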

Most probably I will split the big layer into two or three chunks and write them to separate Spatialite files.
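A sketch of such an append loop with ogr2ogr (paths and layer name are placeholders): the -gt option groups many features per transaction, which is the usual remedy when one-commit-per-feature appends to an SQLite/Spatialite file become slow.

```shell
# Skip the sketch when GDAL's command-line tools are not installed.
command -v ogr2ogr >/dev/null || exit 0

DB=big_layer.sqlite
for shp in tiles/*.shp; do
  [ -e "$shp" ] || continue        # no sample tiles in this sketch
  if [ ! -f "$DB" ]; then
    # The first shapefile creates the Spatialite database and the layer.
    ogr2ogr -f SQLite -dsco SPATIALITE=YES -gt 65536 \
            "$DB" "$shp" -nln big_layer
  else
    # Later shapefiles are appended to the same layer.
    ogr2ogr -append -gt 65536 "$DB" "$shp" -nln big_layer
  fi
done
```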

-Jukka-

Guillaume Sueur wrote:

> 
> Hi,
> 
> Interesting topic!
> The most efficient way is the one that fits your needs best.
> Forget the single-big-shapefile idea, but I think you can have either
> a database approach (PostGIS or Oracle) or a file approach (your
> thousands of shapefiles).
> It depends on what you have to do with the data and how you will
> retrieve it. If you plan to do attribute queries, classifications
> and filtering, go for a database, as extracting data quickly is a
> database's job.
> If you display/draw/use all the content of your data at once, the
> file approach will be best. Note that you can optimize shapefiles by
> creating a spatial index on them (the shptree command) and a global
> index of your set (with ogrtindex, see
> http://mapserver.org/optimization/tileindex.html). It will be much
> easier to work with such a global file pointing to your various
> files. You can even index that one with shptree too.
> But it can be hard to manage when your data gets updated...
> 
> My 2 cents,
> 
> Guillaume
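A sketch of the shptree/ogrtindex workflow Guillaume describes (the tiles/ directory and output names are placeholders):

```shell
# Skip the sketch when the MapServer/GDAL tools are not installed.
command -v shptree  >/dev/null || exit 0
command -v ogrtindex >/dev/null || exit 0
[ -d tiles ] || exit 0             # no sample shapefiles in this sketch

# Build a .qix spatial index next to every shapefile...
for shp in tiles/*.shp; do
  shptree "$shp"
done

# ...then build one tile-index shapefile whose records point at the
# individual files (usable as a MapServer TILEINDEX).
ogrtindex tileindex.shp tiles/*.shp
```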
> 
> On Wednesday, 7 October 2009 at 09:48 +0300, Rahkonen Jukka wrote:
> > Jukka Rahkonen writes:
> > 
> > > 
> > > Hi,
> > > 
> > > I am combining some GIS data where each layer is divided into
> > > around a thousand separate shapefiles by map sheet. Now I would
> > > like to store all 35,000 shapefiles in something that is easier
> > > to handle. At first, pushing each layer into its own Spatialite
> > > database felt perfect, but I have problems with one layer that
> > > has rather a lot of data. Appending shapefiles one by one to a
> > > Spatialite database gets too slow after the database file has
> > > reached a size of around 6 gigabytes. Up to a 3-4 gigabyte file
> > > size, appending data to Spatialite is pretty fast, and because
> > > it is a database I guess I will use it for the small layers.
> > > But what might be the fastest vector format that OGR supports
> > > for collecting the big layer (a thousand shapefiles with a
> > > total size of about 10 gigabytes) together?  I would prefer a
> > > file-based format because the data goes to long-term storage,
> > > but I can use Oracle or PostGIS in between if it is faster to
> > > do the conversion in two steps.  What is recommended?
> > > Shapefiles, MapInfo tab, Oracle, PostGIS or something else?
> > 
> > I can tell now that the shapefile format is not suitable at all.
> > The .shp part obviously cannot go over the 2 GB limit, because
> > after that ogr2ogr throws these error messages:
> > ERROR 1: Error in psSHP->sHooks.FSeek() or fwrite() writing
> > object to .shp file.
> > The .dbf seems not to have such a size limit, because it grew up
> > to 32 gigabytes.
> > I will try MapInfo tab before I believe it is just best to keep
> > the 1000 shapefiles or upload them all to PostGIS.
> > 
> > -Jukka-
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> > 
> 
> 


More information about the gdal-dev mailing list