[postgis-users] Shp2pgsql : Error in fread()

Hugues François hugues.francois at irstea.fr
Sun May 26 13:06:36 PDT 2013


Hello,

Thanks for your answer.

Your advice to use ogr2ogr is an option I will keep in mind for future cases. For now, I chose to go back a step in the process and to build the data within PostGIS instead of importing it.

Work in progress... updates take some time when they apply to millions of rows!

Hugues.

-----Original Message-----
From: postgis-users-bounces at lists.osgeo.org [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Donovan Cameron
Sent: Thursday, May 23, 2013 8:18 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Shp2pgsql : Error in fread()

Hey Hugues,

I was recently trying to load large shapefiles and a FileGDB into PostGIS 2.0 too.

I had a 170+ million point file (FileGDB) and a 1+ million record polygon shapefile.

Using shp2pgsql would get only so far and then crash for me as well, with similar errors.

I found that I could instead make use of the PGDump driver
(http://www.gdal.org/ogr/drv_pgdump.html) from GDAL!
I also used some other parameters and layer creation options (-lco) to make it work smoothly.

ogr2ogr --config PG_USE_COPY YES -gt 65536 -f PGDump output.sql input.shp -lco GEOMETRY_NAME=geom -lco SCHEMA=schema -lco CREATE_SCHEMA=OFF -lco SPATIAL_INDEX=OFF

Instead of making a .sql file, the bottom of the drv_pgdump page has a nice example of how to pipe the output directly into psql if you want to try that.
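Something along these lines should work, writing to GDAL's /vsistdout/ and piping straight into psql (untested from memory, and mydb is a placeholder database name, so adjust the connection options to your setup):

ogr2ogr --config PG_USE_COPY YES -gt 65536 -f PGDump /vsistdout/ input.shp -lco GEOMETRY_NAME=geom -lco SCHEMA=schema -lco CREATE_SCHEMA=OFF -lco SPATIAL_INDEX=OFF | psql -d mydb -f -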

# use COPY instead of INSERT commands (speeds it up)
PG_USE_COPY YES
# amount of records to insert on each iteration
-gt 65536
# name the output geom field "geom" instead of "wkb_geometry", replace if needed
GEOMETRY_NAME=geom
# update schema with your actual output schema
SCHEMA=schema
# do not create the schema (it probably exists already! omit this option if the schema needs to be created)
CREATE_SCHEMA=OFF
# don't create the GiST index on the geom field, speeds up the process significantly (we do this manually after!)
SPATIAL_INDEX=OFF

Once the .sql file is made, you can load it into your PostGIS database with psql, or just do the pipe trick instead of making another output file (saves disk space and time!)

I also found that loading the file from within the psql terminal works just as well, and it gets you to the place you need to be to run the GiST index creation afterwards.

In psql, once connected to the desired database, use the \i command:
\i C:/Path/to/output.sql
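You can also run the file non-interactively from the OS shell with psql's -f option, which does the same thing (mydb and username are placeholders for your own connection details):

psql -d mydb -U username -f C:/Path/to/output.sql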

But you will then have to create the GiST index on the geom field from psql or pgAdmin with a query like this:

CREATE INDEX tablename_geom_gist
  ON schema.tablename
  USING gist
  (geom);

Replace geom, schema and tablename as needed.
Set up the needed read/write permissions and you are good to go!
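For example, something like this (my_app_role is just a placeholder role name, and the ANALYZE simply refreshes the planner statistics so the new index gets picked up):

-- my_app_role is a placeholder; grant what your users actually need
GRANT SELECT, INSERT, UPDATE, DELETE ON schema.tablename TO my_app_role;
-- refresh planner statistics so the new GiST index gets used
ANALYZE schema.tablename;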



Donovan

On Tue, May 21, 2013 at 12:43 PM, Hugues François <hugues.francois at irstea.fr> wrote:
> Hello,
>
> Thanks for your answer. It took me some time to reply because I tried to run tests splitting the shapefile: it takes quite a long time to work with millions of polygons.
>
> If I'm not wrong, the ticket you linked to is about dbf size and was fixed in 1.5.3. On my side, the .shp is indeed more than 2 GB, but the .dbf file is near 1.5 GB and I work with PostGIS 2.0.1.
>
> However, I tried to select the first 10,000,000 rows and then the remaining ones within ArcGIS to export the data as new shapefiles. Although there was no error when I exported the data (or when I reopened it), this time I wasn't able to export the second part of the table any further than I could import it into PostGIS.
>
> In the end, I don't know exactly where the problem is, but I think it's on the ArcGIS side.
>
> Hugues.
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

