[postgis-devel] shp2pgsql transactions
Michael Orlitzky
michael at orlitzky.com
Sun Oct 18 19:56:04 PDT 2009
I see that the shp2pgsql utility is adding END/BEGIN transaction
delimiters once for every 250 INSERT statements.
I am attempting to import the TIGER/Line road data, and have noticed
that the line identifiers (tlid) are duplicated across county
boundaries. The result is that some roads and their associated
geometries are present in the database multiple times. I imagine this
will cause problems in the future for e.g. k-shortest path, and so would
like to eliminate the duplicates. I see two options:
1. Find and eliminate the duplicates in the DB. Would be terribly slow
   with enough data.
2. Prevent the duplicates from being inserted with a unique index. Also
   slow, but better than the first option.
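Of these, the second seems more desirable. But, to do so, I would need
to insert the rows one at a time outside of a transaction, since a
unique violation inside a BEGIN/END block aborts the whole batch and
not just the offending row. Right now, I'm simply filtering the
BEGIN/END lines out of the shp2pgsql output with sed. This works, but
is slower than necessary.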
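For illustration, here is a rough Python equivalent of that filtering
step. It is only a sketch: the function name strip_transactions is made
up for this example, and it assumes that each transaction delimiter
(BEGIN, END, or COMMIT) appears alone on its own line, which may not hold
for every shp2pgsql version or output mode.

```python
import re

# Matches a line consisting solely of a transaction delimiter,
# e.g. "BEGIN;" or "END;". Assumption: each delimiter sits on its own line.
_TXN_LINE = re.compile(r"^\s*(BEGIN|END|COMMIT)\s*;\s*$", re.IGNORECASE)


def strip_transactions(lines):
    """Yield every input line except standalone BEGIN/END/COMMIT statements."""
    for line in lines:
        if not _TXN_LINE.match(line):
            yield line
```

Used as a pipe filter (for example by calling
sys.stdout.writelines(strip_transactions(sys.stdin)) in a small script
placed between shp2pgsql and psql), each INSERT then runs in its own
implicit transaction under psql's default autocommit, so a row rejected
by the unique index fails by itself without discarding its neighbours.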
Would there be interest in a feature request or patch to make the
transactions optional?