[gdal-dev] ogr2ogr reprojection, features are not transformed
etourigny.dev at gmail.com
Thu Nov 17 18:09:39 EST 2011
On Wed, Nov 16, 2011 at 5:09 PM, Even Rouault
<even.rouault at mines-paris.org> wrote:
>> It seems that setting source srs is needed when using shapefiles, as
>> you said. This should be documented somewhere (probably on the
>> ogr2ogr page and/or shapefile driver page).
> Feel free to add a warning. Logically, this should be more in the shapefile
> driver page. But this assumes that people actually read docs, which is dubious
It would be nice to put it in the ogr2ogr page too, but I understand
you wouldn't want to put format-specific stuff in there.
I'll update the shapefile driver docs.
>> The more I use shapefiles the more I see the limitation in this file
>> format, and am quite puzzled as to why it is still so widespread...
> Yes, shapefiles suffer from a lot of deficiencies (limitations of the dbf format, no
> native - documented - spatial indexing, prj files, ...) You might experiment
> with spatialite which is far more capable, but still less widespread.
>> Any other ideas on how we can fix this?
>> Here is how I think it could be done:
>> 1- for all EPSG projections, generate its ESRI WKT (and perhaps a few
>> 2- make a mapping from ESRI WKT (or its hash) to EPSG codes
>> 3- use the hash mapping to find the EPSG code from a given WKT.
>> Does this make sense?
>> An obvious hurdle is that WKTs can have small variations.
>> For example,
>> EPSG:4618 as output by GDAL:
>> $ gdalsrsinfo -o wkt_esri EPSG:4618
>> whereas an example file (brazil.prj) has:
>> however, GDAL can deal with these variations:
>> $ gdalsrsinfo -o wkt_esri ESRI::brazil.prj
> The conversion between GDAL WKT and ESRI WKT belongs to the field of
> experimental science certainly. There are some known rules, but a lot of
> particular cases, some still remaining to be unearthed. The version of
> ogr_srs_esri.cpp in 1.8-esri branch is far more complicated than the one in
Is this going to be merged into trunk eventually?
> As far as your above algorithm is concerned, I'm wondering how it could work,
> with the variations you gave above. Perhaps a statistical approach with fuzzy
> string matching would give better results than something based on hashing ;-)
> More seriously, I think that a campaign of collecting a lot of .PRJ files
> (ideally coming from ESRI software, and not produced by GDAL) would be needed
> first to see which rules can work in practice.
I have been playing around a bit, and here is what worked on a first try:
- take a given CRS definition (from, say, an EPSG code or a .prj file) and find
its ESRI WKT or "simple" WKT.
- for all the EPSG codes in pcs.csv and gcs.csv, get the ESRI (or
simple) WKT, and compare that to the target WKT
- if a WKT matches, get the full WKT corresponding to the
EPSG code that matches.
The problem is that it's pretty inefficient as you can imagine, taking
a few seconds to find one single target.
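
A minimal sketch of this first try, in Python. The function name and the
wkt_for_code callable are hypothetical; with the GDAL Python bindings the
callable could plausibly be built on osr.SpatialReference (ImportFromEPSG,
MorphToESRI, ExportToWkt), but here it is passed in so the sketch stays
self-contained.

```python
from typing import Callable, Iterable, Optional


def find_epsg_bruteforce(
    target_wkt: str,
    codes: Iterable[int],
    wkt_for_code: Callable[[int], Optional[str]],
) -> Optional[int]:
    """Return the first code whose WKT equals target_wkt, else None.

    wkt_for_code is a hypothetical helper that regenerates the WKT (ESRI or
    "simple" flavor) for one EPSG code. Calling it for every code on every
    lookup is what makes this approach slow.
    """
    for code in codes:
        wkt = wkt_for_code(code)
        if wkt is not None and wkt == target_wkt:
            return code
    return None
```

The comparison is an exact string match, so the target has to be converted
to the same WKT flavor before it is passed in.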
A second iteration:
- generate full WKT, ESRI WKT and "simple" (StripCT) WKT for all EPSG
codes in pcs.csv and gcs.csv
- save these to a flat (gzipped) file in csv form
- use these tables to find the EPSG code that matches a given WKT (in
whatever WKT flavor you need)
This is rather efficient in terms of processing time.
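
Roughly, the second iteration could look like the sketch below. The column
names and function names are assumptions made for illustration, not the
layout of any existing GDAL data file; generating the three WKT flavors for
each code is left out.

```python
import csv
import gzip
from typing import Dict, Iterable, Optional

# Hypothetical column layout: one row per EPSG code, one column per WKT flavor.
FIELDS = ["epsg", "wkt", "wkt_esri", "wkt_simple"]


def write_table(path: str, rows: Iterable[Dict[str, object]]) -> None:
    """Save the precomputed WKT strings to a gzipped CSV file."""
    with gzip.open(path, "wt", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def find_epsg(path: str, target_wkt: str, flavor: str = "wkt_esri") -> Optional[int]:
    """Scan the table once and return the code whose `flavor` column matches."""
    with gzip.open(path, "rt", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row[flavor] == target_wkt:
                return int(row["epsg"])
    return None
```

The csv module takes care of quoting, which matters because WKT strings
contain both commas and double quotes.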
I thought that a hashing method could decrease the time to find a
matching string, but it probably would not: the entire dataset has to be
loaded anyway, so hashing does not pay off when the table is scanned only once.
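
To illustrate the trade-off, here is a small sketch with hypothetical
function names. Building a hash index (a Python dict) reads every row, just
like a single scan does, so it only helps when the same table serves many
lookups.

```python
from typing import Dict, Iterable, Optional, Tuple


def scan_once(rows: Iterable[Tuple[int, str]], target_wkt: str) -> Optional[int]:
    """Single lookup: stop at the first row whose WKT matches."""
    for code, wkt in rows:
        if wkt == target_wkt:
            return code
    return None


def build_index(rows: Iterable[Tuple[int, str]]) -> Dict[str, int]:
    """Many lookups: read every row once and key the codes by WKT string."""
    index: Dict[str, int] = {}
    for code, wkt in rows:
        # If two codes share the same WKT, keep the first one seen.
        index.setdefault(wkt, code)
    return index
```

After build_index(), each lookup is a plain index.get(target_wkt).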
This works for all EPSG codes I tried (think of it as a reverse EPSG
lookup), and also a few .prj files.
A problem I encountered was the differences in significant digits in
the ESRI-WKT and OGC-WKT, so for now it works best if warping to ESRI
I will file a bug about this, concerning the shapefile driver, and
also incorporate this into the gdalsrsinfo utility (with a new "EPSG"
Should I create a sandbox for an experimental gdalsrsinfo util
implementing this idea?
I found a few "fuzzy string" algorithms floating around; the idea is
not bad, but it could be computationally expensive. It could serve as a
fallback if direct string matching fails.
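
A sketch of that fallback, using difflib from the Python standard library
as one possible fuzzy matcher (not necessarily one of the algorithms
mentioned above). The function name and the 0.9 cutoff are assumptions, and
the cutoff would need tuning against real .prj files.

```python
import difflib
from typing import Dict, Optional, Tuple


def match_wkt(
    target_wkt: str, table: Dict[str, int], cutoff: float = 0.9
) -> Optional[Tuple[int, float]]:
    """Return (code, similarity) for the best match, or None below cutoff.

    table maps WKT strings to EPSG codes. An exact match is tried first;
    the fuzzy pass compares the target against every entry, which is the
    expensive part.
    """
    if target_wkt in table:
        return table[target_wkt], 1.0

    best_code, best_ratio = None, 0.0
    for wkt, code in table.items():
        # autojunk=False: long WKT strings repeat characters such as '"' and
        # ',' a lot, and the default heuristic would ignore them.
        ratio = difflib.SequenceMatcher(None, target_wkt, wkt, autojunk=False).ratio()
        if ratio > best_ratio:
            best_code, best_ratio = code, ratio

    if best_code is not None and best_ratio >= cutoff:
        return best_code, best_ratio
    return None
```

Returning the similarity along with the code lets the caller decide whether
to trust a near match, for example one that differs only in the number of
significant digits.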
> Another point to keep in mind is that the TOWGS84 parameters proposed by GDAL
> do not always meet with consensus. The GRASS developers are not particularly happy
> with that : they would prefer that a list of possible transformations would be
> proposed when EPSG lists several of them, instead of just one picked up. See
That's interesting also. So what is best, using the TOWGS84 params
that GDAL chooses, or using none at all (as happens in this case)?
> Best regards,