[gdal-dev] ORA-01406 error when reading many attributes from Oracle [SOLVED]

Jukka Rahkonen jukka.rahkonen at mmmtike.fi
Fri Nov 2 01:15:42 PDT 2012


Jukka Rahkonen <jukka.rahkonen <at> mmmtike.fi> writes:

> 
> Jukka Rahkonen <jukka.rahkonen <at> mmmtike.fi> writes:
> 
> > I suspect that the reason for the trouble is that this field is a
> > 17 character wide VARCHAR2 and I have in the data values like
> > "ÖVRE SÖDERGÄRDAN ". Database is using UTF-8 and Ö,Ö and Ä are
> > taking more than one byte each. Perhaps OCI driver develops come
> > from some ASCII country and did not bother to think about Oracle's
> > character and byte semantics throughly. It seems somehow fuzzy for
> > me even after reading this article
> > http://myorastuff.blogspot.fi/2009/02/character-and-byte-
> > semantics-in-oracle.html
> 
> I can repeat the error with a minimal one-row test table having a field
> NAME VARCHAR2(6)
> and value ÄäÖöÅå


I was following a wrong track and I think I need to add one more mail to t
his thread if someone happens to read it later.

I had discoved earlier with the trial and error method that I could write
"äöåÄÖÅ" characters into this Oracle database right with ogr2ogr by setting
the Windows environment variable as SET NLS_LANG=finnish_finland.utf-8

That made me think that it was the correct setting and good to be used 
also for reading data with ogr2ogr. That was not the case.

The correct NLS_LANG for the database is really "finnish_finland.ISO8859-P15"
By setting the environment to use that ogr2ogr is reading all the data from 
Oracle. However, I must do the character encoding conversion as another 
process because ogr2ogr cannot handle it.  

This means that I am totally happy because I have not found any way yet 
to make ogr2ogr to write the non-ASCII characters for example into Spatialite 
correctly. Direct writing leads to carbage in the Spatialite db. Somehow 
usable workaround is to write a temporary GML file first by using 
--config ORG_FORCE_ASCII=NO. By doing this the resulting GML file will 
be in ISO-8859-15. However, ogr2ogr writes always into the first line of 
GML files that the character encoding is UTF-8. That must be corrected 
by hand for making a valid GML file which can then be converted correctly 
into Spatialite. Not a big deal really but some of the resulting GML 
files are 2-3 GB in size and plain opening and saving the file takes 
some time. While wasting time for this I have been thinking about two options:
- For writing out GML user could have an option to set the character 
  encoding manually
- For reading GML user could give an option to treat the encoding as 
  something else than what the file itself is advertising.
However, I think that the real solution would be to have some common OGR 
wide way to handle character encodings and conversions instead of different 
implementations for each driver.

-Jukka Rahkonen-







More information about the gdal-dev mailing list