[gdal-dev] ORA-01406 error when reading many attributes from Oracle [SOLVED]
Jukka Rahkonen
jukka.rahkonen at mmmtike.fi
Fri Nov 2 01:15:42 PDT 2012
Jukka Rahkonen <jukka.rahkonen <at> mmmtike.fi> writes:
>
> Jukka Rahkonen <jukka.rahkonen <at> mmmtike.fi> writes:
>
> > I suspect that the reason for the trouble is that this field is a
> > 17 character wide VARCHAR2 and I have in the data values like
> > "ÖVRE SÖDERGÄRDAN ". Database is using UTF-8 and Ö,Ö and Ä are
> > taking more than one byte each. Perhaps OCI driver develops come
> > from some ASCII country and did not bother to think about Oracle's
> > character and byte semantics throughly. It seems somehow fuzzy for
> > me even after reading this article
> > http://myorastuff.blogspot.fi/2009/02/character-and-byte-
> > semantics-in-oracle.html
>
> I can repeat the error with a minimal one-row test table having a field
> NAME VARCHAR2(6)
> and value ÄäÖöÅå
I was following a wrong track and I think I need to add one more mail to t
his thread if someone happens to read it later.
I had discoved earlier with the trial and error method that I could write
"äöåÄÖÅ" characters into this Oracle database right with ogr2ogr by setting
the Windows environment variable as SET NLS_LANG=finnish_finland.utf-8
That made me think that it was the correct setting and good to be used
also for reading data with ogr2ogr. That was not the case.
The correct NLS_LANG for the database is really "finnish_finland.ISO8859-P15"
By setting the environment to use that ogr2ogr is reading all the data from
Oracle. However, I must do the character encoding conversion as another
process because ogr2ogr cannot handle it.
This means that I am totally happy because I have not found any way yet
to make ogr2ogr to write the non-ASCII characters for example into Spatialite
correctly. Direct writing leads to carbage in the Spatialite db. Somehow
usable workaround is to write a temporary GML file first by using
--config ORG_FORCE_ASCII=NO. By doing this the resulting GML file will
be in ISO-8859-15. However, ogr2ogr writes always into the first line of
GML files that the character encoding is UTF-8. That must be corrected
by hand for making a valid GML file which can then be converted correctly
into Spatialite. Not a big deal really but some of the resulting GML
files are 2-3 GB in size and plain opening and saving the file takes
some time. While wasting time for this I have been thinking about two options:
- For writing out GML user could have an option to set the character
encoding manually
- For reading GML user could give an option to treat the encoding as
something else than what the file itself is advertising.
However, I think that the real solution would be to have some common OGR
wide way to handle character encodings and conversions instead of different
implementations for each driver.
-Jukka Rahkonen-
More information about the gdal-dev
mailing list