Issues with SDE and Unicode
Russell de Grove
russell at GOISC.COM
Tue Feb 20 11:48:57 PST 2007
I have map layers in ArcSDE on Sql Server 2005 and I have been trying to
label features from a field with Unicode data (type nvarchar).
To get around the "Unknown SDE column type" error, I had to add the
following to the sdeGetRecord function in mapsde.c, inside the
"switch(itemdefs[i].sde_type)" block:
#ifdef SE_NSTRING_TYPE
      case SE_NSTRING_TYPE:
        shape->values[i] = (char *)malloc((itemdefs[i].size + 1) * sizeof(unsigned short));
        status = SE_stream_get_nstring(sde->stream,
                                       (short) (i+1),
                                       (unsigned short *)shape->values[i]);
        if(status == SE_NULL_VALUE)
          ((unsigned short *)shape->values[i])[0] = (unsigned short)0; /* empty string */
        else if(status != SE_SUCCESS) {
          sde_error(status, "sdeGetRecord()", "SE_stream_get_nstring()");
          return(MS_FAILURE);
        }
        break;
#endif
So far, so good, but I only see the first character of each label. If I explicitly
include a Unicode "preamble", I see two garbage characters followed by the
first expected characters. As it happens, my data is in UTF-16 and my
characters are all ASCII-type characters that use only the low byte. I believe
what is causing my problem is the "msGetEncodedString" method in mapgd.c.
char *msGetEncodedString(const char *string, const char *encoding)
{
#ifdef USE_ICONV
  iconv_t cd = NULL;
  char *in, *inp;
  char *outp, *out = NULL;
  size_t len, bufsize, bufleft, status;

  cd = iconv_open("UTF-8", encoding);
  if(cd == (iconv_t)-1) {
    msSetError(MS_IDENTERR, "Encoding not supported by libiconv (%s).",
               "msGetEncodedString()", encoding);
    return NULL;
  }

  len = strlen(string);
  /* Problem point: strlen will return the count up to the first null byte,
     so "Shape #0" as Unicode will return 1 for the S stored little-endian,
     or 3 if a Unicode "preamble" is used. */
  bufsize = len * 4;
  in = strdup(string);
  inp = in;
  out = (char*) malloc(bufsize);
  if(in == NULL || out == NULL){
    msSetError(MS_MEMERR, NULL, "msGetEncodedString()");
    msFree(in);
    iconv_close(cd);
    return NULL;
  }
  strcpy(out, in);
  outp = out;

  bufleft = bufsize;
  status = -1;

  while (len > 0){
    status = iconv(cd, (const char**)&inp, &len, &outp, &bufleft);
    /* Problem point: since this expects byte pairs, a byte length of 1 or 3
       is going to cause problems. */
    if(status == -1){
      msFree(in);
      msFree(out);
      iconv_close(cd);
      /* Problem point: since there was a problem, strdup returns the
         original "string" up to the first null byte... so I get "S",
         possibly with a couple of preceding garbage characters if I used
         a preamble. */
      return strdup(string);
    }
  }
  out[bufsize - bufleft] = '\0';

  msFree(in);
  iconv_close(cd);

  return out;
#else
  msSetError(MS_MISCERR, "Not implemeted since Iconv is not enabled.",
             "msGetEncodedString()");
  return NULL;
#endif
}
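
The strlen behaviour noted in the first problem point can be reproduced
outside MapServer. The following is only a minimal sketch, assuming the label
text is UTF-16LE as in my data; the array names and byte_length_via_strlen()
are made up for illustration and are not part of MapServer:

```c
#include <stddef.h>
#include <string.h>

/* "Shape #0" as UTF-16LE: every ASCII character is followed by a zero
   high byte, and the string ends with a two-byte terminator. */
const unsigned char shape_utf16le[] = {
    'S', 0, 'h', 0, 'a', 0, 'p', 0, 'e', 0, ' ', 0, '#', 0, '0', 0,
    0, 0
};

/* The same text preceded by the UTF-16LE byte order mark (FF FE),
   i.e. the "preamble". */
const unsigned char shape_utf16le_bom[] = {
    0xFF, 0xFE,
    'S', 0, 'h', 0, 'a', 0, 'p', 0, 'e', 0, ' ', 0, '#', 0, '0', 0,
    0, 0
};

/* Hypothetical helper: the byte length computed the same way
   msGetEncodedString() computes it, with a plain strlen(). */
size_t byte_length_via_strlen(const unsigned char *bytes)
{
    return strlen((const char *)bytes);
}
```

Calling byte_length_via_strlen() on the first array gives 1 and on the second
gives 3, although the text itself occupies 16 and 18 bytes respectively.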
Has anyone else encountered similar problems? Does anyone know how I can
determine the correct width of characters based on the "encoding" parameter?
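
One direction I can imagine, offered as an untested sketch rather than a fix:
guess the code unit width from the encoding name and count bytes up to the
first all-zero code unit instead of calling strlen(). Both functions below,
code_unit_width() and encoded_byte_length(), are hypothetical names and not
part of MapServer or libiconv. The name matching is a guess that covers only
the common "UTF-16", "UCS-2", "UTF-32" and "UCS-4" prefixes, and the counting
assumes the input really ends with a terminator of that width.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive test for whether s begins with prefix. */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Hypothetical: guess the code unit width in bytes from an encoding name.
   Unrecognized names are assumed to be byte-oriented (width 1). */
size_t code_unit_width(const char *encoding)
{
    assert(encoding != NULL);
    if (has_prefix_nocase(encoding, "UTF-16") ||
        has_prefix_nocase(encoding, "UCS-2"))
        return 2;
    if (has_prefix_nocase(encoding, "UTF-32") ||
        has_prefix_nocase(encoding, "UCS-4"))
        return 4;
    return 1;
}

/* Hypothetical strlen() substitute: number of bytes before the first
   code unit whose bytes are all zero. */
size_t encoded_byte_length(const char *string, size_t width)
{
    size_t n = 0;
    assert(string != NULL && width > 0);
    for (;;) {
        size_t k = 0;
        while (k < width && string[n + k] == '\0')
            k++;
        if (k == width)
            return n;   /* found the terminator */
        n += width;     /* skip one whole code unit */
    }
}
```

In msGetEncodedString() this would mean something like
len = encoded_byte_length(string, code_unit_width(encoding)), and the input
would then have to be copied with memcpy() rather than strdup(), since
strdup() also stops at the first null byte.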