[postgis-users] Re: [Plr-general] Tutorial on PLR and PostGIS, more on carriage returns
Paul Ramsey
pramsey at refractions.net
Thu Jun 21 14:50:02 PDT 2007
Steve,
You're right and I'm wrong. I was confused by the Unicode code point
numbers, which differ from the actual byte encodings used for UTF8.
Indeed, every byte of a multi-byte sequence falls in 128-255 in the
UTF8 encoding, so a straight byte-for-byte replacement of \r would
work (for UTF8 and the various one-byte latin code pages, that is).
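
To make the point above concrete, here is a minimal Python sketch (not the
actual PL/R fix, which would live in C). The function names and the sample
string are made up for illustration, and collapsing \r\n into a single \n
rather than two is my own assumption about the desired behaviour. It checks
that no byte of a multi-byte UTF8 sequence is below 0x80, and then does the
replacement directly on the raw bytes:

```python
def normalize_eol(src: bytes) -> bytes:
    """Turn CRLF and lone CR into LF, working on raw bytes only."""
    # Assumption: CRLF should become one LF, so handle it first.
    return src.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def multibyte_bytes_are_high(text: str) -> bool:
    """True if every byte of every multi-byte UTF8 sequence is >= 0x80."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True


# Hypothetical function body mixing all three EOL styles and some
# multi-byte characters.
sample = "x <- 'café 日本語'\r\ny <- 1\rz <- 2\n"
fixed = normalize_eol(sample.encode("utf-8"))

# The multi-byte characters survive the byte-level replacement intact.
print(fixed.decode("utf-8"))
print(multibyte_bytes_are_high(sample))
```

This only demonstrates the UTF8 case; other multi-byte server encodings
would need to be checked separately.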
Paul
On 21-Jun-07, at 10:30 AM, Stephen Woodbridge wrote:
> Hmmmm, I am probably wrong on this, but I thought 0x0 - 0x7f are
> standard UTF8 characters with a constant meaning that is the same
> as ASCII for those bytes, and that all multi-byte characters had to
> have the high-order bit set to indicate it was part of a multi-byte
> sequence.
>
> I was not under the impression that you could have 0x0 - 0x7f as
> a part of a multi-byte sequence. I am not an expert in this area
> and probably just know enough to mislead you ;) but I think it is
> worthwhile getting some additional insight into this. I for one
> would like to see a multi-byte UTF8 sequence with \r embedded in it.
>
> -Steve
>
>
> Paul Ramsey wrote:
>> Danger, Will Robinson. All values are fair game in bytes 2,3,4 of
>> the UTF encodings, so yes, it's possible you'll wreck multi-byte
>> characters by doing a simple replacement on the byte array.
>> Better to use an encoding-aware string replace function (not
>> knowing C, I don't know what that would be, but there must be some
>> in the PgSQL code base).
>> P
>> On 21-Jun-07, at 7:03 AM, Joe Conway wrote:
>>> Obe, Regina wrote:
>>>> Joe,
>>>> Can you take a look at it again? It was messed up in my
>>>> Firefox too. I think originally I had it looking right in
>>>> Firefox, but then in IE it didn't look right, so I changed it to
>>>> look right in IE, but forgot to check back in Firefox.
>>>> Hopefully this time I have made all browser masters happy.
>>>
>>> http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgresql_plr_tut02
>>> The tutorial looks perfect now in Firefox on Fedora Core 7.
>>>
>>> BTW, I have confirmed on the R-devel list that the R engine is
>>> expecting \n for EOL, and \r will cause a syntax error, on all
>>> platforms. I will probably fix this by simply replacing \r with
>>> \n in PL/R functions. My only reservation is whether this might
>>> cause issues for installations with multibyte characters. Does
>>> anyone know if it is possible for multibyte characters to include
>>> a byte = 13 (\r), i.e. is the simple replacement of \r safe in
>>> all locales?
>>>
>>> Thanks,
>>>
>>> Joe
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users