[PROJ] PROJ and Unicode on Windows

Dan Crosby dan.crosby at lincolnagritech.co.nz
Wed Apr 5 16:09:56 PDT 2023


How does this work on Linux? Is char defined as wchar_t there?
 
If PROJ is returning UTF-8 strings, shouldn't the functions be using wchar_t, or TCHAR at the least?
 
Is there a compatibility reason to use char **?
 
From: PROJ <proj-bounces at lists.osgeo.org> On Behalf Of Peter Townsend via PROJ
Sent: Thursday, 6 April 2023 10:38
To: Even Rouault <even.rouault at spatialys.com>
Cc: proj <proj at lists.osgeo.org>
Subject: Re: [PROJ] PROJ and Unicode on Windows
 
Well, it's not the console I'm worried about; that's coming straight from the VS debugger. Knowing that strings are always coming out of PROJ in UTF-8 is good.
 
Ultimately I'm sending the output to a C# DLL, so I need to CoTaskMemAlloc my string. If I do something like this:
 
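#include <windows.h>
#include <string>

std::wstring s2ws(const char* utf8Bytes)
{
    const std::string str(utf8Bytes);
    if (str.empty())
        return std::wstring();  // MultiByteToWideChar fails on a zero-length input
    // First call: ask how many wide characters the conversion will produce.
    int size_needed = MultiByteToWideChar(CP_UTF8, 0, str.data(), (int)str.size(), NULL, 0);
    std::wstring wstrTo(size_needed, 0);
    // Second call: convert into the preallocated buffer.
    MultiByteToWideChar(CP_UTF8, 0, str.data(), (int)str.size(), &wstrTo[0], size_needed);
    return wstrTo;
}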
 
Then I see the correctly decoded text in the wstring. As mentioned, this isn't something I'm terribly familiar with, and I'd like to avoid writing terrible C code and exploding buffers.
CoTaskMemAlloc needs the actual number of bytes, and we'll need an extra spot for the null terminator.
 
const wchar_t* u_convertResult(const char* result) {
    if (!result)
        return nullptr;

    std::wstring wstr = s2ws(result);
    auto wlen = wstr.length() + 1;
    auto len = wlen * sizeof(wchar_t);
    wchar_t* buff = (wchar_t*)CoTaskMemAlloc(len);
    if (buff) {
        wcscpy_s(buff, wlen, wstr.c_str());
    }
    return buff;
}
 
Does this sound reasonable for Windows?
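
The sizing rule described above (character count plus one for the terminator, multiplied by sizeof(wchar_t)) can be kept in one place and shared across platforms. The following is only a minimal sketch: copy_to_buffer and AllocFn are hypothetical names, std::malloc stands in as the allocator, and on Windows a thin wrapper around CoTaskMemAlloc would presumably be passed instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical allocator signature; not part of PROJ or the Windows API.
using AllocFn = void* (*)(std::size_t);

// Copies wstr, including its null terminator, into a buffer obtained from alloc.
// Returns nullptr if the allocation fails. The caller must release the buffer
// with the deallocator that matches alloc (std::free for std::malloc).
wchar_t* copy_to_buffer(const std::wstring& wstr, AllocFn alloc) {
    assert(alloc != nullptr);
    const std::size_t wlen = wstr.length() + 1;        // characters + terminator
    const std::size_t bytes = wlen * sizeof(wchar_t);  // size the allocator needs
    wchar_t* buff = static_cast<wchar_t*>(alloc(bytes));
    if (buff) {
        // c_str() points at length() + 1 valid elements, so this cannot over-read.
        std::memcpy(buff, wstr.c_str(), bytes);
    }
    return buff;
}
```

Because the byte count is computed once and used for both the allocation and the copy, the two cannot drift apart, which is the usual source of buffer overruns in this kind of code.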
 
And as for Linux and maintaining multi-platform compatibility, I'd define an alias function like this instead:
const wchar_t* u_convertResult(const char* result) {
    if (!result)
        return nullptr;

    std::string str(result);
    // Widens each byte to one wchar_t; multi-byte UTF-8 sequences are not decoded.
    std::wstring wstr = std::wstring(str.begin(), str.end());

    auto wlen = wstr.length() + 1;
    auto len = wlen * sizeof(wchar_t);
    wchar_t* buff = (wchar_t*)malloc(len);
    if (buff) {
        wcscpy(buff, wstr.c_str());
    }
    return buff;
}
 
Since the strings are already handled as UTF-8 on Linux, I should be able to pass the original string straight into the wstring, and malloc stands in for CoTaskMemAlloc. Does this sound okay too?
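
One caveat with the Linux variant: constructing a std::wstring from str.begin() and str.end() widens each byte on its own, so a multi-byte UTF-8 sequence becomes several unrelated wide characters instead of one. Pure ASCII output is unaffected, which may be why it appears to work. A minimal sketch of an explicit decoder follows, assuming a 32-bit wchar_t (as on Linux with glibc) so that every code point fits in one element. The name utf8_to_wstring is hypothetical, and malformed input is replaced with U+FFFD rather than rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 into one wchar_t per code point.
// Assumes wchar_t is at least 32 bits wide (true on Linux/glibc, not on Windows).
// Simplification: does not reject overlong encodings, surrogates,
// or values above U+10FFFF.
std::wstring utf8_to_wstring(const char* utf8) {
    assert(utf8 != nullptr);
    static_assert(sizeof(wchar_t) >= 4, "sketch assumes a 32-bit wchar_t");
    const char32_t kReplacement = 0xFFFD;  // used for malformed input
    const std::string in(utf8);
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = kReplacement;
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated or malformed sequence
                break;
            }
            // Each continuation byte contributes its low six bits.
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        out.push_back(static_cast<wchar_t>(ok ? cp : kReplacement));
    }
    return out;
}
```

With this helper, the two-byte sequence for "é" (0xC3 0xA9) decodes to a single wide character, whereas the iterator-range constructor yields two. A locale-based route such as mbstowcs may also work, but it depends on the process locale being a UTF-8 one.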
 
Thanks!