<div dir="ltr">Well it's not the console I'm worried about, that's coming straight from the VS debugger. Knowing that strings are always coming out of PROJ in UTF-8 is good. <div><br></div><div>Ultimately I'm sending the output to a C# DLL, so I need to CoTaskMemAlloc my string. If I do something like this:</div><div><br></div><div>std::wstring s2ws(const char* utf8Bytes)<br>{<br> const std::string& str(utf8Bytes);<br> int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);<br> std::wstring wstrTo(size_needed, 0);<br> MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);<br> return wstrTo;<br>}</div><div><br></div><div>Then I see the corrected UTF-8 text in the wstring. As mentioned this isn't something I'm terribly familiar with, and I'd like to avoid writing terrible C code and exploding buffers.</div><div>CoTaskMemAlloc needs the actual number of bytes, and we'll need an extra spot for the null terminator. </div><div><br></div><div>const wchar_t* u_convertResult(const char* result) {<br> if (!result)<br> return nullptr;<br><br> std::wstring wstr = s2ws(result);<br> auto wlen = wstr.length() + 1;<br> auto len = wlen * sizeof(wchar_t);<br> wchar_t* buff = (wchar_t*)CoTaskMemAlloc(len);<br> if (buff) {<br> wcscpy_s(buff, wlen, wstr.c_str());<br> }<br> return buff;<br>}</div><div><br></div><div>Does this sound reasonable for Windows?</div><div><br></div><div>And as for Linux and maintaining a multi-platform compatibility, I'd define an alias function like this instead:</div><div>const wchar_t* u_convertResult(const char* result) {<br> std::string str(result);<br> std::wstring wstr = std::wstring(str.begin(), str.end());<br><br> auto wlen = wstr.length() + 1;<br> auto len = wlen * sizeof(wchar_t);<br> wchar_t* buff = (wchar_t*)malloc(len);<br> if (buff) {<br> wcscpy(buff, wstr.c_str());<br> }<br> return buff;<br>}</div><div><br></div><div>Since it's already happily working as UTF-8 on Linux, I should be able 
to pass the original string straight into the wstring. On Linux, malloc simply takes the place of CoTaskMemAlloc. Does this sound okay too?</div><div><br></div><div>Thanks!<br><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 5, 2023 at 4:52 PM Even Rouault &lt;<a href="mailto:even.rouault@spatialys.com">even.rouault@spatialys.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Peter,</p>
<p>There isn't any issue with your build. PROJ simply returns
UTF-8 encoded strings, and the typical Windows console isn't
configured to display UTF-8. See
<a href="https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window" target="_blank">https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window</a>
or similar questions.</p>
<p>Even<br>
</p>
<div>On 05/04/2023 at 23:44, Peter Townsend
via PROJ wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>I've got a bit of an annoyance with my Windows PROJ build.
Hopefully it's not too hard to resolve as the world of
char/wchar_t/etc. isn't something I'm terribly familiar with.</div>
<div><br>
</div>
<div>Take for example the area of use of EPSG:23031. On Linux
it's fine, but on Windows there's a Unicode issue.</div>
<div><br>
</div>
<div>PJ* crs = proj_create(m_ctxt, "EPSG:23031");<br>
ASSERT_NE(crs, nullptr);<br>
ObjectKeeper keeper_crsH(crs);<br>
<br>
double w, s, e, n;<br>
const char* a;<br>
proj_get_area_of_use(m_ctxt, crs, &w, &s, &e,
&n, &a);<br>
</div>
<div><br>
</div>
<div>Contents of a:</div>
"Europe - between 0°E and 6°E - Andorra; Denmark (North Sea);
Germany offshore; Netherlands offshore; Norway including
Svalbard - onshore and offshore; Spain - onshore (mainland and
Balearic Islands); United Kingdom (UKCS) offshore."<br clear="all">
<div><br>
</div>
<div>Is there a simple thing I'm overlooking in the build
process that might clear up the encoding goof? Or do I need to
do some bending over backwards with character manipulation?</div>
<div><br>
</div>
<div>This is the command line I'm using to build this example:</div>
<div>cmake -DBUILD_SHARED_LIBS=ON
-DCMAKE_TOOLCHAIN_FILE=C:\dev\vcpkg\scripts\buildsystems\vcpkg.cmake
..<br>
cmake --build . --config Debug -j 8<br>
</div>
<div><br>
</div>
<div>Thanks!</div>
<span>-- </span><br>
<div dir="ltr">
<div dir="ltr">
<div>Peter Townsend<br>
</div>
Senior Software Developer<br>
</div>
</div>
</div>
<br>
</blockquote>
<pre cols="72">--
<a href="http://www.spatialys.com" target="_blank">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</div>
</blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Peter Townsend<br></div>Senior Software Developer<br></div></div>
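<div><br></div>
<div>One caveat on the Linux variant above: std::wstring(str.begin(), str.end()) widens each byte on its own, so a two-byte UTF-8 sequence such as the degree sign would become two wide characters rather than one. A minimal sketch of an explicit decode follows. It assumes a 32-bit wchar_t (as on Linux), and utf8_to_wide is a hypothetical helper name, not something PROJ provides.</div>

```cpp
#include <string>

// Hypothetical helper (not part of PROJ): decode a NUL-terminated UTF-8
// string into one wchar_t per code point. Assumes wchar_t is 32 bits wide,
// as on Linux. Malformed sequences become U+FFFD; overlong encodings and
// surrogate code points are not rejected in this sketch.
std::wstring utf8_to_wide(const char* utf8)
{
    std::wstring out;
    if (!utf8)
        return out;

    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8);
    while (*p) {
        unsigned long cp = 0;
        int extra = 0;  // number of continuation bytes expected

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else {
            // Stray continuation byte or invalid lead byte.
            out.push_back(static_cast<wchar_t>(0xFFFD));
            ++p;
            continue;
        }
        ++p;

        bool ok = true;
        for (int i = 0; i < extra; ++i) {
            // Stops at the terminating NUL too, since 0x00 is not 10xxxxxx.
            if ((*p & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (*p & 0x3F);
            ++p;
        }
        out.push_back(static_cast<wchar_t>(ok ? cp : 0xFFFD));
    }
    return out;
}
```

<div>With something like this, the Linux u_convertResult could build its wstring from utf8_to_wide(result) and keep the malloc and wcscpy steps unchanged. The null check from the Windows version would still be worth carrying over.</div>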