<div dir="ltr">Well, it's not the console I'm worried about; the garbled text is coming straight from the VS debugger. Knowing that strings always come out of PROJ as UTF-8 is good. <div><br></div><div>Ultimately I'm sending the output to a C# DLL, so I need to allocate my string with CoTaskMemAlloc. If I do something like this:</div><div><br></div><div>std::wstring s2ws(const char* utf8Bytes)<br>{<br>    // Copy the UTF-8 bytes; an empty input needs no conversion.<br>    const std::string str(utf8Bytes);<br>    if (str.empty())<br>        return std::wstring();<br>    // The first call returns the number of wchar_t units needed.<br>    int size_needed = MultiByteToWideChar(CP_UTF8, 0, str.data(), (int)str.size(), NULL, 0);<br>    std::wstring wstrTo(size_needed, 0);<br>    // The second call performs the actual conversion.<br>    MultiByteToWideChar(CP_UTF8, 0, str.data(), (int)str.size(), &wstrTo[0], size_needed);<br>    return wstrTo;<br>}</div><div><br></div><div>then I see the correctly decoded text in the wstring. As mentioned, this isn't something I'm terribly familiar with, and I'd like to avoid writing terrible C code and exploding buffers.</div><div>CoTaskMemAlloc needs the size in bytes, and the buffer needs one extra element for the null terminator. 
</div><div><br></div><div>const wchar_t* u_convertResult(const char* result) {<br>    if (!result)<br>        return nullptr;<br><br>    std::wstring wstr = s2ws(result);<br>    auto wlen = wstr.length() + 1;      // characters, including the terminator<br>    auto len = wlen * sizeof(wchar_t);  // bytes<br>    wchar_t* buff = (wchar_t*)CoTaskMemAlloc(len);<br>    if (buff) {<br>        wcscpy_s(buff, wlen, wstr.c_str());<br>    }<br>    return buff;<br>}</div><div><br></div><div>Does this sound reasonable for Windows?</div><div><br></div><div>As for Linux, to keep things multi-platform, I'd define the same function like this instead:</div><div>const wchar_t* u_convertResult(const char* result) {<br>    if (!result)<br>        return nullptr;<br><br>    std::string str(result);<br>    std::wstring wstr = std::wstring(str.begin(), str.end());<br><br>    auto wlen = wstr.length() + 1;      // characters, including the terminator<br>    auto len = wlen * sizeof(wchar_t);  // bytes<br>    wchar_t* buff = (wchar_t*)malloc(len);<br>    if (buff) {<br>        wcscpy(buff, wstr.c_str());<br>    }<br>    return buff;<br>}</div><div><br></div><div>Since the UTF-8 text already displays fine on Linux, I should be able to construct the wstring directly from the original string. There, CoTaskMemAlloc is just malloc. Does this sound okay too?</div><div><br></div><div>Thanks!<br><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 5, 2023 at 4:52 PM Even Rouault <<a href="mailto:even.rouault@spatialys.com">even.rouault@spatialys.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div>

    <p>Peter,</p>

    <p>there isn't any issue in your build. It is just that PROJ returns

      UTF-8 encoded strings and that the typical Windows console isn't

      configured to display UTF-8. Cf

<a href="https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window" target="_blank">https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window</a>

      or similar issues</p>

    <p>Even<br>

    </p>

    <div>Le 05/04/2023 à 23:44, Peter Townsend

      via PROJ a écrit :<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div>I've got a bit of an annoyance with my Windows PROJ build.

          Hopefully it's not too hard to resolve, as the world of

          char/wchar_t/etc. isn't something I'm terribly familiar with.</div>

        <div><br>

        </div>

        <div>Take for example the area of use of EPSG:23031. On Linux

          it's fine, but on Windows there's a Unicode issue.</div>

        <div><br>

        </div>

        <div>PJ* crs = proj_create(m_ctxt, "EPSG:23031");<br>

          ASSERT_NE(crs, nullptr);<br>

          ObjectKeeper keeper_crsH(crs);<br>

          <br>

          double w, s, e, n;<br>

          const char* a;<br>

          proj_get_area_of_use(m_ctxt, crs, &w, &s, &e,

          &n, &a);<br>

        </div>

        <div><br>

        </div>

        <div>Contents of a:</div>

        "Europe - between 0Â°E and 6Â°E - Andorra; Denmark (North Sea);

        Germany offshore; Netherlands offshore; Norway including

        Svalbard - onshore and offshore; Spain - onshore (mainland and

        Balearic Islands); United Kingdom (UKCS) offshore."<br clear="all">

        <div><br>

        </div>

        <div>Is there a simple thing I'm overlooking in the build

          process that might clear up the encoding goof? Or do I need to

          do some bending over backwards with character manipulation?</div>

        <div><br>

        </div>

        <div>This is the command line I'm using to build this example:</div>

        <div>cmake -DBUILD_SHARED_LIBS=ON

          -DCMAKE_TOOLCHAIN_FILE=C:\dev\vcpkg\scripts\buildsystems\vcpkg.cmake

          ..<br>

          cmake --build . --config Debug -j 8<br>

        </div>

        <div><br>

        </div>

        <div>Thanks!</div>

        <span>-- </span><br>

        <div dir="ltr">

          <div dir="ltr">

            <div>Peter Townsend<br>

            </div>

            Senior Software Developer<br>

          </div>

        </div>

      </div>

      <br>

      <fieldset></fieldset>

      <pre>_______________________________________________

PROJ mailing list

<a href="mailto:PROJ@lists.osgeo.org" target="_blank">PROJ@lists.osgeo.org</a>

<a href="https://lists.osgeo.org/mailman/listinfo/proj" target="_blank">https://lists.osgeo.org/mailman/listinfo/proj</a>

</pre>

    </blockquote>

    <pre cols="72">-- 

<a href="http://www.spatialys.com" target="_blank">http://www.spatialys.com</a>

My software is free, but my time generally not.</pre>

  </div>

</blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Peter Townsend<br></div>Senior Software Developer<br></div></div>
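<div><br></div><div>One caveat on the Linux variant above: constructing a std::wstring from str.begin() and str.end() widens each byte separately, so a multi-byte UTF-8 sequence such as the degree sign (0xC2 0xB0) ends up as two wide characters rather than one. Below is a minimal sketch of a decoding alternative, assuming a hand-written decoder is acceptable. utf8_to_wstring is a hypothetical helper name (not part of PROJ); it replaces malformed bytes with U+FFFD and does not reject overlong encodings.</div>

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper (not part of PROJ): decode a NUL-terminated UTF-8
// string into a std::wstring. Malformed bytes become U+FFFD; overlong
// encodings are not rejected in this sketch.
std::wstring utf8_to_wstring(const char* utf8)
{
    std::wstring out;
    if (!utf8)
        return out;

    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8);
    while (*p) {
        char32_t cp = 0xFFFD; // default for invalid lead bytes
        int extra = 0;        // continuation bytes still expected
        if (*p < 0x80)                { cp = *p; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        ++p;
        for (; extra > 0; --extra) {
            if ((*p & 0xC0) != 0x80) { // truncated sequence or end of string
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (*p & 0x3F);
            ++p;
        }
        if (cp > 0x10FFFF)
            cp = 0xFFFD;

        if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
            out.push_back(static_cast<wchar_t>(cp));
        } else {
            // 16-bit wchar_t: encode as a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Variant of u_convertResult for non-Windows builds: decode first, then copy
// into a malloc'ed buffer sized in bytes, terminator included.
const wchar_t* u_convertResult(const char* result)
{
    if (!result)
        return nullptr;

    const std::wstring wstr = utf8_to_wstring(result);
    const std::size_t wlen = wstr.length() + 1;          // characters
    const std::size_t len = wlen * sizeof(wchar_t);      // bytes
    wchar_t* buff = static_cast<wchar_t*>(std::malloc(len));
    if (buff)
        std::memcpy(buff, wstr.c_str(), len);            // copies the terminator too
    return buff;
}
```

<div>With this variant the buffer comes from malloc, so whoever receives the pointer should release it with the matching free.</div>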