<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 06/04/2023 at 01:09, Dan Crosby
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:cdca058a-7f03-454f-83c0-8f0851555e14@lincolnagritech.co.nz">
      <div class="WordSection1">
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">How
            does this work on Linux? Is char defined as wchar there?</span></p>
      </div>
    </blockquote>
    No. char is a single byte on Linux too. wchar_t is generally a
    32-bit integer on Unix (versus 16-bit on Windows).<br>
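    <p>A quick way to check this on a given platform is to look at the
      type sizes directly. This is only a small sketch; the helper name
      wchar_bits is made up for illustration.</p>

```cpp
#include <climits>
#include <cstddef>

// sizeof(char) is 1 by definition in C and C++, on every platform.
static_assert(sizeof(char) == 1, "char is always one byte");

// Width of wchar_t in bits for the platform this is compiled on:
// typically 32 on Linux/glibc and 16 on Windows.
constexpr std::size_t wchar_bits() { return sizeof(wchar_t) * CHAR_BIT; }
```

    <p>Printing wchar_bits() on each target shows which width the
      compiler actually uses.</p>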
    <blockquote type="cite"
      cite="mid:cdca058a-7f03-454f-83c0-8f0851555e14@lincolnagritech.co.nz">
      <div class="WordSection1">
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">If
            PROJ is returning UTF-8 strings, shouldn’t the functions be
            using wchar, or TCHAR at the least?</span></p>
      </div>
    </blockquote>
    I guess this is just a matter of taste/habit. Lots of open source
    libraries that return Unicode content just return it as UTF-8 in a
    char* (or a std::string in C++; this is typically the case for the
    nlohmann/json library we use for JSON parsing).  If you need to
    access the string by Unicode character, you can use iconv or
    <a class="moz-txt-link-freetext" href="https://en.cppreference.com/w/cpp/locale/codecvt_utf8">https://en.cppreference.com/w/cpp/locale/codecvt_utf8</a> in C++
    (although the latter has been deprecated since C++17).<br>
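    <p>If neither iconv nor the deprecated codecvt facet is an option,
      per-character access can also be done by decoding the bytes by
      hand. The following is a minimal sketch, not something PROJ
      provides: the function name utf8_to_code_points is hypothetical,
      and for brevity it does not reject overlong encodings or
      surrogate code points.</p>

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode a UTF-8 byte string into Unicode code points (one char32_t each).
// Throws std::invalid_argument on truncated or malformed sequences.
// Simplification: overlong encodings and surrogates are not rejected.
std::u32string utf8_to_code_points(const std::string& utf8) {
    std::u32string out;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

    <p>The result can then be indexed by character: element 0 is the
      first code point, regardless of how many bytes it took in the
      UTF-8 input.</p>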
    <blockquote type="cite"
      cite="mid:cdca058a-7f03-454f-83c0-8f0851555e14@lincolnagritech.co.nz">
      <div class="WordSection1">
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">Is
            there a compatibility reason to use char **?</span></p>
      </div>
    </blockquote>
    <p>That is exactly what UTF-8 was designed for: being able to
      handle it mostly as if it were an old-school ASCII string.<br>
    </p>
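    <p>To illustrate what "mostly" means here: the usual byte-oriented
      C functions keep working on UTF-8, but they count bytes, not
      characters. A small sketch, where the sample string and the
      helper utf8_code_point_count are made up for illustration and
      are not PROJ output or API:</p>

```cpp
#include <cstddef>
#include <cstring>

// Count Unicode code points in a NUL-terminated UTF-8 string by skipping
// continuation bytes (those of the form 10xxxxxx).
std::size_t utf8_code_point_count(const char* s) {
    std::size_t n = 0;
    for (; *s; ++s)
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) ++n;
    return n;
}

// Hypothetical sample value: "Île-de-France" encoded as UTF-8.
// 'Î' (U+00CE) takes two bytes, 0xC3 0x8E; every other character is ASCII.
const char* const kSampleName = "\xC3\x8E" "le-de-France";

// Byte length as seen by strlen: 14, one more than the 13 characters.
inline std::size_t sample_byte_length() { return std::strlen(kSampleName); }

// Searching for an ASCII byte is safe: no byte of a multi-byte sequence
// is ever in the ASCII range, so strchr cannot match in the middle of one.
inline std::size_t sample_first_dash() {
    return static_cast<std::size_t>(std::strchr(kSampleName, '-') - kSampleName);
}
```

    <p>So copying, concatenating, comparing for equality and searching
      for ASCII delimiters all work unchanged; only code that assumes
      one byte per character needs care.</p>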
    <blockquote type="cite"
      cite="mid:cdca058a-7f03-454f-83c0-8f0851555e14@lincolnagritech.co.nz">
      <div class="WordSection1">
        <div style="border:none;border-left:solid blue 1.5pt;padding:0cm
          0cm 0cm 4.0pt">
          <div>
            <div style="border:none;border-top:solid #E1E1E1
              1.0pt;padding:3.0pt 0cm 0cm 0cm">
              <p class="MsoNormal"><b><span
                    style="font-size:11.0pt;font-family:"Calibri",sans-serif"
                    lang="EN-US">From:</span></b><span
                  style="font-size:11.0pt;font-family:"Calibri",sans-serif"
                  lang="EN-US"> PROJ
                  <a class="moz-txt-link-rfc2396E" href="mailto:proj-bounces@lists.osgeo.org"><proj-bounces@lists.osgeo.org></a> <b>On Behalf Of
                  </b>Peter Townsend via PROJ<br>
                  <b>Sent:</b> Thursday, 6 April 2023 10:38<br>
                  <b>To:</b> Even Rouault
                  <a class="moz-txt-link-rfc2396E" href="mailto:even.rouault@spatialys.com"><even.rouault@spatialys.com></a><br>
                  <b>Cc:</b> proj <a class="moz-txt-link-rfc2396E" href="mailto:proj@lists.osgeo.org"><proj@lists.osgeo.org></a><br>
                  <b>Subject:</b> Re: [PROJ] PROJ and Unicode on Windows<o:p></o:p></span></p>
            </div>
          </div>
          <p class="MsoNormal"><o:p> </o:p></p>
          <div>
            <p class="MsoNormal">Well, it's not the console I'm worried
              about; that's coming straight from the VS debugger.
              Knowing that strings always come out of PROJ as
              UTF-8 is good. <o:p></o:p></p>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">Ultimately I'm sending the output to
                a C# DLL, so I need to CoTaskMemAlloc my string. If I do
                something like this:<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
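              <p class="MsoNormal">// Needs #include <string> and #include <windows.h>.<br>
                // Converts a NUL-terminated UTF-8 string to a UTF-16 std::wstring.<br>
                std::wstring s2ws(const char* utf8Bytes)<br>
                {<br>
                const std::string str(utf8Bytes); // owned copy, not a reference to a temporary<br>
                if (str.empty()) return std::wstring(); // MultiByteToWideChar fails on zero-length input<br>
                // First call: ask for the required length in wchar_t units.<br>
                int size_needed = MultiByteToWideChar(CP_UTF8, 0,
                &str[0], (int)str.size(), NULL, 0);<br>
                std::wstring wstrTo(size_needed, 0);<br>
                // Second call: perform the conversion into the sized buffer.<br>
                MultiByteToWideChar(CP_UTF8, 0, &str[0],
                (int)str.size(), &wstrTo[0], size_needed);<br>
                return wstrTo;<br>
                }<o:p></o:p></p>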
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">Then I see the correctly decoded text
                in the wstring. As mentioned, this isn't something I'm
                terribly familiar with, and I'd like to avoid writing
                terrible C code and exploding buffers.<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal">CoTaskMemAlloc needs the actual
                number of bytes, and we'll need an extra spot for the
                null terminator. <o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">const wchar_t* u_convertResult(const
                char* result) {<br>
                if (!result)<br>
                return nullptr;<br>
                <br>
                std::wstring wstr = s2ws(result);<br>
                auto wlen = wstr.length() + 1;<br>
                auto len = wlen * sizeof(wchar_t);<br>
                wchar_t* buff = (wchar_t*)CoTaskMemAlloc(len);<br>
                if (buff) {<br>
                wcscpy_s(buff, wlen, wstr.c_str());<br>
                }<br>
                return buff;<br>
                }<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">Does this sound reasonable for
                Windows?<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">And as for Linux and maintaining
                multi-platform compatibility, I'd define an alias
                function like this instead:<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal">const wchar_t* u_convertResult(const
                char* result) {<br>
                std::string str(result);<br>
                std::wstring wstr = std::wstring(str.begin(),
                str.end());<br>
                <br>
                auto wlen = wstr.length() + 1;<br>
                auto len = wlen * sizeof(wchar_t);<br>
                wchar_t* buff = (wchar_t*)malloc(len);<br>
                if (buff) {<br>
                wcscpy(buff, wstr.c_str());<br>
                }<br>
                return buff;<br>
                }<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">Since it's already happily working as
                UTF-8 on Linux, I should be able to pass in the original
                string to the wstring. CoTaskMemAlloc is just malloc.
                Does this sound okay too?<o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <div>
              <p class="MsoNormal">Thanks!<o:p></o:p></p>
              <div>
                <p class="MsoNormal"><o:p> </o:p></p>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
      <fieldset class="moz-mime-attachment-header"></fieldset>
      <pre class="moz-quote-pre" wrap="">_______________________________________________
PROJ mailing list
<a class="moz-txt-link-abbreviated" href="mailto:PROJ@lists.osgeo.org">PROJ@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="https://lists.osgeo.org/mailman/listinfo/proj">https://lists.osgeo.org/mailman/listinfo/proj</a>
</pre>
    </blockquote>
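    <p>One caveat on the Linux variant quoted above: constructing a
      std::wstring from str.begin() and str.end() widens each byte
      individually rather than decoding UTF-8, so any non-ASCII
      character ends up as two or more wrong wchar_t values. Below is
      a minimal sketch of a decoding version. It assumes a 32-bit
      wchar_t (as on Linux/glibc) and valid UTF-8 input; the name
      u_convertResultPortable is hypothetical, and overlong encodings
      are not rejected.</p>

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Sketch only: assumes wchar_t can hold any code point (32-bit, as on
// Linux/glibc). Malformed or truncated input yields nullptr.
// The caller releases the result with free().
wchar_t* u_convertResultPortable(const char* result) {
    if (!result) return nullptr;
    const std::size_t n = std::strlen(result);
    // A code point needs at least one byte, so n + 1 elements always suffice.
    wchar_t* buff = static_cast<wchar_t*>(std::malloc((n + 1) * sizeof(wchar_t)));
    if (!buff) return nullptr;

    std::size_t out = 0;
    for (std::size_t i = 0; i < n;) {
        const unsigned char lead = static_cast<unsigned char>(result[i]);
        // Number of continuation bytes implied by the lead byte; 4 = invalid.
        const std::size_t extra = lead < 0x80 ? 0
                                : (lead & 0xE0) == 0xC0 ? 1
                                : (lead & 0xF0) == 0xE0 ? 2
                                : (lead & 0xF8) == 0xF0 ? 3 : 4;
        if (extra == 4 || extra > n - i - 1) { std::free(buff); return nullptr; }

        // Keep the payload bits of the lead byte: 7, 5, 4 or 3 of them.
        unsigned long cp = extra == 0 ? lead : (lead & (0x3Fu >> extra));
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(result[i + k]);
            if ((c & 0xC0) != 0x80) { std::free(buff); return nullptr; }
            cp = (cp << 6) | (c & 0x3Fu);
        }
        buff[out++] = static_cast<wchar_t>(cp);
        i += extra + 1;
    }
    buff[out] = L'\0';
    return buff;
}
```

    <p>On Windows the MultiByteToWideChar route remains the natural
      choice, since wchar_t is 16-bit there and characters outside the
      BMP need surrogate pairs, which this sketch does not produce.</p>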
    <pre class="moz-signature" cols="72">-- 
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
  </body>
</html>