<div dir="ltr">I finally got it to work. Here's what I did, for posterity.<div><br></div><div>What I've created is a C# managed wrapper, and I added a bunch of C functions to PROJ with some extra functionality. Part of that extra functionality is to copy every string that PROJ can return into CoTaskMemAlloc'd memory so that the .NET world will be happy.</div><div>For example, something like this:</div><div><br>const char* proj_errno_string_proxy(int err) {<br> return convertResult(proj_errno_string(err));<br>}<br></div><div>...</div><div>const char* convertResult(const char* result) {<br> if (!result)<br> return result;<br><br> //Going out, we copy the string into CoTaskMemAlloc'd memory that .NET can take ownership of.<br> //The MarshalAs LPStr return marshaller will free it automatically in .NET world.<br> std::string str(result);<br> auto len = str.length() + 1;<br> char* buff = (char*)CoTaskMemAlloc(len);<br> if (buff) {<br> ml_strcpy(buff, len, str);<br> }<br> return buff;<br>}</div><div><br></div><div>The PInvoke signature is:</div><div>public const CharSet CHARSET = CharSet.Ansi;<br>public const CallingConvention CALLING_CONVENTION = CallingConvention.Cdecl;<br>public const UnmanagedType STRINGTYPE = UnmanagedType.LPStr;<br></div><div>...</div><div>[DllImport(PROJ_PROXY_DLL, EntryPoint = "proj_errno_string_proxy", CallingConvention = CALLING_CONVENTION)]<br> [return: MarshalAs(STRINGTYPE)]<br> public extern static string proj_errno_string_proxy(int err);<br></div><div><br></div><div>(The built-in marshalling will take care of freeing what I've sent it.)</div><div><br></div><div>Because PROJ returns UTF-8 strings and LPStr marshals them using the ANSI code page on Windows, my strings weren't coming or going in the right encoding. It mostly worked fine in practice, but every so often you'd see garbled Unicode.</div><div><br></div><div>Here's what I did to fix it. I was originally targeting .NET Standard 2.0 and 2.1. .NET Standard 2.1 added UnmanagedType.LPUTF8Str (which first appeared in .NET Framework 4.7), and it "just works". 
Changing STRINGTYPE to LPUTF8Str makes those parameters (and struct members) encode correctly. I had to drop Standard 2.0 though.</div><div><br>Except in the case of string arrays, like those const char** options parameters. My PInvoke signature was this:</div><div>public const UnmanagedType STRINGTYPE = UnmanagedType.LPStr;<br> public const UnmanagedType ARRAYTYPE = UnmanagedType.LPArray;<br></div><div>...</div><div>public extern static IntPtr proj_context_set_search_paths(ProjContextHandle ctx, int count_paths, [MarshalAs(ARRAYTYPE, ArraySubType = STRINGTYPE, SizeParamIndex = 1)] string[] paths);<br></div><div><br></div><div>Alas, LPUTF8Str is NOT supported with ArraySubType! So in order to conquer that problem, I ended up using a custom marshaller.</div><div>public extern static IntPtr proj_context_set_search_paths(ProjContextHandle ctx, int count_paths, [MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(Utf8StringArrayMarshaler))] string[] paths);<br></div><div><br></div><div> internal class Utf8StringMarshaler : ICustomMarshaler {<br> private static readonly Utf8StringMarshaler _instance = new Utf8StringMarshaler();<br><br> public unsafe IntPtr MarshalManagedToNative(object strObj) {<br> if (strObj == null)<br> return IntPtr.Zero;<br> if (!(strObj is string str))<br> throw new ArgumentException("Value must be string", nameof(strObj));<br><br> return MarshalManagedValue(str);<br> }<br> public unsafe static IntPtr MarshalManagedValue(string str) {<br>#if NETSTANDARD2_1_OR_GREATER<br> //From core runtime's UTF8 string marshaller.<br> int exactByteCount = checked(Encoding.UTF8.GetByteCount(str) + 1); // + 1 for null terminator<br> byte* mem = (byte*)Marshal.AllocCoTaskMem(exactByteCount);<br> Span<byte> buffer = new(mem, exactByteCount);<br><br> int byteCount = Encoding.UTF8.GetBytes(str, buffer);<br> buffer[byteCount] = 0; // null-terminate<br> return (IntPtr)mem;<br>#else<br> var bytes = Encoding.UTF8.GetBytes(str);<br> var ptr = 
Marshal.AllocCoTaskMem(bytes.Length + 1);<br> Marshal.Copy(bytes, 0, ptr, bytes.Length);<br> Marshal.WriteByte(ptr, bytes.Length, 0);<br> return ptr;<br>#endif<br> }<br><br> public object MarshalNativeToManaged(IntPtr pNativeData) {<br> return MarshalUnmanagedValue(pNativeData);<br> }<br> public static string MarshalUnmanagedValue(IntPtr pNativeData) {<br> if (pNativeData == IntPtr.Zero)<br> return null;<br><br>#if NETSTANDARD2_1_OR_GREATER<br> return Marshal.PtrToStringUTF8(pNativeData);<br>#else<br> var bytes = new List<byte>(4096);<br> int offset = 0;<br> byte b;<br> do {<br> b = Marshal.ReadByte(pNativeData, offset);<br> if (b != 0) {<br> bytes.Add(b);<br> offset++;<br> }<br> } while (b != 0);<br><br> return bytes.Count > 0 ? Encoding.UTF8.GetString(bytes.ToArray(), 0, bytes.Count) : "";<br>#endif<br> }<br><br> public void CleanUpManagedData(object ManagedObj) {<br> }<br><br> public void CleanUpNativeData(IntPtr pNativeData) {<br> Marshal.FreeCoTaskMem(pNativeData);<br> }<br><br> public int GetNativeDataSize() {<br> return -1;<br> }<br><br> public static ICustomMarshaler GetInstance(string pstrCookie) {<br> return _instance;<br> }<br> }<br><br> internal class Utf8StringArrayMarshaler : ICustomMarshaler {<br><br> private static readonly Utf8StringArrayMarshaler _instance = new Utf8StringArrayMarshaler();<br><br> public unsafe IntPtr MarshalManagedToNative(object strObj) {<br> if (strObj == null)<br> return IntPtr.Zero;<br> if (!(strObj is string[] str))<br> throw new ArgumentException("Value must be string array", nameof(strObj));<br><br> //Write UTF-8 arrays for each entry in the string array<br> //Then end it with a nullptr.<br> var len = IntPtr.Size * str.Length;<br> var basePtr = Marshal.AllocHGlobal(len + IntPtr.Size);<br> var ptr = basePtr;<br> for (var i = 0; i < str.Length; i++) {<br> var addr = Utf8StringMarshaler.MarshalManagedValue(str[i]);<br> Marshal.WriteIntPtr(ptr, addr);<br> ptr += IntPtr.Size;<br> }<br> Marshal.WriteIntPtr(ptr, IntPtr.Zero);<br> 
return basePtr;<br> }<br><br> public object MarshalNativeToManaged(IntPtr pNativeData) {<br> if (pNativeData == IntPtr.Zero)<br> return null;<br><br> //We don't have any context on how long the string array will be.<br> var values = new List<string>();<br><br> //Read UTF8 strings until we hit nullptr.<br> var ptr = pNativeData;<br> var currValue = Marshal.ReadIntPtr(ptr);<br> while (currValue != IntPtr.Zero) {<br> var str = Utf8StringMarshaler.MarshalUnmanagedValue(currValue);<br> values.Add(str);<br><br> ptr += IntPtr.Size;<br> currValue = Marshal.ReadIntPtr(ptr);<br> }<br> return values.ToArray();<br> }<br><br> public void CleanUpManagedData(object ManagedObj) {<br> }<br><br> public void CleanUpNativeData(IntPtr pNativeData) {<br> if (pNativeData == IntPtr.Zero) {<br> return;<br> }<br><br> //Free the individual strings until we hit a nullptr.<br> var ptr = pNativeData;<br> var value = Marshal.ReadIntPtr(ptr);<br> while (value != IntPtr.Zero) {<br> Marshal.FreeCoTaskMem(value);<br> ptr += IntPtr.Size;<br> value = Marshal.ReadIntPtr(ptr);<br> }<br><br> //Free the array.<br> Marshal.FreeHGlobal(pNativeData);<br> }<br><br> public int GetNativeDataSize() {<br> return -1;<br> }<br><br> public static ICustomMarshaler GetInstance(string pstrCookie) {<br> return _instance;<br> }<br> }<br></div><div><br></div><div>I couldn't use the custom marshaller as a complete replacement though. You can't use them on struct fields. 
So I have to use LPUTF8Str on those.</div><div>public const UnmanagedType STRINGTYPE = UnmanagedType.LPUTF8Str;<br></div><div>...</div><div>[StructLayout(LayoutKind.Sequential, CharSet = CHARSET)]<br> public struct PROJ_UNIT_INFO {<br><br> [MarshalAs(STRINGTYPE)]<br> public string auth_name;<br><br> [MarshalAs(STRINGTYPE)]<br> public string code;<br><br> [MarshalAs(STRINGTYPE)]<br> public string name;<br><br> [MarshalAs(STRINGTYPE)]<br> public string category;<br><br> public double conv_factor;<br><br> [MarshalAs(STRINGTYPE)]<br> public string proj_short_name;<br><br> public int deprecated;<br><br> }<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 6, 2023 at 3:31 PM Peter Townsend <<a href="mailto:peter.townsend@maplarge.com">peter.townsend@maplarge.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thanks, but I can't really use the SharpProj way. It's kinda using the .NET string as an intermediary. Plus I need to support a Linux build so I can't use C++/CLI anyway. 
The utf8_string method that takes in the .NET string works kind of like doing this without it:<div>std::string utf8_string(String^ v)<br>{<br> std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;<br> pin_ptr<const wchar_t> pPath = PtrToStringChars(v);<br> std::wstring vstr(pPath);<br> std::string sstr(conv.to_bytes(vstr));<br> return sstr;<br>}<br><div>const char* convertResult4(const char* result) {<br> if (!result)<br> return result;<br><br> std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;<br> std::wstring str = conv.from_bytes(result);<br> std::string sstr(conv.to_bytes(str));<br> ...<br>}<br></div></div><div><br></div><div>The std::wstring str contains the correctly encoded string, but turning it back into a const char* via sstr.c_str() just garbles it again.</div><div><br></div><div>I might have to just proxy everything over the managed/unmanaged P/Invoke wall as a wchar_t*, or just make everything IntPtrs and marshal them that way.</div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 6, 2023 at 10:32 AM Bert Huijben <<a href="mailto:bert@qqmail.nl" target="_blank">bert@qqmail.nl</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div lang="NL"><div><p class="MsoNormal"><span> Hi Peter,</span></p><p class="MsoNormal"><span> </span></p><p class="MsoNormal"><span lang="EN-US">When I needed PROJ for work at my previous day job, I spent a bit of extra time and created a complete wrapping C# library that is still used there and in a few other places. The wrapping is specifically targeted at Windows, but works there with both .NET Framework and .NET Core. 
See <a href="https://github.com/ampscm/sharpproj/" target="_blank">https://github.com/ampscm/sharpproj/</a> (or just use SharpProj from NuGet).</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">The sample code I have on that page shows roughly what you are trying here, so you should be able to use it to try out your encoding use cases.</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">[[</span></p><p class="MsoNormal"><span lang="EN-US">using SharpProj;</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">using var rd = CoordinateReferenceSystem.CreateFromEpsg(28992);</span></p><p class="MsoNormal"><span lang="EN-US">using var wgs84 = CoordinateReferenceSystem.CreateFromEpsg(4326);</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">var area = rd.UsageArea;</span></p><p class="MsoNormal"><span lang="EN-US">Assert.AreEqual("Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone.", area.Name);</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">using (var t = CoordinateTransform.Create(rd, wgs84))</span></p><p class="MsoNormal"><span lang="EN-US">{</span></p><p class="MsoNormal"><span lang="EN-US"> var r = t.Apply(new PPoint(155000, 463000));</span></p><p class="MsoNormal"><span lang="EN-US"> Assert.AreEqual(new PPoint(52.155, 5.387), r.ToXY(3)); // Round to 3 decimals for easy testing</span></p><p class="MsoNormal"><span lang="EN-US">}</span></p><p class="MsoNormal"><span lang="EN-US">]]</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">If you pick EPSG 23031, you will see that the encodings work there.</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">You can also look through the source code if you just want to see how to get the encoding and decoding to work. (It is all Apache licensed, so feel free to copy and paste, or send pull requests if you want something added to the library.)</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US"> Bert</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><div><div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0cm 0cm"><p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span lang="EN-US"> PROJ <<a href="mailto:proj-bounces@lists.osgeo.org" target="_blank">proj-bounces@lists.osgeo.org</a>> <b>On Behalf Of </b>Even Rouault<br><b>Sent:</b> Wednesday, April 5, 2023 11:53 PM<br><b>To:</b> Peter Townsend <<a href="mailto:peter.townsend@maplarge.com" target="_blank">peter.townsend@maplarge.com</a>>; proj <<a href="mailto:proj@lists.osgeo.org" target="_blank">proj@lists.osgeo.org</a>><br><b>Subject:</b> Re: [PROJ] PROJ and Unicode on Windows</span></p></div></div><p class="MsoNormal"> </p><p>Peter,</p><p>There isn't any issue with your build. It is just that PROJ returns UTF-8 encoded strings and the typical Windows console isn't configured to display UTF-8. 
Cf. <a href="https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window" target="_blank">https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window</a> or similar issues.</p><p>Even</p><div><p class="MsoNormal">On 05/04/2023 at 23:44, Peter Townsend via PROJ wrote:</p></div><blockquote style="margin-top:5pt;margin-bottom:5pt"><div><div><p class="MsoNormal">I've got a bit of an annoyance with my Windows PROJ build. Hopefully it's not too hard to resolve, as the world of char/wchar_t/etc. isn't something I'm terribly familiar with.</p></div><div><p class="MsoNormal"> </p></div><div><p class="MsoNormal">Take for example the area of use of EPSG:23031. On Linux it's fine, but on Windows there's a Unicode issue.</p></div><div><p class="MsoNormal"> </p></div><div><p class="MsoNormal">PJ* crs = proj_create(m_ctxt, "EPSG:23031");<br>ASSERT_NE(crs, nullptr);<br>ObjectKeeper keeper_crsH(crs);<br><br>double w, s, e, n;<br>const char* a;<br>proj_get_area_of_use(m_ctxt, crs, &w, &s, &e, &n, &a);</p></div><div><p class="MsoNormal"> </p></div><div><p class="MsoNormal">Contents of a:</p></div><p class="MsoNormal">"Europe - between 0°E and 6°E - Andorra; Denmark (North Sea); Germany offshore; Netherlands offshore; Norway including Svalbard - onshore and offshore; Spain - onshore (mainland and Balearic Islands); United Kingdom (UKCS) offshore."<br clear="all"></p><div><p class="MsoNormal"> </p></div><div><p class="MsoNormal">Is there a simple thing I'm overlooking in the build process that might clear up the encoding goof? Or do I need to bend over backwards with character manipulation?</p></div><div><p class="MsoNormal"> </p></div><div><p class="MsoNormal">This is the command line I'm using to build this example:</p></div><div><p class="MsoNormal">cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_TOOLCHAIN_FILE=C:\dev\vcpkg\scripts\buildsystems\vcpkg.cmake ..<br>cmake --build . --config Debug -j 8</p></div><div><p class="MsoNormal"> </p></div><div><p class="MsoNormal">Thanks!</p></div><p class="MsoNormal"><span>-- </span></p><div><div><div><p class="MsoNormal">Peter Townsend</p></div><p class="MsoNormal">Senior Software Developer</p></div></div></div></blockquote><pre>-- </pre><pre><a href="http://www.spatialys.com" target="_blank">http://www.spatialys.com</a></pre><pre>My software is free, but my time generally not.</pre></div></div></div></blockquote></div><br clear="all"><div><br></div><span>-- </span><br><div dir="ltr"><div dir="ltr"><div>Peter Townsend<br></div>Senior Software Developer<br></div></div>
</blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Peter Townsend<br></div>Senior Software Developer<br></div></div>
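<div><br></div><div>A minimal, portable sketch of why the UTF-8 strings PROJ returns look garbled when they are decoded with a single-byte ANSI code page, which is what LPStr marshalling does on Windows. The helper decode_as_latin1 is hypothetical (it is not part of PROJ or of the wrapper), and it assumes Latin-1 as a stand-in for a Western ANSI code page such as 1252; the two agree for the bytes used here.</div>

```cpp
#include <string>

// Hypothetical helper, for illustration only: treat every input byte as one
// Latin-1 code point and re-encode the result as UTF-8. This mimics what a
// single-byte ANSI decoder does when it is handed UTF-8 bytes.
std::string decode_as_latin1(const std::string& bytes) {
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            // ASCII is identical in Latin-1 and UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // Code points U+0080..U+00FF take two bytes in UTF-8.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

<div>Feeding it the UTF-8 bytes of "0°E" (30 C2 B0 45) yields the UTF-8 bytes of "0Â°E" (30 C3 82 C2 B0 45). The bytes coming out of PROJ were never wrong; they were only read with the wrong decoder, which is why switching the marshalling to LPUTF8Str is enough to fix the managed side.</div>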