<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Consolas",serif;}
span.EmailStyle23
{mso-style-type:personal-reply;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">Excellent, thanks Even! Do you recall what the runtime was before these changes on your test system?
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div id="mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="color:black">From:
</span></b><span style="color:black">Even Rouault <even.rouault@spatialys.com><br>
<b>Date: </b>Tuesday, July 23, 2024 at 3:00</span><span style="font-family:"Arial",sans-serif;color:black"> </span><span style="color:black">PM<br>
<b>To: </b>Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] <jesse.r.meyer@nasa.gov>, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev <gdal-dev@lists.osgeo.org><br>
<b>Subject: </b>[EXTERNAL] Re: [gdal-dev] Expected runtime of polygonize (GDAL 3.9.0) for few very large features.</span><span style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
</div>
<table class="MsoNormalTable" border="1" cellspacing="0" cellpadding="0" align="left" style="border:solid black 1.5pt">
<tbody>
<tr>
<td width="100%" style="width:100.0%;border:none;background:#FFEB9C;padding:3.75pt 3.75pt 3.75pt 3.75pt">
<p class="MsoNormal" style="mso-element:frame;mso-element-frame-hspace:2.25pt;mso-element-wrap:around;mso-element-anchor-vertical:paragraph;mso-element-anchor-horizontal:column;mso-height-rule:exactly">
<b><span style="font-size:10.0pt;color:black">CAUTION:</span></b><span style="color:black">
</span><span style="font-size:10.0pt;color:black">This email originated from outside of NASA. Please take care when clicking links or opening attachments. Use the "Report Message" button to report suspicious messages to the NASA SOC.</span><span style="color:black">
</span><o:p></o:p></p>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
<br>
<o:p></o:p></p>
<div>
<p>Hi,<o:p></o:p></p>
<p>I've got a chance to have a look at your test dataset. In <a href="https://github.com/OSGeo/gdal/pull/10477">
https://github.com/OSGeo/gdal/pull/10477</a>, I've reduced the runtime to 8 minutes (with GeoParquet output, without spatial sorting), by optimizing some implementation details. I believe this could be further reduced as most of the time is still spent in malloc/free
of temporary objects (the output is 90 million polygons!) and some objects could be reused, but that would be more extensive changes<o:p></o:p></p>
<p>Even<o:p></o:p></p>
<div>
<p class="MsoNormal">Le 01/07/2024 à 18:40, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev a écrit :<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">We’ve encountered a few images with what seems like pathological performance problems with polygonise. The details below are a report from another developer that I haven’t yet independently verified.<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">We threshold a raster image to a binary mask in a memory dataset, use that as its own mask to mask out the background.<o:p></o:p></p>
<p class="MsoNormal">gdal.Polygonize(nn_mem_band, nn_mem_band, ogr_mem_lyr, -1)<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">We have a number of 32k x 32k raster images that feature number of very large same-valued regions (some as large as 80% of the entire raster). We’re seeing ~10hrs on a modern workstation to complete the line of code above. OpenCV can
apparently construct a connected components list in mere seconds, on the same workstation and image, so we’re considering constructing the OGR geometries directly from those as a temporary work around.<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">Is this situation a known pitfall with the current algorithm / data structures behind Polygonize?<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">I’m able to share the problematic tile(s) if of interest,<o:p></o:p></p>
<p class="MsoNormal">Best<o:p></o:p></p>
<p class="MsoNormal">Jesse<o:p></o:p></p>
</div>
<p class="MsoNormal"><span style="font-size:12.0pt"><br>
<br>
<o:p></o:p></span></p>
<pre>_______________________________________________<o:p></o:p></pre>
<pre>gdal-dev mailing list<o:p></o:p></pre>
<pre><a href="mailto:gdal-dev@lists.osgeo.org">gdal-dev@lists.osgeo.org</a><o:p></o:p></pre>
<pre><a href="https://lists.osgeo.org/mailman/listinfo/gdal-dev">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a><o:p></o:p></pre>
</blockquote>
<pre>-- <o:p></o:p></pre>
<pre><a href="http://www.spatialys.com/">http://www.spatialys.com</a><o:p></o:p></pre>
<pre>My software is free, but my time generally not.<o:p></o:p></pre>
</div>
</div>
</div>
</div>
</body>
</html>