<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 23/07/2024 at 21:08, Meyer, Jesse R.
      (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:MN2PR09MB593200969465335E9E7EA6B9C9A92@MN2PR09MB5932.namprd09.prod.outlook.com">
      <style>@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
        {font-family:Aptos;
        panose-1:2 11 0 4 2 2 2 2 2 4;}@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        font-size:11.0pt;
        font-family:"Aptos",sans-serif;}a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0in;
        font-size:10.0pt;
        font-family:"Courier New";}span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:"Consolas",serif;}span.EmailStyle23
        {mso-style-type:personal-reply;
        font-family:"Aptos",sans-serif;
        color:windowtext;}.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;
        mso-ligatures:none;}div.WordSection1
        {page:WordSection1;}</style>
      <div class="WordSection1">
        <p class="MsoNormal">Excellent, thanks Even!  Do you recall what
          the runtime was before these changes on your test system?
        </p>
      </div>
    </blockquote>
    I killed the process after about half an hour. I don't recall
    exactly how far it had progressed, maybe 40%-50%.<br>
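    <p>For a rough sense of scale, those figures can be turned into an
      extrapolated baseline. The sketch below assumes, which is not
      established in this thread, that progress was linear in time and
      that the 8-minute figure quoted below was measured on the same
      system. The function name is illustrative only.</p>

```python
def extrapolated_total_minutes(elapsed_minutes, fraction_done):
    """Estimate total runtime, assuming progress is linear in time."""
    return elapsed_minutes / fraction_done


# Figures from this thread: killed after ~30 minutes at roughly 40%-50%.
low = extrapolated_total_minutes(30, 0.5)   # if 50% had been reached
high = extrapolated_total_minutes(30, 0.4)  # if only 40% had been reached
optimized = 8  # minutes, as reported for the pull request

print(f"estimated baseline: {low:.0f} to {high:.0f} minutes")
print(f"implied speedup: {low / optimized:.1f}x to {high / optimized:.1f}x")
```

    <p>Under those assumptions the earlier runtime on this system would
      have been somewhere around 60 to 75 minutes. Treat this as an
      order-of-magnitude estimate, not a measurement.</p>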
    <blockquote type="cite"
cite="mid:MN2PR09MB593200969465335E9E7EA6B9C9A92@MN2PR09MB5932.namprd09.prod.outlook.com">
      <div class="WordSection1">
        <div id="mail-editor-reference-message-container">
          <div>
            <div
style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
              <p class="MsoNormal" style="margin-bottom:12.0pt"><b><span
                    style="color:black">From:
                  </span></b><span style="color:black">Even Rouault
                  <a class="moz-txt-link-rfc2396E" href="mailto:even.rouault@spatialys.com">&lt;even.rouault@spatialys.com&gt;</a><br>
                  <b>Date: </b>Tuesday, July 23, 2024 at 3:00</span><span
style="font-family:'Arial',sans-serif;color:black"> </span><span
                  style="color:black">PM<br>
                  <b>To: </b>Meyer, Jesse R. (GSFC-618.0)[SCIENCE
                  SYSTEMS AND APPLICATIONS INC]
                  <a class="moz-txt-link-rfc2396E" href="mailto:jesse.r.meyer@nasa.gov">&lt;jesse.r.meyer@nasa.gov&gt;</a>, Meyer, Jesse R.
                  (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via
                  gdal-dev <a class="moz-txt-link-rfc2396E" href="mailto:gdal-dev@lists.osgeo.org">&lt;gdal-dev@lists.osgeo.org&gt;</a><br>
                  <b>Subject: </b>[EXTERNAL] Re: [gdal-dev] Expected
                  runtime of polygonize (GDAL 3.9.0) for few very large
                  features.</span><span
                  style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
            </div>
            <div>
              <p>Hi,<o:p></o:p></p>
              <p>I've had a chance to look at your test dataset.
                In <a href="https://github.com/OSGeo/gdal/pull/10477"
                  moz-do-not-send="true" class="moz-txt-link-freetext">
                  https://github.com/OSGeo/gdal/pull/10477</a>, I've
                reduced the runtime to 8 minutes (with GeoParquet
                output, without spatial sorting) by optimizing some
                implementation details. I believe this could be reduced
                further, since most of the time is still spent in
                malloc/free of temporary objects (the output is 90
                million polygons!) and some of those objects could be
                reused, but that would require more extensive changes.<o:p></o:p></p>
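              <p>To put those two numbers side by side, here is a small
                back-of-the-envelope calculation of the average
                wall-clock time available per output polygon. It uses
                only the 8 minutes and 90 million polygons mentioned
                above; the function name is illustrative, and the result
                is an average, not a profile.</p>

```python
def per_item_budget_us(total_minutes, item_count):
    """Average wall-clock time per item, in microseconds."""
    return total_minutes * 60 * 1_000_000 / item_count


polygons = 90_000_000   # output size mentioned above
runtime_minutes = 8     # runtime mentioned above

budget = per_item_budget_us(runtime_minutes, polygons)
rate = polygons / (runtime_minutes * 60)
print(f"{rate:,.0f} polygons per second, about {budget:.2f} microseconds each")
```

              <p>With only a few microseconds per polygon on average,
                even a handful of heap allocations per feature can
                plausibly take a large share of the runtime, which is
                consistent with the malloc/free observation above.</p>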
              <p>Even<o:p></o:p></p>
              <div>
                <p class="MsoNormal">On 01/07/2024 at 18:40, Meyer, Jesse
                  R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC]
                  via gdal-dev wrote:<o:p></o:p></p>
              </div>
              <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                <div>
                  <p class="MsoNormal">Hi,<o:p></o:p></p>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">We’ve encountered a few images
                    with what seem to be pathological performance
                    problems with Polygonize.  The details below come
                    from another developer's report that I haven’t yet
                    independently verified.<o:p></o:p></p>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">We threshold a raster image to a
                    binary mask in a memory dataset, then use that band
                    as its own mask to exclude the background:<o:p></o:p></p>
                  <p class="MsoNormal">gdal.Polygonize(nn_mem_band,
                    nn_mem_band, ogr_mem_lyr, -1)<o:p></o:p></p>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">We have a number of 32k x 32k
                    raster images that feature a number of very large
                    same-valued regions (some as large as 80% of the
                    entire raster).  We’re seeing ~10 hours on a modern
                    workstation to complete the line of code above. 
                    OpenCV can apparently construct a connected
                    components list in mere seconds on the same
                    workstation and image, so we’re considering
                    constructing the OGR geometries directly from those
                    components as a temporary workaround.<o:p></o:p></p>
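                  <p class="MsoNormal">To illustrate what a
                    connected-components pass computes, here is a
                    minimal pure-Python sketch using an iterative flood
                    fill with 4-connectivity. It is neither OpenCV's
                    implementation nor the algorithm used by Polygonize;
                    the function name and the tiny grid are made up for
                    illustration.</p>

```python
from collections import deque


def label_components(mask):
    """Label 4-connected regions of nonzero cells; 0 stays background."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already have a label.
            if mask[r][c] == 0 or labels[r][c] != 0:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] != 0 and labels[ny][nx] == 0:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical 3 x 4 binary mask with two separate foreground regions.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = label_components(mask)
print(count)
```

                  <p class="MsoNormal">Each cell is labelled once and
                    examined a bounded number of times, so the work
                    grows linearly with the pixel count. Turning the
                    labelled regions into polygon outlines, including
                    holes, is a separate step that this sketch does not
                    attempt.</p>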
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">Is this situation a known pitfall
                    with the current algorithm / data structures behind
                    Polygonize?<o:p></o:p></p>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">I’m able to share the problematic
                    tile(s) if they are of interest.<o:p></o:p></p>
                  <p class="MsoNormal">Best<o:p></o:p></p>
                  <p class="MsoNormal">Jesse<o:p></o:p></p>
                </div>
              </blockquote>
              <pre>-- <o:p></o:p></pre>
              <pre><a href="http://www.spatialys.com/"
              moz-do-not-send="true" class="moz-txt-link-freetext">http://www.spatialys.com</a><o:p></o:p></pre>
              <pre>My software is free, but my time generally not.<o:p></o:p></pre>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>