<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 28/10/2024 at 17:01, Meyer, Jesse R.
      (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev
      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:MN2PR09MB5932BFE8EC59E0A9AEC54694C94A2@MN2PR09MB5932.namprd09.prod.outlook.com">


      <div class="WordSection1">

        <p class="MsoNormal">I have two calls to gdal.Rasterize, each of
          which targets a separate GDAL memory dataset but sources the
          same OGR memory dataset, that I hoped could be run in parallel
          using Python’s concurrent.futures.  The idea being that each

          GDAL call unlocks the Python GIL, and performing read only

          operations on the vector database (except for storing memory

          for the results) could in principle be a safe and effective

          optimization, as the feature layers themselves are not

          mutated.  The SQL dialect is SQLite, so presumably the OGR

          dataset has to be converted to a SQLite (memory) database. 

          Technically SQLite supports multiple readers just fine, but

          this doesn’t mean GDAL/OGR does.  The multithreading

          documentation page doesn’t explicitly mention OGR / vector

          datasets but I presume they inherit similar stateful

          restrictions (yes, RFC 101 is coming).  However, running these
          SQL queries at the same time causes OGR to trip over itself
          (I presume OGR assumes only one query statement is being
          evaluated at a time).<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">So I think the intended workaround is
          either to accept this as a serially dependent task, or to copy
          the dataset and have each Rasterize() work on a copy, yes?</p>

      </div>

    </blockquote>

    It's not clear to me whether you share a single Python source vector
    dataset object between threads, or open your source dataset once per
    thread. The first case is a big no-no: anything could happen,
    including wrong results and crashes. One dataset object per thread
    is the way to go. If the processing is dominated by acquiring source
    features, you may hit a global lock at the SQLite level, but there
    isn't much we can do about that, short of using multi-processing
    parallelization instead of multi-threading. But you certainly don't
    need to copy your source dataset.<br>
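    <p>A minimal sketch of the one-object-per-thread pattern, assuming
      the source can be reopened from a path or connection string (a
      plain file or a /vsimem/ file, for instance). The helper names,
      the job list and the burn value are hypothetical placeholders, not
      taken from the original code:</p>
    <pre>
```python
from concurrent.futures import ThreadPoolExecutor


def open_with_gdal(path):
    # Imported lazily so the threading pattern has no hard GDAL dependency.
    from osgeo import gdal
    return gdal.OpenEx(path, gdal.OF_VECTOR)


def rasterize_with_gdal(target, source, sql):
    from osgeo import gdal
    # burnValues=[1] is a placeholder; use whatever options the real job needs.
    return gdal.Rasterize(target, source, SQLStatement=sql,
                          SQLDialect="SQLite", burnValues=[1])


def run_job(path, target, sql, open_source, rasterize):
    # Each job opens its own handle; a handle is never shared between threads.
    source = open_source(path)
    return rasterize(target, source, sql)


def rasterize_in_threads(path, jobs, open_source=open_with_gdal,
                         rasterize=rasterize_with_gdal):
    # jobs is a list of (target_dataset, sql_statement) pairs.
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = [pool.submit(run_job, path, target, sql,
                               open_source, rasterize)
                   for target, sql in jobs]
        # Results come back in the same order as the jobs.
        return [future.result() for future in futures]
```
    </pre>
    <p>With two target datasets this would be called as something like
      rasterize_in_threads(path, [(target_a, sql_a), (target_b, sql_b)]).
      The open and rasterize steps are passed in as callables only so
      the pattern can be exercised without GDAL installed.</p>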

    <blockquote type="cite"

cite="mid:MN2PR09MB5932BFE8EC59E0A9AEC54694C94A2@MN2PR09MB5932.namprd09.prod.outlook.com">

      <div class="WordSection1">


        <p class="MsoNormal">In the same spirit as RFC 101, which gives

          some thread safety to raster read-only workloads, is there

          interest in expanding this to vector datasets?<o:p></o:p></p>

      </div>

    </blockquote>

    <p>That would be tricky. What would be the expected result if a user
      called GetNextFeature() on a thread-safe OGRLayer: would users
      expect each thread to see all features, or features to be
      distributed among the calling threads?</p>
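    <p>To make the two possible semantics concrete, here is a toy model
      in plain Python. It does not use OGR at all: the feature list, the
      cursor and the function names are made up for illustration, and
      the lock-protected cursor is only one way a "thread-safe" layer
      could be imagined to behave:</p>
    <pre>
```python
import threading
from concurrent.futures import ThreadPoolExecutor

FEATURES = ["f0", "f1", "f2", "f3"]  # stand-ins, not real OGR features


def read_all(next_feature):
    # Mimics a GetNextFeature()-style loop: read until None is returned.
    out = []
    while True:
        feature = next_feature()
        if feature is None:
            return out
        out.append(feature)


def make_cursor(features):
    # A lock-protected cursor over the features.
    it = iter(features)
    lock = threading.Lock()

    def next_feature():
        with lock:
            return next(it, None)

    return next_feature


def shared_cursor(n_threads):
    # One cursor for all threads: features are distributed among them.
    cursor = make_cursor(FEATURES)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda _: read_all(cursor), range(n_threads)))


def per_thread_cursor(n_threads):
    # One cursor per thread: every thread sees every feature.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda _: read_all(make_cursor(FEATURES)),
                             range(n_threads)))
```
    </pre>
    <p>With a shared cursor, which thread gets which feature depends on
      scheduling, and only the union of what the threads read is
      deterministic. With one cursor per thread, each thread reads the
      full list.</p>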

    Even<span style="white-space: pre-wrap">

</span>

    <pre class="moz-signature" cols="72">-- 

<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>

My software is free, but my time generally not.

Butcher of all kinds of standards, open or closed formats. At the end, this is just about bytes.</pre>

  </body>

</html>