<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">How can I “open” a handle to a pre-existing memory dataset? That sounds like it may work for me.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">As a matter of semantics, I would expect GetNextFeature() to operate on a per-thread view of the database. In other words, each thread would have its own cursor into the database.<o:p></o:p></p>
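<p class="MsoNormal">As a rough illustration of the per-thread cursor idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for OGR. This is an analogy only, not GDAL's API: the database name, the table and the read_all helper are made up for the example. Each thread opens its own handle onto an already populated in-memory database and iterates it independently.<o:p></o:p></p>
<pre>
```python
import sqlite3
import threading

# Hypothetical name for a shared in-memory SQLite database (illustration only).
DB_URI = "file:features_demo?mode=memory&cache=shared"

# Keep one connection open for the lifetime of the program: a shared
# in-memory database is discarded once its last connection is closed.
owner = sqlite3.connect(DB_URI, uri=True)
owner.execute("CREATE TABLE features (fid INTEGER PRIMARY KEY, name TEXT)")
owner.executemany(
    "INSERT INTO features (fid, name) VALUES (?, ?)",
    [(i, f"feature_{i}") for i in range(1, 6)],
)
owner.commit()

results = {}


def read_all(label):
    # Each thread opens its own handle onto the existing database, so it
    # also gets its own cursor and iterates independently of the others.
    conn = sqlite3.connect(DB_URI, uri=True)
    try:
        cursor = conn.execute("SELECT fid FROM features ORDER BY fid")
        results[label] = [row[0] for row in cursor]
    finally:
        conn.close()


threads = [threading.Thread(target=read_all, args=(name,)) for name in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["a"], results["b"])
```
</pre>
<p class="MsoNormal">Both threads end up with the full list of feature ids, which is the behavior I would expect from a per-thread cursor.<o:p></o:p></p>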
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt">Best,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt">Jesse<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt">Lead Computer Scientist<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt">Science Systems and Applications, Inc.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt">Dr Compton Tucker Team<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:12.0pt">NASA Goddard Space Flight Center</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div id="mail-editor-reference-message-container">
<div>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="color:black">From:
</span></b><span style="color:black">Even Rouault &lt;even.rouault@spatialys.com&gt;<br>
<b>Date: </b>Monday, October 28, 2024 at 12:08</span><span style="font-family:'Arial',sans-serif;color:black"> </span><span style="color:black">PM<br>
<b>To: </b>Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] &lt;jesse.r.meyer@nasa.gov&gt;, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev &lt;gdal-dev@lists.osgeo.org&gt;<br>
<b>Subject: </b>[EXTERNAL] Re: [gdal-dev] gdal.Rasterize with same OGR dataset from two python threads</span><span style="font-size:12.0pt;color:black"><o:p></o:p></span></p>
</div>
<div>
<p><o:p> </o:p></p>
<div>
<p class="MsoNormal">On 28/10/2024 at 17:01, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">I have two calls to gdal.Rasterize, each of which targets a separate GDAL memory dataset but reads from the same OGR memory dataset, that I hoped could be run in parallel using Python’s concurrent.futures. The idea is that each GDAL call
releases the Python GIL, and read-only operations on the vector database (apart from allocating memory for the results) could in principle be a safe and effective optimization, as the feature layers themselves are not mutated. The SQL dialect is SQLite,
so presumably the OGR dataset has to be converted to a SQLite (in-memory) database. Technically SQLite supports multiple readers just fine, but this doesn’t mean GDAL/OGR does. The multithreading documentation page doesn’t explicitly mention OGR / vector datasets,
but I presume they inherit similar stateful restrictions (yes, RFC 101 is coming). However, running these SQL queries at the same time causes OGR to trip over itself (I presume OGR assumes only one query statement is being evaluated at a time).</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">So I think the intended workaround is either to accept this as a serially dependent task, or to copy the dataset and have each Rasterize() work on its own copy, yes?</p>
</div>
</blockquote>
<p class="MsoNormal"><span style="font-size:12.0pt">It's not clear to me whether you share the same Python source vector dataset object across threads, or whether you open your source dataset once per thread. The first case is a big no-no: anything could happen, including wrong results and
crashes. One object per thread is the way to go. If the processing is dominated by acquiring source features, you may hit a global lock at the SQLite level, but there isn't much we can do about that, other than using multi-processing parallelization
instead of multi-threading. But you certainly don't need to copy your source dataset.<br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"> </p>
<p class="MsoNormal">In the same spirit as RFC 101, which gives some thread safety to raster read-only workloads, is there interest in expanding this to vector datasets?</p>
</div>
</blockquote>
<p>That would be tricky. What would be the expected result if a user called GetNextFeature() on a thread-safe OGRLayer? Would users expect each thread to see all features, or would features be distributed among the calling threads?</p>
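<p>The two possible behaviors can be contrasted with a toy sketch in plain Python. No GDAL is involved: SharedCursorLayer, PerThreadCursorLayer and get_next_feature are hypothetical names invented for this illustration and say nothing about how OGR is or would be implemented.</p>
<pre>
```python
import threading

FEATURES = list(range(6))  # stand-in for the features of a layer (made-up data)


class SharedCursorLayer:
    """Toy layer with one cursor shared by all threads (not a GDAL class)."""

    def __init__(self, features):
        self._it = iter(features)
        self._lock = threading.Lock()

    def get_next_feature(self):
        # The lock makes each call atomic, so every feature is handed
        # to exactly one caller.
        with self._lock:
            return next(self._it, None)


class PerThreadCursorLayer:
    """Toy layer where each thread gets its own cursor (not a GDAL class)."""

    def __init__(self, features):
        self._features = features
        self._local = threading.local()

    def get_next_feature(self):
        # Lazily create an independent iterator for the calling thread.
        if not hasattr(self._local, "it"):
            self._local.it = iter(self._features)
        return next(self._local.it, None)


def drain(layer, out):
    # Read until the layer reports that no features are left.
    while (feature := layer.get_next_feature()) is not None:
        out.append(feature)


def run(layer, n_threads=2):
    outputs = [[] for _ in range(n_threads)]
    threads = [threading.Thread(target=drain, args=(layer, out)) for out in outputs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outputs


shared = run(SharedCursorLayer(FEATURES))
per_thread = run(PerThreadCursorLayer(FEATURES))
```
</pre>
<p>With the shared cursor, the features are split between the threads in an order that depends on scheduling, and only their union is the full set. With per-thread cursors, every thread sees every feature.</p>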
<p class="MsoNormal"><span style="font-size:12.0pt">Even <o:p></o:p></span></p>
<pre>-- </pre>
<pre><a href="http://www.spatialys.com/">http://www.spatialys.com</a></pre>
<pre>My software is free, but my time generally not.</pre>
<pre>Butcher of all kinds of standards, open or closed formats. At the end, this is just about bytes.</pre>
</div>
</div>
</div>
</div>
</div>
</body>
</html>