[pdal] Correct destination file guarantee?

Jim Klassen klassen.js at gmail.com
Tue Nov 10 12:29:39 PST 2020


I use the following general structure in many of my scripts:

#!/bin/bash
set -euo pipefail  # exit on a failed command, a failed pipeline stage, or an unset variable
pdal .... --writers.las.filename="$out.tmp.laz"
mv "$out.tmp.laz" "$out.laz"  # only reached if pdal returned success


The advantage is that I can place the temporary file either next to where the final result should go or in faster scratch space (usually a local SSD or ramdisk instead of network storage).  The latter can help because it is usually more efficient to write to the network in large contiguous blocks than piecemeal.  Which way is optimal depends on the system and the processing task at hand.  To me this is much less surprising behavior than pdal writing to some other file behind my back.
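
As a minimal sketch, the same pattern can be wrapped in a helper so the temporary file is also removed when the command fails.  The names atomic_write and SCRATCH_DIR are made up for illustration; they are not anything pdal provides.  One caveat to keep in mind: the rename is only atomic when the temporary and final files are on the same filesystem.  If the scratch space is on a different one, mv has to copy the data, and an interruption during that copy can still leave a partial destination file.

```shell
#!/bin/bash
set -euo pipefail

# atomic_write FINAL CMD [ARGS...]   (hypothetical helper, not part of pdal)
# Runs CMD with a temporary path appended as its last argument and renames
# the temporary file to FINAL only if CMD succeeds.
atomic_write() {
    local final=$1
    shift
    # SCRATCH_DIR is an assumed, optional override; default is FINAL's directory.
    local scratch=${SCRATCH_DIR:-$(dirname -- "$final")}
    local base=${final##*/}
    local tmp
    # Keep the original extension last, in case the writer inspects it.
    case $base in
        *.*) tmp="$scratch/${base%.*}.tmp.$$.${base##*.}" ;;
        *)   tmp="$scratch/$base.tmp.$$" ;;
    esac
    if "$@" "$tmp"; then
        # rename(2) is atomic only when tmp and final share a filesystem;
        # across filesystems mv falls back to copy-and-delete.
        mv -- "$tmp" "$final"
    else
        local status=$?
        rm -f -- "$tmp"
        return "$status"
    fi
}
```

For pdal, CMD would be a small wrapper function that passes its one argument on as --writers.las.filename, as in the script above.  If the script itself is killed, a stray temporary file may be left behind, but the final name never points at a half-written file (subject to the same-filesystem caveat).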


Alternatively, I have also used GNU make to drive the processing, as it can automatically remove the output file if the command that was supposed to generate it fails.  For this to work, the output file has to be the target of a make rule and the special target ".DELETE_ON_ERROR" has to be present in the makefile.
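
For scripts that are not driven by make, a similar delete-on-failure behavior can be approximated in plain bash.  This is only a rough sketch of the idea, not how make implements it: the names run_or_delete, on_signal and pending_output are made up for illustration, and unlike make it does not check whether the output was actually modified before deleting it.

```shell
#!/bin/bash
set -euo pipefail

# Path of the output currently being written; empty when nothing is pending.
pending_output=""

# On Ctrl-C or SIGTERM, remove the half-written output and exit non-zero.
on_signal() {
    if [ -n "$pending_output" ]; then
        rm -f -- "$pending_output"
    fi
    exit 1
}
trap on_signal INT TERM

# run_or_delete OUTPUT CMD [ARGS...]   (hypothetical helper)
# Runs CMD, which is expected to write OUTPUT itself, and deletes OUTPUT
# if CMD exits with a non-zero status.
run_or_delete() {
    local output=$1
    shift
    local status=0
    pending_output=$output
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        rm -f -- "$output"
    fi
    pending_output=""
    return "$status"
}
```

Compared with the temporary-file approach, this writes directly to the final name, so there is no extra copy, but a reader can see the incomplete file while it is being written.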

On 11/10/20 6:22 AM, Andrew Bell wrote:
>
> My take on this is that it's expensive to write to a temporary file and then copy it. Some output files are very large. Since you, the user, are the one interrupting the process, it seems that it's up to you to clean up. This behavior is also consistent with most other programs.
>
> On Tue, Nov 10, 2020 at 5:50 AM Peder Axensten <Peder.Axensten at slu.se> wrote:
>
>     Hi!
>
>     If pdal is interrupted while saving to the destination file, it might result in a corrupt file. This is not unreasonable, but could be avoided.
>
>     We use a make script to process large numbers of files, and sometimes we have to interrupt processing for various reasons. We then risk having corrupt files that make will consider final when the script is rerun, so the files are left in a corrupt state.
>
>     Would it be a good idea to make pdal by default save the contents to a temporary file and then move the temporary file to the destination file? This way either a correct file is produced or nothing. I'm implementing this in the make script – it is somewhat cumbersome but will work ok, I guess.
>
>     Isn’t it a very attractive and useful guarantee: if the destination file is produced, then it is correct?
>
>     Best regards,
>
>     Peder Axensten
>     Research engineer
>
>     Remote Sensing
>     Department of Forest Resource Management
>     Swedish University of Agricultural Sciences
>     SE-901 83 Umeå
>     Visiting address: Skogsmarksgränd
>     Phone: +46 90 786 85 00
>     peder.axensten at slu.se, http://www.slu.se/srh
>
>     The Department of Forest Resource Management is environmentally certified in accordance with ISO 14001.
>
>     ---
>     E-mailing SLU will result in SLU processing your personal data. For more information on how this is done, click here <https://www.slu.se/en/about-slu/contact-slu/personal-data/>
>     _______________________________________________
>     pdal mailing list
>     pdal at lists.osgeo.org
>     https://lists.osgeo.org/mailman/listinfo/pdal
>
>
>
> -- 
> Andrew Bell
> andrew.bell.ia at gmail.com
>
