[pdal] Correct destination file guarantee?

Charles Karney charles at karney.com
Tue Nov 10 05:53:42 PST 2020


Back in the day (maybe with the old MIT ITS operating system), "file close + rename" was some sort of system call.  But it doesn't look like C++ supports this in any systematic way.  So in the meantime, I think the most elegant solution is Andrew's.  You, the end user, specify a temporary file name in the same directory as your desired output file.  On successful completion of pdal, you rename the file.  (This also avoids the disk-full failure and truncated destination file that you can get if you *copy* the temporary file from a temporary directory instead of renaming it.)
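
For what it's worth, here is a minimal sketch of that write-then-rename sequence in C++17.  The helper name write_then_rename and the ".partial" suffix are assumptions made up for illustration; nothing here is part of pdal.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: write `contents` so that `final_path` only ever
// names a completely written file.  Returns true on success.
bool write_then_rename(const fs::path& final_path, const std::string& contents)
{
    // The temporary file is a sibling of the destination, so both names are
    // on the same file system and the rename does not have to copy any data.
    fs::path temp_path = final_path;
    temp_path += ".partial";   // assumed suffix; any unused name would do

    {
        std::ofstream out(temp_path, std::ios::binary | std::ios::trunc);
        out << contents;
        out.close();
        if (!out) {            // open, write or close failed (e.g. disk full)
            std::error_code ignored;
            fs::remove(temp_path, ignored);
            return false;      // final_path has not been touched
        }
    }

    std::error_code ec;
    fs::rename(temp_path, final_path, ec);
    if (ec) {
        std::error_code ignored;
        fs::remove(temp_path, ignored);
        return false;
    }
    return true;
}
```

On POSIX systems the rename replaces an existing destination in one step, so a reader sees either the old file or the new one.  The sketch does not flush data to disk before renaming, so it guards against interruption and disk-full errors rather than against power loss.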

  --Charles

On 11/10/20, 8:17 AM, "pdal on behalf of Peder Axensten" <pdal-bounces at lists.osgeo.org on behalf of Peder.Axensten at slu.se> wrote:

    I absolutely agree that a copy is too costly. The temporary file must be on the same device as the destination, so that it may be moved. The easiest way to do that is to add a randomised suffix to the destination path that is removed when all is done. If the process crashes the temporary file remains, but has a suffix that clearly indicates it as temporary. If the process is interrupted, temporary files are removed. A more elegant solution would be to use a system-supplied temporary directory on the specified device, but I have not found any system support for this.

    I implemented a C++ class that I use in my own tools whenever I save stuff to a file. The destructor removes the temporary file, if there is one.
    Very schematic (there might be typos):

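    #include <filesystem>
    #include <random>
    #include <string>
    #include <system_error>
    #include <utility>

    class Temppath {
    public:
        using path = std::filesystem::path;

        // Save the final path and create and save a path with a randomised suffix.
        explicit Temppath( path final_path )
        : m_final{ std::move( final_path ) }
        , m_temporary{ m_final }
        {
            m_temporary += ".tmp-" + std::to_string( std::random_device{}() );
        }

        Temppath() = delete;
        Temppath( const Temppath & ) = delete;
        Temppath & operator=( const Temppath & ) = delete;
        // Moving is disabled as well: after a defaulted move, two destructors
        // could try to remove the same temporary file.
        Temppath( Temppath && ) = delete;
        Temppath & operator=( Temppath && ) = delete;

        // Return the temporary file name.
        const path & temporary() const noexcept { return m_temporary; }

        // Return the final file name.
        const path & final() const noexcept { return m_final; }

        // Move the file temporary() to final(), if there is a temporary() file.
        // Returns true on success. Will not throw as it uses std::error_code.
        bool done() const noexcept {
            std::error_code ec;
            std::filesystem::rename( m_temporary, m_final, ec );
            return !ec;
        }

        ~Temppath() {
            // Will not throw as it uses std::error_code.
            // After a successful done() there is nothing left to remove.
            std::error_code ec;
            std::filesystem::remove( m_temporary, ec );
        }

    private:
        path m_final;
        path m_temporary;
    };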

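    This way temporary files are removed whenever the destructor runs, for example when an exception unwinds the stack or an interrupt is caught and handled, but not if the process crashes or is killed outright.
    Use it like this: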

    const Temppath file{ final_path };
    {
        std::ofstream out( file.temporary(), … );
        // Do the stuff.
    }   // The stream is closed here, before the rename.
    file.done();

    Best regards,

    Peder Axensten
    Research engineer

    Remote Sensing
    Department of Forest Resource Management
    Swedish University of Agricultural Sciences
    SE-901 83 Umeå
    Visiting address: Skogsmarksgränd
    Phone: +46 90 786 85 00
    peder.axensten at slu.se, www.slu.se/srh

    The Department of Forest Resource Management is environmentally certified in accordance with ISO 14001.

    > On 10 Nov 2020, at 13:22, Andrew Bell <andrew.bell.ia at gmail.com> wrote:
    >
    >
    > My take on this is that it's expensive to write to a temporary file and then copy it. Some output files are very large. Since you, the user, are the one interrupting the process, it seems that it's up to you to clean up. This behavior is also consistent with most other programs.
    >
    > On Tue, Nov 10, 2020 at 5:50 AM Peder Axensten <Peder.Axensten at slu.se> wrote:
    > Hi!
    >
    > If pdal is interrupted while saving to the destination file, it might result in a corrupt file. This is not unreasonable, but could be avoided.
    >
    > We use a make script to process large numbers of files and sometimes we have to interrupt processing for various reasons. We then risk having corrupt files that make will consider final when the script is rerun, so the files are left in a corrupt state.
    >
    > Would it be a good idea to make pdal by default save the contents to a temporary file and then move the temporary file to the destination file? This way either a correct file is produced or nothing. I'm implementing this in the make script – it is somewhat cumbersome but will work ok, I guess.
    >
    > Isn’t it a very attractive and useful guarantee: if the destination file is produced, then it is correct?
    >
    > Best regards,
    >
    > Peder Axensten
    > Research engineer
    > ---
    > När du skickar e-post till SLU så innebär detta att SLU behandlar dina personuppgifter. För att läsa mer om hur detta går till, klicka här <https://www.slu.se/om-slu/kontakta-slu/personuppgifter/>
    > E-mailing SLU will result in SLU processing your personal data. For more information on how this is done, click here <https://www.slu.se/en/about-slu/contact-slu/personal-data/>
    > _______________________________________________
    > pdal mailing list
    > pdal at lists.osgeo.org
    > https://lists.osgeo.org/mailman/listinfo/pdal
    >
    >
    > --
    > Andrew Bell
    > andrew.bell.ia at gmail.com
