[pdal] Metrics calculation

Mon Sep 7 08:33:46 PDT 2020

> On 4 Sep 2020, at 16:21, Howard Butler <howard at hobu.co> wrote:
>
>> On Mon, Aug 31, 2020 at 9:07 AM Peder Axensten <Peder.Axensten at slu.se> wrote:
>>
>> I’ve started with the tool to produce raster metrics. I copied the files for writers.gdal and renamed [what I think are] the relevant parts.
>> Q1: Is this a good way to start, do you think? Or should I use something else to start from?
>>
>> Does writers.gdal not already do what you want? What capability is missing, exactly? Specific statistics?
>
> One of the issues with writers.gdal is its statistics aren't for the cell, but rather for the neighborhood region. This ticket documents the issue, which we can hopefully enhance in a future release. https://github.com/PDAL/PDAL/issues/3215

I thought that writers.gdal might be a good starting point for a writers.raster_metrics, as they both “convert” a point cloud to a raster. So far I have basically just renamed things, and it seems to work. The next step would be to replace code to change the actual functionality. Would you suggest a different class as a starting point? And what would you suggest as a starting point for a writers.text_metrics? The tool we have today takes a directory of las files and a csv file as input and produces a copy of the original csv file with additional columns containing the metric values. Would that fit with the pdal way of doing things, if we read the points from a pdal pipeline?

>> Q3: Do you think this is a good way to do the transition to pdal or would you suggest a better way?
>>
>> I don't understand what you're trying to do, exactly, so I can't answer.
>>
>> We have one tool to produce raster metrics from laz files and another to calculate plot metrics from a set of laz files and a csv file containing the coordinates and radius of each plot. They are coded in C++17 and structured as a library, where the actual tools are rather short files that handle command line options and then call into the library. You specify which metrics you need in the command line options (see the --help output below). The library structure makes it pretty straightforward to add more metrics. We use a tool implemented in R for the actual modelling/prediction. These tools are executed from a make script to process tens of thousands of files in parallel.
>>
>> PDAL targets C++11. If you're making code for the public, it must build under C++11 for now.
>
> Which C++17 constructs are you using? Just filesystem stuff? PDAL has some utilities to handle filesystem activity to keep the bar at C++11.

Yes, using pdal would mean that I could lose a lot of the present support code, such as [raster] file handling. I don’t think my code for the actual metrics handling would be difficult to make C++11 compliant. And as it seems that pdal could handle most of the supporting stuff, I think raster_metrics would be ok.

I don’t yet know how much support pdal has for [text] tables, as needed by text_metrics. Obviously, I would need to read and write csv files, and to access and add columns. The code I have for that today uses string_view liberally, charconv, fold-expressions (e.g. for efficient string concatenation), and probably other things such as if constexpr. I also use some external libraries (boost and fmt).
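
To give an idea of what the C++11 restriction means for one of the constructs listed above: a fold-expression used for string concatenation can usually be replaced by a recursive variadic template. The following is only a minimal sketch under that assumption. The names concat and append_all are hypothetical and are not taken from the existing code or from pdal, and it streams through std::ostringstream for brevity rather than for speed.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, C++11 only.
// Base case of the recursion: nothing left to append.
inline void append_all(std::ostringstream&) {}

// Recursive case: stream the first argument, then recurse on the rest.
// This plays the role of the C++17 fold-expression (out << ... << args).
template <typename T, typename... Rest>
void append_all(std::ostringstream& out, const T& first, const Rest&... rest)
{
    out << first;
    append_all(out, rest...);
}

// Concatenate any number of streamable values into one std::string.
template <typename... Args>
std::string concat(const Args&... args)
{
    std::ostringstream out;
    append_all(out, args...);
    return out.str();
}
```

A call such as concat("p95", "_", 2020) builds a single string from mixed argument types. Code that relies on string_view or charconv would need a different substitute; this sketch covers only the fold-expression case.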

I guess that I would first rewrite the code to take advantage of the pdal ecosystem and then see what happens when I restrict the compiler to C++11 and take it from there. But I dread the moment…

>>  Q4: Would there be a general interest for such drivers (writers.gdal.metrics and writers.text.metrics)? We’d be happy to eventually make the code open source.
>
> There is definitely interest in FUSION-like forestry metrics from a number of PDAL-using sub-communities. I think how those statistics are delivered has been an outstanding question. For example, what formats are suitable, convenient, and sufficient for per-point or per-cell moments to be passed on to the next consumer in the processing chain? FUSION, for example, generates a blizzard of ASCII files, most of which do not need to be used. PDAL's metadata system doesn't seem like a good fit for such a thing.

I think that for raster_metrics using gdal for raster export is pretty natural? We opted for producing one-band files rather than one multi-band file, as it makes it simpler to see what is what through file name extensions. Separate files for separate things is also important when using make scripts to process many files, as we do.
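
As a rough illustration of per-cell metrics with one band per metric, here is a minimal C++11 sketch. It uses neither the pdal nor the gdal API; the types Point, Band and CellMetrics and the function cell_metrics are hypothetical names, and the grid origin (x0, y0) is assumed to be the lower-left corner with square cells. Each point contributes only to the cell it falls in, and each resulting Band would then be written to its own single-band file.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y, z; };

// One single-band grid, stored row-major (row 0 starts at y0).
struct Band {
    std::size_t cols, rows;
    std::vector<double> values;
};

// One band per metric, so each can go to a separate file.
struct CellMetrics { Band count, max_z, mean_z; };

CellMetrics cell_metrics(const std::vector<Point>& points,
                         double x0, double y0, double cell,
                         std::size_t cols, std::size_t rows)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const std::size_t n = cols * rows;
    CellMetrics m;
    m.count  = Band{cols, rows, std::vector<double>(n, 0.0)};
    m.max_z  = Band{cols, rows, std::vector<double>(n, nan)};  // NaN = empty cell
    m.mean_z = Band{cols, rows, std::vector<double>(n, nan)};
    std::vector<double> sum(n, 0.0);

    for (const Point& p : points) {
        // Cell indices of the point; points outside the grid are skipped.
        const double fx = std::floor((p.x - x0) / cell);
        const double fy = std::floor((p.y - y0) / cell);
        if (fx < 0 || fy < 0 || fx >= cols || fy >= rows) continue;
        const std::size_t i = static_cast<std::size_t>(fy) * cols
                            + static_cast<std::size_t>(fx);
        m.count.values[i] += 1.0;
        sum[i] += p.z;
        if (m.count.values[i] == 1.0 || p.z > m.max_z.values[i])
            m.max_z.values[i] = p.z;
    }
    for (std::size_t i = 0; i < n; ++i)
        if (m.count.values[i] > 0.0)
            m.mean_z.values[i] = sum[i] / m.count.values[i];
    return m;
}
```

Keeping one Band per metric mirrors the one-file-per-metric layout: adding a metric means adding a band and its accumulation step, without touching the others.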

For text_metrics we felt that using csv-like files fitted the rest of our processing environment best. (Actually, we use semicolon-delimited files, as the comma is used as the decimal separator in Swedish typography, sometimes causing unpredictable results with localised software…) As all our field data are circular plots, we look for east, north, and radius in the header line. It would not be difficult to support square or rectangular plots using different tags, but that would be rather hands-on? To really generalise, I guess that an area concept would be needed, along the lines of the point concept. Then different readers/writers (csv, hdf, json, etc.) and pipes could be implemented in a general way, using dimensions for auxiliary data?
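
The plot-table handling described above can be sketched in C++11 roughly as follows. This is only an illustration: split_fields, column_index, parse_plot, Plot and inside are hypothetical names, not existing code and not part of pdal. It assumes the header tags are spelled exactly east, north and radius, and that numbers use a decimal point (std::stod follows the C locale and throws on malformed input).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split one line of a semicolon-delimited table into its fields.
// Empty fields, including a trailing one, are preserved.
std::vector<std::string> split_fields(const std::string& line, char delim = ';')
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = line.find(delim, start);
        if (end == std::string::npos) {
            fields.push_back(line.substr(start));
            return fields;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;
    }
}

// Index of the column called `name` in the header, or -1 if absent.
int column_index(const std::vector<std::string>& header, const std::string& name)
{
    for (std::size_t i = 0; i < header.size(); ++i)
        if (header[i] == name) return static_cast<int>(i);
    return -1;
}

struct Plot { double east, north, radius; };

// Fill `plot` from one data row; false if a column is missing.
bool parse_plot(const std::vector<std::string>& header,
                const std::vector<std::string>& row, Plot& plot)
{
    const int e = column_index(header, "east");
    const int n = column_index(header, "north");
    const int r = column_index(header, "radius");
    if (e < 0 || n < 0 || r < 0) return false;
    const std::size_t size = row.size();
    if (size <= static_cast<std::size_t>(e) || size <= static_cast<std::size_t>(n)
        || size <= static_cast<std::size_t>(r)) return false;
    plot.east = std::stod(row[e]);
    plot.north = std::stod(row[n]);
    plot.radius = std::stod(row[r]);
    return true;
}

// True when the point (x, y) lies inside or on the circular plot.
bool inside(const Plot& p, double x, double y)
{
    const double dx = x - p.east;
    const double dy = y - p.north;
    return dx * dx + dy * dy <= p.radius * p.radius;
}
```

Points that pass inside() for a given row would feed the metric calculations, and the results would be appended to that row as extra columns. A square or rectangular plot would need other header tags and a different inside() test, which is where a more general area concept would come in.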

Best regards,

Peder Axensten
Research engineer

Remote Sensing
Department of Forest Resource Management
Swedish University of Agricultural Sciences
SE-901 83 Umeå
Visiting address: Skogsmarksgränd
Phone: +46 90 786 85 00
peder.axensten at slu.se, www.slu.se/srh

The Department of Forest Resource Management is environmentally certified in accordance with ISO 14001.
