[pdal] writers.gdal: median
Peder Axensten
Peder.Axensten at slu.se
Mon Jan 31 09:44:15 PST 2022
I don’t know if this is of use to you, but we estimate forest variables across Sweden based on metrics (including percentiles) calculated from the national laser scanning. You can download our tools as a Docker image here:
https://hub.docker.com/r/axensten/slu
The tools are for our internal use and only so-so documented, but you may use them if you find them useful. The system consists of a number of tools, one of which calculates raster metrics and another of which calculates metrics for circular plots. Raster metrics are implemented as a PDAL plugin, pdal_plugin_filter_raster_metrics. Plot metrics are implemented as a separate command-line tool, pax-plots. Currently the following metrics are implemented (listed from pax-plots --help):
--metrics You may choose one or more from the metrics and metric sets:
--- Metrics -------------
count number of all values
count_1ret number of first returns
count_gel number of all values >= given level
count_1ret_gel number of first returns >= given level
prop_gel proportion of all values >= given level by all values
prop_1ret_gel proportion of first returns >= given level by first returns
mean_gel sum/N of all values >= given level
mean2_gel sum^2/N of all values >= given level
mean3_gel sum^3/N of all values >= given level
rootmean2_gel mean2^(1/2) of all values >= given level
rootmean3_gel mean3^(1/3) of all values >= given level
sample_std_dev_gel sample standard deviation of all values >= given level
sample_variance_gel sample variance of all values >= given level
sample_skewness_gel sample skewness of all values >= given level
sample_kurtosis_gel sample kurtosis of all values >= given level
count_ge#cm_gel number of all values >= # cm, where # is any integer of all values >= given level
p#_gel percentile # of all values >= given level , where # is integer in [0, 100]
L1_gel L1-moment (L-mean) of all values >= given level
L2_gel L2-moment (L-scale) of all values >= given level
L3_gel L3-moment (L-scewness) of all values >= given level
L4_gel L4-moment (L-kurtosis) of all values >= given level
L3_ratio_gel L3-moment ratio (L3/L2) of all values >= given level
L4_ratio_gel L4-moment ratio (L4/L2) of all values >= given level
mad_gel median absolute deviation (MAD) of all values >= given level
--nilsson_level For most metrics, ignore z-values below this. [scalar value='0.0']
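To make the "_gel" naming concrete, here is a minimal Python sketch of how a few of the listed metrics could be computed from a list of z-values. It is only my reading of the help text above, not the code used in pax-plots: the function name gel_metrics is hypothetical, the level argument stands in for --nilsson_level, and the median is taken as the mean of the two middle values for even counts, which may differ from the percentile rule the tool applies.

```python
import statistics


def gel_metrics(z_values, level=0.0):
    """Compute a few '_gel' style metrics (hypothetical helper, not pax-plots code).

    Only values >= level are used for the '_gel' metrics, mirroring the
    description of --nilsson_level in the help text.
    """
    all_count = len(z_values)
    kept = sorted(z for z in z_values if z >= level)
    count_gel = len(kept)
    if count_gel == 0:
        # Nothing at or above the level: only the counts are meaningful.
        return {"count": all_count, "count_gel": 0}

    median = statistics.median(kept)
    # Median absolute deviation around the median of the kept values.
    mad = statistics.median(abs(z - median) for z in kept)

    return {
        "count": all_count,
        "count_gel": count_gel,
        "prop_gel": count_gel / all_count,
        "mean_gel": sum(kept) / count_gel,
        "p50_gel": median,  # assumption: p50 equals the ordinary median
        "mad_gel": mad,
    }


heights = [0.1, 0.5, 2.0, 3.0, 4.0, 9.0]
result = gel_metrics(heights, level=2.0)
print(result)
```

With level=2.0 only the four values 2.0, 3.0, 4.0 and 9.0 contribute to the "_gel" metrics, while count still reports all six.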
To run pdal for raster metrics we use (copied from the make script that runs it all):
{"pipeline":[
{
"type":"filters.raster_metrics",
"resolution":"12.5",
"metrics":"$(strip $(metrics_set))",
"nilsson_level":"2.0",
"gdalopts":"BIGTIFF=IF_SAFER,COMPRESS=DEFLATE",
"data_type":"float"
}
] }
And with pdal arguments:
--input="<source>"
--output="<temporary_dir>/null.bull"
--writer="writers.null"
--filters.raster_metrics.dest="<metric_dest>"
--metadata="<metric_metadata_dest>.json"
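If you prefer to drive this from a script rather than make, the pipeline and arguments above could be assembled roughly as in the following Python sketch. The helper names build_pipeline and build_arguments and the example paths are hypothetical; the option names and values are copied from the listing above, and the make variable is replaced by a plain string parameter. The sketch only builds the JSON and the argument list, it does not invoke pdal.

```python
import json


def build_pipeline(metrics, resolution="12.5", nilsson_level="2.0"):
    """Return the raster-metrics pipeline as a JSON string (hypothetical helper)."""
    stage = {
        "type": "filters.raster_metrics",
        "resolution": resolution,
        "metrics": metrics,  # replaces $(strip $(metrics_set)) from the make script
        "nilsson_level": nilsson_level,
        "gdalopts": "BIGTIFF=IF_SAFER,COMPRESS=DEFLATE",
        "data_type": "float",
    }
    return json.dumps({"pipeline": [stage]}, indent=2)


def build_arguments(source, temporary_dir, metric_dest, metric_metadata_dest):
    """Return the pdal arguments listed above, with placeholders filled in."""
    return [
        f"--input={source}",
        f"--output={temporary_dir}/null.bull",
        "--writer=writers.null",
        f"--filters.raster_metrics.dest={metric_dest}",
        f"--metadata={metric_metadata_dest}.json",
    ]


# Example values are made up for illustration.
pipeline_json = build_pipeline("p95_gel")
arguments = build_arguments("tile.laz", "/tmp/work", "out/p95", "out/p95_meta")
print(pipeline_json)
print(arguments)
```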
Best regards,
Peder Axensten
Systems Developer
Remote Sensing
Department of Forest Resource Management
Swedish University of Agricultural Sciences
SE-901 83 Umeå
Visiting address: Skogsmarksgränd
Phone: +46 90 786 85 00
peder.axensten at slu.se, www.slu.se/srh
The Department of Forest Resource Management is environmentally certified in accordance with ISO 14001.
> On 31 Jan 2022, at 17:34, Jim Klassen <klassen.js at gmail.com> wrote:
>
> I would not propose to include this in all (and the commit I linked to requires it to be explicitly selected).
>
> I don't think one has to be "very sensitive" about memory use for memory use to be a problem. I would say that, as implemented, one has to be very careful about memory use (point cloud size, raster output size, raster radius and resolution parameters) to be successful, even with 128 GB available to PDAL (on a single core).
>
> I think the best way to go if going forward with this would be to put median in a non-streamable stage so it can take multiple passes through the point cloud. Keeping the point cloud in memory (or even writing the point cloud to a temp LAS) plus a bounded array for each cell is likely going to be smaller than storing multiple copies of the "Z" attribute for each point (as in the default case where radius selects points outside the cell boundary). And the infrastructure already exists in PDAL for non-streamable stages, vs needing to come up with something new.
>
> On 1/31/22 09:28, Andrew Bell wrote:
>> My concern would be that this computation is crazy-expensive WRT memory. The default `output_type` is all, and people might be in for quite a surprise if this gets added. Some users are very sensitive about memory use. One could change things such that the rasters themselves were, say, memory-mapped files, but this gets pretty difficult with this addition, where you don't know how many items are in each cell.
>>
>> I think this is pretty hard to do well without writing quite a bit of code.
>>
>> On Mon, Jan 31, 2022 at 10:14 AM Howard Butler <howard at hobu.co> wrote:
>>
>>
>> > On Jan 28, 2022, at 6:23 PM, Jim Klassen <klassen.js at gmail.com> wrote:
>> >
>> > Is there any interest in adding a median (and possibly Q1 and Q3) statistic to writers.gdal?
>>
>> As long as the overhead associated with computing it is opt-in, I think this would be a very useful addition.
>>
>> > I'm not sure this memory limitation would be easy to document clearly and I presume this is why median isn't already implemented. I certainly would not include it by default in the "all" mode.
>> >
>> > There may be ways to make this more memory friendly if multiple passes through the point cloud would be allowed, but this is counter to how the existing writers.gdal stage is structured.
>>
>> Related to earlier traffic, I think the distinction in behavior between "cell count" and "search window" count has some value for some applications. It would be nice to support both behaviors.
>>
>> _______________________________________________
>> pdal mailing list
>> pdal at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/pdal
>>
>>
>> --
>> Andrew Bell
>> andrew.bell.ia at gmail.com
>
> _______________________________________________
> pdal mailing list
> pdal at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/pdal
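
The multi-pass idea discussed in the quoted thread could be sketched roughly as follows in Python. This is only an illustration of the general technique, not PDAL code and not how writers.gdal works: the first pass counts points per cell, so that an exactly sized array can be allocated for each cell, and the second pass fills those arrays before the median is taken. All names are hypothetical. The sketch assumes every point falls inside the grid and uses only the points within each cell, not a search radius.

```python
import statistics
from array import array


def cell_index(x, y, origin_x, origin_y, resolution, columns):
    """Map a point to a flat cell index (row-major, origin at the lower-left corner)."""
    col = int((x - origin_x) // resolution)
    row = int((y - origin_y) // resolution)
    return row * columns + col


def two_pass_cell_median(read_points, origin_x, origin_y, resolution, columns, rows):
    """Per-cell median of z using two passes over the points.

    read_points is a callable that returns a fresh iterator of (x, y, z)
    tuples each time it is called, so the data can be read twice.
    """
    cell_count = columns * rows

    # Pass 1: count the points in each cell.
    counts = [0] * cell_count
    for x, y, _ in read_points():
        counts[cell_index(x, y, origin_x, origin_y, resolution, columns)] += 1

    # Allocate one exactly sized array of doubles per cell.
    buffers = [array("d", [0.0]) * n for n in counts]
    filled = [0] * cell_count

    # Pass 2: store each z-value in the array of its cell.
    for x, y, z in read_points():
        i = cell_index(x, y, origin_x, origin_y, resolution, columns)
        buffers[i][filled[i]] = z
        filled[i] += 1

    # Empty cells get None instead of a median.
    return [statistics.median(b) if len(b) else None for b in buffers]


points = [(0.5, 0.5, 1.0), (0.2, 0.8, 3.0), (0.9, 0.1, 2.0), (1.5, 0.5, 10.0)]
medians = two_pass_cell_median(lambda: iter(points), 0.0, 0.0, 1.0, columns=2, rows=1)
print(medians)
```

In this sketch each z-value is stored exactly once, which is the property the thread is after; supporting a search radius that reaches outside the cell would again require storing some values in several cells.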