<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
I would not propose including this in "all" (the commit I linked
to requires it to be selected explicitly).<br>
<br>
I don't think one has to be "very sensitive" about memory use for
memory use to be a problem. As implemented, one has to be very
careful about the inputs (point cloud size, raster output size, and
the raster radius and resolution parameters) to succeed even with
128 GB available to PDAL (running on a single core).<br>
<br>
If this goes forward, I think the best approach would be to put
median in a non-streamable stage so that it can make multiple passes
through the point cloud. Keeping the point cloud in memory (or even
writing the point cloud to a temp LAS) plus a bounded array for each
cell is likely to take less memory than storing multiple copies of
the "Z" attribute for each point (as in the default case, where
radius selects points outside the cell boundary). The infrastructure
for non-streamable stages also already exists in PDAL, so nothing
new would have to be invented.<br>
<br>
<div class="moz-cite-prefix">On 1/31/22 09:28, Andrew Bell wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CACJ51z1b=y_tPCHSBTu73qRtdS3sf0P3fQFC4dEFeyEv5Sh3Kw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">My concern would be that this computation is
crazy-expensive WRT memory. The default `output_type` is all,
and people might be in for quite a surprise if this gets added.
Some users are very sensitive about memory use. One could change
things such that the rasters themselves were, say, memory-mapped
files, but this gets pretty difficult with this addition, where
you don't know how many items are in each cell.
<div><br>
</div>
<div>I think this is pretty hard to do well without writing
quite a bit of code.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jan 31, 2022 at 10:14
AM Howard Butler &lt;<a href="mailto:howard@hobu.co"
moz-do-not-send="true" class="moz-txt-link-freetext">howard@hobu.co</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
> On Jan 28, 2022, at 6:23 PM, Jim Klassen &lt;<a
href="mailto:klassen.js@gmail.com" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">klassen.js@gmail.com</a>&gt;
wrote:<br>
> <br>
> Is there any interest in adding a median (and possibly Q1
and Q3) statistic to writers.gdal?<br>
<br>
As long as the overhead associated with computing it is
opt-in, I think this would be a very useful addition. <br>
<br>
> I'm not sure this memory limitation would be easy to
document clearly and I presume this is why median isn't
already implemented. I certainly would not include it by
default in the "all" mode.<br>
> <br>
> There may be ways to make this more memory friendly if
multiple passes through the point cloud would be allowed, but
this is counter to how the existing writers.gdal stage is
structured.<br>
<br>
Related to earlier traffic, I think the distinction in
behavior between "cell count" and "search window" count has
some value for some applications. It would be nice to support
both behaviors.<br>
<br>
_______________________________________________<br>
pdal mailing list<br>
<a href="mailto:pdal@lists.osgeo.org" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">pdal@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/pdal"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://lists.osgeo.org/mailman/listinfo/pdal</a><br>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature">Andrew Bell<br>
<a href="mailto:andrew.bell.ia@gmail.com" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">andrew.bell.ia@gmail.com</a></div>
</blockquote>
<br>
</body>
</html>