[pdal] EPT:// prefix issue with PDAL 2.2
Connor Manning
connor at hobu.co
Tue Dec 15 10:21:24 PST 2020
In one of the last few releases (not sure which) we started moving away from
the "ept://" pseudo-protocol. Instead, a filename ending in "ept.json" now
selects the EPT reader. So please try using a filename of
https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
instead - this is the recommended format from now on. I think we kept
support for both formats for at least one release.
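
For illustration, here is a minimal sketch of a pipeline in the recommended
form, built with Python's standard json module. The bounds are taken from the
tests quoted below; the output name "points_CA.laz" is a placeholder chosen
for this example, not something from the original report.

```python
import json

# Recommended form: "filename" points at the ept.json file itself,
# with no "ept://" prefix.
EPT_URL = (
    "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/"
    "USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json"
)

pipeline = {
    "pipeline": [
        {
            "type": "readers.ept",
            "filename": EPT_URL,
            "bounds": "([-13484500, -13484200], [4653000,4654200])",
        },
        "points_CA.laz",  # placeholder output name (assumption)
    ]
}

# Serialize to the text you would save as pipeline.json.
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

Saving the printed text as pipeline.json and running "pdal pipeline
pipeline.json" should then exercise the EPT reader without the prefix.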
This change was made for a few reasons. When accessing data over a network,
the double protocol (ept://http://...) is strange. Also, using the root
directory rather than the ept.json filename means that your "filename"
option is not a real file, e.g.
https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
is a 404, but
https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
is an actual file.
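
If you have existing pipelines in the old style, the conversion described
above can be sketched as a small helper. Note that to_ept_json is a
hypothetical function written for this example and is not part of PDAL; it
only rewrites strings.

```python
def to_ept_json(filename: str) -> str:
    """Hypothetical helper: rewrite a legacy EPT filename to the ept.json form."""
    name = filename.strip()
    # Drop the legacy pseudo-protocol, matched case-insensitively
    # so that "EPT://" is handled too.
    if name.lower().startswith("ept://"):
        name = name[len("ept://"):]
    # Already in the recommended form: leave it alone.
    if name == "ept.json" or name.endswith("/ept.json"):
        return name
    # Otherwise treat the value as the EPT root directory.
    return name.rstrip("/") + "/ept.json"


root = (
    "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/"
    "USGS_LPC_CA_Central_Valley_2017_LAS_2019"
)
print(to_ept_json("ept://" + root))
```

The helper is idempotent, so running it over filenames that already end in
ept.json leaves them unchanged.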
I wouldn't worry much about the file size difference here, since your point
counts match. The EPT reader runs in a multi-threaded fashion, so the order
of points may vary between runs, which leads to slight differences in the
compression. You could add a "filters.sort" after the EPT reader to
counteract this (for LAZ data I'd recommend sorting by GpsTime and maybe
secondarily by ReturnNumber).
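
A sort stage could be added along these lines. This is a sketch only:
add_sort_after_ept is a hypothetical helper, and it sorts on a single
dimension (GpsTime) through the "dimension" option of filters.sort. The
secondary ordering by ReturnNumber is not shown here.

```python
import copy
import json


def add_sort_after_ept(pipeline: dict, dimension: str = "GpsTime") -> dict:
    """Hypothetical helper: insert a filters.sort stage after the first readers.ept stage."""
    stages = copy.deepcopy(pipeline["pipeline"])
    for i, stage in enumerate(stages):
        # Stages can be plain filename strings, so check the type first.
        if isinstance(stage, dict) and stage.get("type") == "readers.ept":
            stages.insert(i + 1, {"type": "filters.sort", "dimension": dimension})
            break
    return {"pipeline": stages}


example = {
    "pipeline": [
        {
            "type": "readers.ept",
            "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/"
                        "USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json",
            "bounds": "([-13484500, -13484200], [4653000,4654200])",
        },
        "points_CA_sorted.laz",  # placeholder output name (assumption)
    ]
}

sorted_pipeline = add_sort_after_ept(example)
print(json.dumps(sorted_pipeline, indent=2))
```

With a fixed point order going into the writer, repeated runs should
compress more consistently, although identical file sizes are not
guaranteed by this sketch.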
I'm not sure why your filesource_id would be changing, so maybe open a
GitHub issue on that one.
- Connor
On Tue, Dec 15, 2020 at 12:04 PM Matt Beckley <beckley at unavco.org> wrote:
> Hello,
>
> It seems that when reading EPT data from the AWS 3DEP Entwine bucket,
> the reader will not work unless I add the prefix "ept://" to the URL (see
> examples below). This applies only to PDAL v2.2, and it is not clear if
> this is a dataset-specific issue. PDAL 2.1 will run with or without the
> ept:// prefix, but it has the odd result that the file sizes differ
> slightly depending on whether the prefix is used. Point counts are the
> same either way with PDAL 2.1, but the "filesource_id" parameter differs,
> so the file size differences are probably due to slight header
> differences. Regarding the PDAL 2.2 EPT issue, so far this seems
> to happen on the following AWS 3DEP Entwine datasets:
>
> USGS_LPC_CA_Central_Valley_2017_LAS_2019
> CO_Southwest_NRCS_B2_2018
> TX WestTexas B1 2018
> NM SouthCentral B8 2018
>
> *My question:* For PDAL v2.2, should I always use the EPT:// prefix when
> using readers.ept? (seems related to:
> https://github.com/PDAL/PDAL/pull/3174). Also, as an aside, is streaming
> available for readers.ept? Documentation doesn't indicate it is, but this
> issue: https://github.com/PDAL/PDAL/issues/2439 makes it seem that maybe
> it is? I'm uncertain how to test this.
>
> Any info you could provide would be most appreciated.
>
> Test1: PDAL 2.2 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
> environment):
>
> {
> "pipeline": [{
> "type": "readers.ept",
> "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
> "bounds": "([-13484500, -13484200], [4653000,4654200])"
> },
> "points_CA_noept.laz"]}
>
> pdal pipeline pipeline.json gives error:
>
> PDAL: readers.ept: Could not read from
> s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
>
> Test2: PDAL 2.2 WITH EPT:// Prefix (PDAL installed via isolated conda
> environment):
> {
> "pipeline": [{
> "type": "readers.ept",
> "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
> "bounds": "([-13484500, -13484200], [4653000,4654200])"
> },
> "points_CA_wept.laz"]}
>
> pdal pipeline pipeline.json runs successfully
>
>
> Test3: PDAL 2.1 WITH EPT:// Prefix (PDAL installed via isolated conda
> environment):
> {
> "pipeline": [{
> "type": "readers.ept",
> "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
> "bounds": "([-13484500, -13484200], [4653000,4654200])"
> },
> "points_CA_wept_v21.laz"]}
>
> pdal pipeline pipeline.json runs successfully, filesize is: 3453289 bytes.
> "count": 956938
>
>
> Test4: PDAL 2.1 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
> environment):
> {
> "pipeline": [{
> "type": "readers.ept",
> "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
> "bounds": "([-13484500, -13484200], [4653000,4654200])"
> },
> "points_CA_NOept_v21.laz"]}
>
> pdal pipeline pipeline.json runs successfully, but filesize is different:
> 3479637 bytes. "count": 956938
>
>
> Point counts for results from PDAL 2.1 run match. Only difference is
> "filesource_id". Version without EPT:// prefix has filesource_id=0, while
> with EPT:// prefix "filesource_id": 26982.
> ---------------------------
> Matthew Beckley
> Data Engineer
> UNAVCO/OpenTopography
> beckley at unavco.org
> cell: 301-982-9819
> _______________________________________________
> pdal mailing list
> pdal at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/pdal
>