[pdal] EPT:// prefix issue with PDAL 2.2

Connor Manning connor at hobu.co
Tue Dec 15 10:21:24 PST 2020


In one of the last few releases (not sure which) we moved away from
the "ept://" pseudo-protocol and instead use the presence of "ept.json" at
the end of the filename to select the EPT reader.  So please try using a filename of
https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
instead - this is the recommended format from now on.  I think we kept
support for both formats for at least one release.
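
For example, a sketch of your Test1 pipeline using the ept.json filename
could look like the following (same bounds as yours; the output name
"points_CA_eptjson.laz" is just a placeholder I made up):

```json
{
    "pipeline": [
        {
            "type": "readers.ept",
            "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json",
            "bounds": "([-13484500, -13484200], [4653000, 4654200])"
        },
        "points_CA_eptjson.laz"
    ]
}
```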

This change was made for a couple of reasons.  When accessing data over a
network, the double protocol (ept://http://...) is strange.  Also, using the
root directory rather than the ept.json filename means that your "filename"
option does not point to a real file, e.g.
https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
is a 404, but
https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
is an actual file.

I wouldn't worry much about the file size difference here since your point
counts match.  The EPT reader runs in a multi-threaded fashion, so the
order of points may vary between runs, which leads to slight differences in
the compressed output size.  You could add a "filters.sort" after the EPT
reader to counteract this (for LAZ data I'd recommend sorting by GpsTime and
maybe secondarily by ReturnNumber).
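
As a rough sketch, assuming the "dimension" option of filters.sort and a
made-up output name, the sort stage would slot in between the reader and
the writer like this.  It only shows the primary sort on GpsTime:

```json
{
    "pipeline": [
        {
            "type": "readers.ept",
            "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json",
            "bounds": "([-13484500, -13484200], [4653000, 4654200])"
        },
        {
            "type": "filters.sort",
            "dimension": "GpsTime"
        },
        "points_CA_sorted.laz"
    ]
}
```

Points that share the same GpsTime may still come out in a different order
between runs, so the files may not end up byte-identical.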

I'm not sure why your filesource_id would be changing, so maybe open a
GitHub issue on that one.

- Connor

On Tue, Dec 15, 2020 at 12:04 PM Matt Beckley <beckley at unavco.org> wrote:

> Hello,
>
> It seems like when reading the ept data from the AWS 3DEP entwine bucket
> the reader will not work unless I add the prefix, "ept://" to the URL (see
> examples below).  This applies only to PDAL v2.2, and it is not clear if
> this is a dataset-specific issue.  PDAL 2.1 will run with or without the
> ept:// prefix, but it has the odd result that the filesizes will differ
> slightly if using ept:// in the prefix or not.  Point counts are the same
> whether or not you use ept:// with PDAL 2.1, but the "filesource_id"
> parameter differs, so the filesize differences are probably due to slight
> header differences.  Regarding the PDAL 2.2 EPT issue, so far this seems
> to happen on the following AWS 3DEP Entwine datasets:
>
> USGS LPC CA Central Valley 2017 LAS 2019
> CO_Southwest_NRCS_B2_2018
> TX WestTexas B1 2018
> NM SouthCentral B8 2018
>
> *My question:*  For PDAL v2.2, should I always use the EPT:// prefix when
> using readers.ept?  (seems related to:
> https://github.com/PDAL/PDAL/pull/3174).  Also, as an aside, is streaming
> available for readers.ept?  Documentation doesn't indicate it is, but this
> issue: https://github.com/PDAL/PDAL/issues/2439 makes it seem that maybe
> it is?  I'm uncertain how to test this.
>
> Any info you could provide would be most appreciated.
>
> Test1:  PDAL 2.2 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
> environment):
>
> {
>     "pipeline": [{
>         "type": "readers.ept",
>         "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>     },
>        "points_CA_noept.laz"]}
>
> pdal pipeline pipeline.json gives error:
>
> PDAL: readers.ept: Could not read from
> s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
>
> Test2:  PDAL 2.2 WITH EPT:// Prefix (PDAL installed via isolated conda
> environment):
> {
>     "pipeline": [{
>         "type": "readers.ept",
>         "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>     },
>        "points_CA_wept.laz"]}
>
> pdal pipeline pipeline.json runs successfully
>
>
> Test3:  PDAL 2.1 WITH EPT:// Prefix (PDAL installed via isolated conda
> environment):
> {
>     "pipeline": [{
>         "type": "readers.ept",
>         "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>     },
>        "points_CA_wept_v21.laz"]}
>
> pdal pipeline pipeline.json runs successfully, filesize is: 3453289 bytes.
>  "count": 956938
>
>
> Test4:  PDAL 2.1 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
> environment):
> {
>     "pipeline": [{
>         "type": "readers.ept",
>         "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>     },
>        "points_CA_NOept_v21.laz"]}
>
> pdal pipeline pipeline.json runs successfully, but filesize is different:
>  3479637 bytes. "count": 956938
>
>
> Point counts for results from PDAL 2.1 run match.  Only difference is
> "filesource_id".  Version without EPT:// prefix has filesource_id=0, while
> with EPT:// prefix "filesource_id": 26982.
> ---------------------------
> Matthew Beckley
> Data Engineer
> UNAVCO/OpenTopography
> beckley at unavco.org
> cell: 301-982-9819