[pdal] EPT:// prefix issue with PDAL 2.2

Matt Beckley beckley at unavco.org
Tue Dec 15 10:25:11 PST 2020


Hi Connor,

Thanks for the quick and informative reply.  I will implement the filename
as you suggested.  A quick follow-up:  Is streaming enabled on readers.ept?
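
For reference, here is a minimal sketch of the filename change, in Python
with only the standard library. The helper names normalize_ept_filename and
build_pipeline are hypothetical (they are not part of PDAL); the sketch only
rewrites the "filename" string and builds the pipeline JSON, and it does not
run PDAL itself.

```python
import json

EPT_PREFIX = "ept://"
EPT_MANIFEST = "ept.json"


def normalize_ept_filename(filename: str) -> str:
    """Return a filename without the ept:// prefix that ends in ept.json."""
    name = filename.strip()
    # Drop the legacy pseudo-protocol, case-insensitively ("ept://" or "EPT://").
    if name.lower().startswith(EPT_PREFIX):
        name = name[len(EPT_PREFIX):].strip()
    # Point at the ept.json file itself rather than the dataset root directory.
    if name != EPT_MANIFEST and not name.endswith("/" + EPT_MANIFEST):
        name = name.rstrip("/") + "/" + EPT_MANIFEST
    return name


def build_pipeline(filename: str, bounds: str, output: str) -> dict:
    """Build a reader-plus-output pipeline like the ones quoted in this thread."""
    reader = {
        "type": "readers.ept",
        "filename": normalize_ept_filename(filename),
        "bounds": bounds,
    }
    return {"pipeline": [reader, output]}


if __name__ == "__main__":
    root = ("https://s3-us-west-2.amazonaws.com/usgs-lidar-public/"
            "USGS_LPC_CA_Central_Valley_2017_LAS_2019")
    pipeline = build_pipeline(
        "ept://" + root,
        "([-13484500, -13484200], [4653000,4654200])",
        "points_CA.laz",
    )
    print(json.dumps(pipeline, indent=4))
```

The printed JSON can be saved as pipeline.json and run with
"pdal pipeline pipeline.json" as in the tests below.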

---------------------------
Matthew Beckley
Data Engineer
UNAVCO/OpenTopography
beckley at unavco.org
cell: 301-982-9819


On Tue, Dec 15, 2020 at 11:21 AM Connor Manning <connor at hobu.co> wrote:

> In one of the last few releases (not sure which) we tried to move away
> from the "ept://" pseudo-protocol and instead use the presence of
> "ept.json" at the end to signify the EPT reader.  So please try using a
> filename of
> https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
> instead - this is the recommended format from now on.  I think we kept
> support for both formats for at least one release.
>
> This change was made for a few reasons: when accessing data over a
> network, the double protocol (ept://http://...) is strange, and using the
> root directory rather than the ept.json filename means that your "filename"
> option is not a real file.  For example,
> https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
> is a 404, but
> https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
> is an actual file.
>
> I wouldn't worry much about the file size difference here since your point
> counts match: since the EPT reader runs in a multi-threaded fashion, the
> order of points may vary between runs, which leads to slight differences in
> the compression.  You could add a "filters.sort" after the EPT reader to
> counteract this (for LAZ data I'd recommend sorting by GpsTime and maybe
> secondarily by ReturnNumber).
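
A sketch of adding that sort stage, in plain Python with only the standard
library. The helper name add_sort_stage is hypothetical; "filters.sort" and
its "dimension" option are PDAL's. I'm assuming a single-dimension sort on
GpsTime, so the secondary ReturnNumber ordering is not attempted here.

```python
import json


def add_sort_stage(pipeline: dict, dimension: str = "GpsTime") -> dict:
    """Return a new pipeline with a filters.sort stage after the first stage."""
    stages = list(pipeline["pipeline"])
    sort_stage = {"type": "filters.sort", "dimension": dimension}
    # Insert the sort directly after the reader so the writer sees ordered points.
    return {"pipeline": stages[:1] + [sort_stage] + stages[1:]}


if __name__ == "__main__":
    base = {
        "pipeline": [
            {
                "type": "readers.ept",
                "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/"
                            "USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json",
                "bounds": "([-13484500, -13484200], [4653000,4654200])",
            },
            "points_CA_sorted.laz",
        ]
    }
    print(json.dumps(add_sort_stage(base), indent=4))
```

If the point order is what drives the size difference, two runs of the
sorted pipeline should compress to much closer file sizes.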
>
> I'm not sure why your filesource_id would be changing, so maybe open a
> GitHub issue on that one.
>
> - Connor
>
> On Tue, Dec 15, 2020 at 12:04 PM Matt Beckley <beckley at unavco.org> wrote:
>
>> Hello,
>>
>> It seems that when reading EPT data from the AWS 3DEP Entwine bucket,
>> the reader will not work unless I add the prefix "ept://" to the URL (see
>> examples below).  This applies only to PDAL v2.2, and it is not clear if
>> this is a dataset-specific issue.  PDAL 2.1 will run with or without the
>> ept:// prefix, but it has the odd result that the file sizes differ
>> slightly depending on whether the ept:// prefix is used.  Point counts
>> are the same either way with PDAL 2.1, but the "filesource_id"
>> parameter differs, so the file size differences are probably due to slight
>> header differences.  Regarding the PDAL 2.2 EPT issue, so far this seems
>> to happen on the following AWS 3DEP Entwine datasets:
>>
>> USGS LPC CA Central Valley 2017 LAS 2019
>> CO_Southwest_NRCS_B2_2018
>> TX WestTexas B1 2018
>> NM SouthCentral B8 2018
>>
>> *My question:*  For PDAL v2.2, should I always use the EPT:// prefix
>> when using readers.ept?  (seems related to:
>> https://github.com/PDAL/PDAL/pull/3174).  Also, as an aside, is
>> streaming available for readers.ept?  Documentation doesn't indicate it is,
>> but this issue: https://github.com/PDAL/PDAL/issues/2439 makes it seem
>> that maybe it is?  I'm uncertain how to test this.
>>
>> Any info you could provide would be most appreciated.
>>
>> Test1:  PDAL 2.2 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
>> environment):
>>
>> {
>>     "pipeline": [{
>>         "type": "readers.ept",
>>         "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>     },
>>        "points_CA_noept.laz"]}
>>
>> pdal pipeline pipeline.json gives error:
>>
>> PDAL: readers.ept: Could not read from
>> s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
>>
>> Test2:  PDAL 2.2 WITH EPT:// Prefix (PDAL installed via isolated conda
>> environment):
>> {
>>     "pipeline": [{
>>         "type": "readers.ept",
>>         "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>     },
>>        "points_CA_wept.laz"]}
>>
>> pdal pipeline pipeline.json runs successfully
>>
>>
>> Test3:  PDAL 2.1 WITH EPT:// Prefix (PDAL installed via isolated conda
>> environment):
>> {
>>     "pipeline": [{
>>         "type": "readers.ept",
>>         "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>     },
>>        "points_CA_wept_v21.laz"]}
>>
>> pdal pipeline pipeline.json runs successfully, filesize is: 3453289
>> bytes.  "count": 956938
>>
>>
>> Test4:  PDAL 2.1 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
>> environment):
>> {
>>     "pipeline": [{
>>         "type": "readers.ept",
>>         "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>     },
>>        "points_CA_NOept_v21.laz"]}
>>
>> pdal pipeline pipeline.json runs successfully, but filesize is different:
>>  3479637 bytes. "count": 956938
>>
>>
>> Point counts for results from PDAL 2.1 run match.  Only difference is
>> "filesource_id".  Version without EPT:// prefix has filesource_id=0, while
>> with EPT:// prefix "filesource_id": 26982.
>> ---------------------------
>> Matthew Beckley
>> Data Engineer
>> UNAVCO/OpenTopography
>> beckley at unavco.org
>> cell: 301-982-9819
>> _______________________________________________
>> pdal mailing list
>> pdal at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/pdal
>>
>