[pdal] EPT:// prefix issue with PDAL 2.2

Connor Manning connor at hobu.co
Tue Dec 15 10:30:54 PST 2020


Yes, it is.  The "streamable" tag was missing from the documentation for
the EPT reader; I've just added it, so it should be displayed the next
time the docs are generated.

On Tue, Dec 15, 2020 at 12:25 PM Matt Beckley <beckley at unavco.org> wrote:

> Hi Connor,
>
> Thanks for the quick and informative reply.  I will implement the filename
> as you suggested.  A quick follow-up:  Is streaming enabled on readers.ept?
>
> ---------------------------
> Matthew Beckley
> Data Engineer
> UNAVCO/OpenTopography
> beckley at unavco.org
> cell: 301-982-9819
>
>
> On Tue, Dec 15, 2020 at 11:21 AM Connor Manning <connor at hobu.co> wrote:
>
>> In one of the last few releases (not sure which) we tried to move away
>> from the "ept://" pseudo-protocol and instead use the presence of
>> "ept.json" at the end to signify the EPT reader.  So please try using a
>> filename of
>> https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
>> instead - this is the recommended format from now on.  I think we kept
>> support for both formats for at least one release.
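>>
>> As a rough sketch of what that looks like in a pipeline (assuming the
>> same bounds as in your tests; the output name "points_CA_eptjson.laz" is
>> only a placeholder):
>>
>> {
>>     "pipeline": [
>>         {
>>             "type": "readers.ept",
>>             "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json",
>>             "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>         },
>>         "points_CA_eptjson.laz"
>>     ]
>> }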
>>
>> This change was made for a few reasons.  When accessing data over a
>> network, the double protocol (ept://http://...) is strange.  Also, using
>> the root directory rather than the ept.json filename means that your
>> "filename" option is not a real file, e.g.
>> https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
>> is a 404, but
>> https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json
>> is an actual file.
>>
>> I wouldn't worry much about the file size difference here since your
>> point counts match: since the EPT reader runs in a multi-threaded fashion,
>> the order of points may vary between runs, which leads to slight
>> differences in the compression.  You could add a "filters.sort" after the
>> EPT reader to counteract this (for LAZ data I'd recommend sorting by
>> GpsTime and maybe secondarily by ReturnNumber).
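>>
>> A minimal sketch of that, assuming the ept.json filename above and a
>> placeholder output name.  It sorts on GpsTime only, since as far as I
>> know filters.sort takes a single "dimension" option:
>>
>> {
>>     "pipeline": [
>>         {
>>             "type": "readers.ept",
>>             "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019/ept.json",
>>             "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>         },
>>         {
>>             "type": "filters.sort",
>>             "dimension": "GpsTime"
>>         },
>>         "points_CA_sorted.laz"
>>     ]
>> }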
>>
>> I'm not sure why your filesource_id would be changing, so maybe open a
>> GitHub issue on that one.
>>
>> - Connor
>>
>> On Tue, Dec 15, 2020 at 12:04 PM Matt Beckley <beckley at unavco.org> wrote:
>>
>>> Hello,
>>>
>>> It seems that when reading EPT data from the AWS 3DEP Entwine bucket,
>>> the reader will not work unless I add the prefix "ept://" to the URL (see
>>> examples below).  This applies only to PDAL v2.2, and it is not clear if
>>> this is a dataset-specific issue.  PDAL 2.1 will run with or without the
>>> ept:// prefix, but with the odd result that the file sizes differ
>>> slightly depending on whether the prefix is used.  Point counts are the
>>> same whether or not you use ept:// with PDAL 2.1, but the "filesource_id"
>>> parameter differs, so the file size differences are probably due to slight
>>> header differences.  With regard to the PDAL 2.2 EPT issue, so far this
>>> seems to happen on the following AWS 3DEP Entwine datasets:
>>>
>>> USGS LPC CA Central Valley 2017 LAS 2019
>>> CO_Southwest_NRCS_B2_2018
>>> TX WestTexas B1 2018
>>> NM SouthCentral B8 2018
>>>
>>> *My question:*  For PDAL v2.2, should I always use the EPT:// prefix
>>> when using readers.ept?  (seems related to:
>>> https://github.com/PDAL/PDAL/pull/3174).  Also, as an aside, is
>>> streaming available for readers.ept?  Documentation doesn't indicate it is,
>>> but this issue: https://github.com/PDAL/PDAL/issues/2439 makes it seem
>>> that maybe it is?  I'm uncertain how to test this.
>>>
>>> Any info you could provide would be most appreciated.
>>>
>>> Test1:  PDAL 2.2 WITHOUT EPT:// Prefix (PDAL installed via isolated
>>> conda environment):
>>>
>>> {
>>>     "pipeline": [{
>>>         "type": "readers.ept",
>>>         "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>>     },
>>>     "points_CA_noept.laz"]
>>> }
>>>
>>> pdal pipeline pipeline.json gives error:
>>>
>>> PDAL: readers.ept: Could not read from
>>> s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019
>>>
>>> Test2:  PDAL 2.2 WITH EPT:// Prefix (PDAL installed via isolated conda
>>> environment):
>>> {
>>>     "pipeline": [{
>>>         "type": "readers.ept",
>>>         "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>>     },
>>>     "points_CA_wept.laz"]
>>> }
>>>
>>> pdal pipeline pipeline.json runs successfully
>>>
>>>
>>> Test3:  PDAL 2.1 WITH EPT:// Prefix (PDAL installed via isolated
>>> conda environment):
>>> {
>>>     "pipeline": [{
>>>         "type": "readers.ept",
>>>         "filename": "ept://https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>>     },
>>>     "points_CA_wept_v21.laz"]
>>> }
>>>
>>> pdal pipeline pipeline.json runs successfully, filesize is: 3453289
>>> bytes.  "count": 956938
>>>
>>>
>>> Test4:  PDAL 2.1 WITHOUT EPT:// Prefix (PDAL installed via isolated conda
>>> environment):
>>> {
>>>     "pipeline": [{
>>>         "type": "readers.ept",
>>>         "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/USGS_LPC_CA_Central_Valley_2017_LAS_2019",
>>>         "bounds": "([-13484500, -13484200], [4653000,4654200])"
>>>     },
>>>     "points_CA_NOept_v21.laz"]
>>> }
>>>
>>> pdal pipeline pipeline.json runs successfully, but filesize is
>>> different:  3479637 bytes. "count": 956938
>>>
>>>
>>> Point counts for results from the PDAL 2.1 runs match.  The only
>>> difference is "filesource_id": the version without the EPT:// prefix has
>>> filesource_id=0, while the version with the EPT:// prefix has
>>> filesource_id=26982.
>>> ---------------------------
>>> Matthew Beckley
>>> Data Engineer
>>> UNAVCO/OpenTopography
>>> beckley at unavco.org
>>> cell: 301-982-9819
>>>
>>