[Landsat-pds] Upcoming updates to Landsat on AWS data

Amit Kapadia amit at planet.com
Wed Nov 4 12:14:04 PST 2015


Oh, I should have mentioned that many of these were initially released as
OLI-only scenes, then later released as OLI + TIRS. The difference in prefix
(LO8 vs LC8) may be why these were missed.
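
A minimal sketch of how such re-released scenes could be detected. It assumes the pre-collection 21-character scene ID layout (sensor prefix, path, row, year, day of year, ground station, version). The function names and the sample IDs in the usage note are hypothetical and are not taken from the ingestion code.

```python
def acquisition_key(scene_id):
    """Return the path/row/year/day part of a Landsat 8 scene ID.

    Assumes the 21-character layout LXSPPPRRRYYYYDDDGSIVV, so characters
    3..15 identify the acquisition regardless of the LC8/LO8 prefix.
    """
    if len(scene_id) != 21:
        raise ValueError("unexpected scene ID length: %r" % scene_id)
    return scene_id[3:16]


def rereleased_scenes(usgs_ids, hosted_ids):
    """List USGS scene IDs whose acquisition is hosted only under another prefix."""
    hosted_prefixes = {}
    for scene_id in hosted_ids:
        key = acquisition_key(scene_id)
        hosted_prefixes.setdefault(key, set()).add(scene_id[:3])

    found = []
    for scene_id in usgs_ids:
        prefixes = hosted_prefixes.get(acquisition_key(scene_id))
        # Hosted, but never under the prefix USGS currently lists.
        if prefixes is not None and scene_id[:3] not in prefixes:
            found.append(scene_id)
    return sorted(found)
```

For example, if the bucket holds LO80010022015001LGN00 and USGS now lists LC80010022015001LGN00, the function reports the LC8 ID as a scene that needs to be fetched again.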

On Wed, Nov 4, 2015 at 12:11 PM, Amit Kapadia <amit at planet.com> wrote:

> Jed,
>
> Looks like we're missing 1001 scenes, or about 0.5% of scenes made
> available in 2015. List of these scenes at:
>
> https://gist.github.com/kapadia/f0562641d4fb01509d24
>
> I'll get us going on ingesting these scenes.
>
> Cheers,
> Amit
>
> On Fri, Oct 30, 2015 at 5:51 PM, Sundwall, Jed <jed at amazon.com> wrote:
>
>> Amit, I don't have a list, but as I mentioned earlier on this thread, we
>> have a customer who estimates that we're missing about 2.4%. I can reach out
>> to him to see if he has determined the IDs of those scenes.
>>
>> On Oct 29, 2015, at 5:16 PM, Amit Kapadia <amit at planet.com> wrote:
>>
>> Jed,
>>
>> Do you have a list of scenes that we're missing? If not, I can figure
>> this out.
>>
>> Regarding errors, it's been up and down. It's clear that the API is in
>> constant development. Last week we hit more errors than usual. The service
>> seems to have stabilized this week. It would be nice to chat with one of
>> their developers to get a better understanding of the API and have someone
>> to contact directly when we encounter bugs.
>>
>> Cheers,
>> Amit
>>
>> On Thu, Oct 29, 2015 at 6:10 PM, Sundwall, Jed <jed at amazon.com> wrote:
>>
>>> Amit, this is great news! Before I reach out to USGS, should we use the
>>> extra account to acquire any other scenes that we missed throughout the
>>> year?
>>>
>>> Also, are we still getting a lot of errors?
>>>
>>>
>>> On Oct 26, 2015, at 6:04 AM, Amit Kapadia <amit at planet.com> wrote:
>>>
>>> Good news. We've finished ingesting the ~53,000 reprocessed scenes.
>>>
>>> Jed - you can follow up with USGS to revoke the extra account.
>>>
>>> Cheers,
>>> Amit
>>>
>>>
>>>
>>> On Fri, Oct 2, 2015 at 4:46 PM, Amit Kapadia <amit at planet.com> wrote:
>>>
>>>> Jed - I can't give a definitive answer, but I suspect we'll start to
>>>> fall behind. I just checked our ingestion from September, and we're doing
>>>> well. All images released in September were uploaded to S3 by Oct 1. To
>>>> keep this pace, we do have a machine running all the time. Our ingestion
>>>> job has started to fail about 1/3 of the time due to the new rate limiting.
>>>> It would be nice to understand the full scope of these constraints. Ideally
>>>> we'd be able to talk to one of the developers to better understand how best
>>>> to operate.
>>>>
>>>> On Wed, Sep 30, 2015 at 12:50 PM, Sundwall, Jed <jed at amazon.com> wrote:
>>>>
>>>>> Thanks for the update, Amit. Is it possible that this new limit could
>>>>> cause us to fall behind in acquiring all new scenes as they’re produced
>>>>> each day?
>>>>>
>>>>> On Sep 30, 2015, at 12:00 PM, Amit Kapadia <amit at planet.com> wrote:
>>>>>
>>>>> Hey Jed,
>>>>>
>>>>> Thanks for reaching out to them. Looks like we have another
>>>>> rate-limiting error to handle:
>>>>>
>>>>> usgs.USGSError: RATE_LIMIT: Rate limit exceeded - cannot support
>>>>> simultaneous requests.
>>>>>
>>>>> According to the changelog of the USGS inventory service:
>>>>>
>>>>> August 2015
>>>>>
>>>>>  * Implemented single-stream rate limiting
>>>>>  * Added FGDC Metadata URL to search and metadata responses
>>>>>  * API Key is now required for all requests
>>>>>
>>>>> Despite the change being made in August, we're only now starting to
>>>>> see this error. Previously, we were allowed 2 simultaneous downloads per
>>>>> machine. This has been cut in half. To keep up with the flow of Landsat
>>>>> scenes, we need simultaneous requests. This error is cropping up
>>>>> periodically in our re-ingestion of the ~53,000 scenes, as well as our
>>>>> daily ingestion.
>>>>>
>>>>> Enforcing single-stream per machine is a terrible waste of computing
>>>>> resources.
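
One common way to live with a single-stream limit is to retry a rejected request after an increasing delay. The sketch below shows that pattern only; `RateLimitError` is a hypothetical stand-in for the RATE_LIMIT error quoted above, and `call_with_backoff` and its parameters are assumptions, not part of the USGS client.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the RATE_LIMIT error raised by the client."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry on RateLimitError with exponential backoff.

    The delay doubles after each failed attempt. If the final attempt also
    fails, the error is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. Backoff only smooths over rejected requests; it does not restore the lost throughput.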
>>>>>
>>>>> Also note the need for an API key for all requests. Previously, anyone
>>>>> was able to programmatically access metadata. This is no longer possible.
>>>>>
>>>>> Any help would be appreciated.
>>>>>
>>>>> Cheers,
>>>>> Amit
>>>>>
>>>>> On Mon, Sep 28, 2015 at 3:38 PM, Sundwall, Jed <jed at amazon.com> wrote:
>>>>>
>>>>>> I’ve reached out to USGS to ask if we can increase the limit.
>>>>>>
>>>>>> Thanks for the update, Amit!
>>>>>>
>>>>>>
>>>>>> On Sep 28, 2015, at 12:16 PM, Amit Kapadia <amit at planet.com> wrote:
>>>>>>
>>>>>> Another update on the reingestion of these ~53,000 scenes. We're
>>>>>> moving along faster than the initial few weeks. Currently we have ~28,500
>>>>>> scenes left to reprocess. This is taking a bit of time, mostly because USGS
>>>>>> rate limits the number of scenes that can be simultaneously downloaded.
>>>>>>
>>>>>> Jed - we often hit an error of this sort:
>>>>>>
>>>>>> DOWNLOAD_RATE_LIMIT: User currently has more than 10 downloads that
>>>>>> have not been attempted in the past 10 minutes.
>>>>>>
>>>>>> If there's a way we can work with USGS on getting this type of
>>>>>> rate-limiting lifted, I'll be able to spin up additional workers, breaking
>>>>>> through this 10 scene limit. No big deal if that's not possible.
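
While the limit stands, a worker pool can be capped so that no more than 10 downloads are ever pending. This is a rough sketch under that assumption. `fetch` is a hypothetical callable that requests a download URL and starts the download right away, and the figure of 10 is taken from the error message above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PENDING = 10  # limit quoted in the DOWNLOAD_RATE_LIMIT message


def download_all(scene_ids, fetch, max_workers=MAX_PENDING):
    """Run fetch(scene_id) for every scene with a bounded number in flight.

    The pool size is clamped to MAX_PENDING, so a larger max_workers value
    cannot push the number of pending downloads past the limit.
    Results are returned in the same order as scene_ids.
    """
    workers = max(1, min(max_workers, MAX_PENDING))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, scene_ids))
```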
>>>>>>
>>>>>> Cheers,
>>>>>> Amit
>>>>>>
>>>>>> On Tue, Sep 15, 2015 at 10:58 AM, Amit Kapadia <amit at planet.com>
>>>>>> wrote:
>>>>>>
>>>>>>> We're ingesting about 1.35 scenes per minute (~2000 scenes per day).
>>>>>>> With 44,200 scenes remaining, this work should be complete in 22 - 23 days.
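
The estimate above can be checked with a few lines of arithmetic. The function name is hypothetical; the inputs are the figures quoted in the message.

```python
def days_remaining(scenes_left, scenes_per_minute):
    """Estimate days to finish at a steady ingestion rate."""
    scenes_per_day = scenes_per_minute * 60 * 24
    return scenes_left / scenes_per_day


# 1.35 scenes per minute is 1,944 scenes per day, so 44,200 scenes
# take between 22 and 23 days at that rate.
estimate = days_remaining(44200, 1.35)
```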
>>>>>>>
>>>>>>> The additional worker has kicked up the rate. I'm learning more
>>>>>>> about the rate-limiting that USGS imposes, and it seems that a single
>>>>>>> machine is limited to 2 concurrent downloads (we already knew this).
>>>>>>> However, we have 3 machines running, so the rate-limiting appears to be a
>>>>>>> combination between IP address and EROS account.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Amit
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 14, 2015 at 3:58 PM, Sundwall, Jed <jed at amazon.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for the update, Amit. Could you please let us know if you
>>>>>>>> see that the extra workers have upped our rate? Also, can you estimate
>>>>>>>> when this will be done?
>>>>>>>>
>>>>>>>> Thank you very much for your work on this!
>>>>>>>>
>>>>>>>> Jed.
>>>>>>>>
>>>>>>>> On Sep 14, 2015, at 2:16 PM, Amit Kapadia <amit at planet.com> wrote:
>>>>>>>>
>>>>>>>> Hi all - an update to the ingestion of these reprocessed Landsat
>>>>>>>> scenes. Using the additional bandwidth that Jed locked down, we've ingested
>>>>>>>> ~8,000 of the 52,877 scenes. This has been moving a little slow, so I've
>>>>>>>> bumped up the number of workers.
>>>>>>>>
>>>>>>>> In the past we've been restricted to 2 concurrent downloads from
>>>>>>>> USGS servers, but it now seems that we're able to get 4 concurrent
>>>>>>>> downloads. I'll try our luck with one more worker (2 more downloads) to see
>>>>>>>> if we're allowed this luxury.
>>>>>>>>
>>>>>>>> Ingestion of new Landsat scenes continues as normal.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Amit
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 26, 2015 at 12:08 PM, Sundwall, Jed <jed at amazon.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Quick update:
>>>>>>>>>
>>>>>>>>> We have been granted additional bandwidth to acquire Landsat data
>>>>>>>>> from EROS and will use it to reacquire 53,206 scenes that have been
>>>>>>>>> reprocessed with updated TIRS data as described at
>>>>>>>>> http://landsat.usgs.gov/calibration_notices.php
>>>>>>>>>
>>>>>>>>> We will also use this opportunity to check for any scenes that we
>>>>>>>>> may have failed to acquire throughout 2015. Another user of the data
>>>>>>>>> recently pointed out that "USGS states that there are 149307 scenes so far
>>>>>>>>> in 2015, but AWS claims to host only 145746 of them. As a percentage, that
>>>>>>>>> is 97.6% - IOW 2.4% are missing." These scenes may be missing from the
>>>>>>>>> bucket or they may merely be missing from the scene_list.gz file.
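
A sketch of how the quoted figures and the scene list could be checked. It assumes scene_list.gz is a gzipped CSV with a header row whose scene ID column is named `entityId`. That column name and both function names are assumptions for illustration.

```python
import csv
import gzip
import io


def hosted_scene_ids(gz_bytes, id_column="entityId"):
    """Return the set of scene IDs listed in a gzipped CSV scene list."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as fh:
        return {row[id_column] for row in csv.DictReader(fh)}


def coverage(usgs_count, hosted_count):
    """Return (missing_count, hosted_percent) for the two totals."""
    missing = usgs_count - hosted_count
    return missing, 100.0 * hosted_count / usgs_count


# With the totals quoted above: 149307 - 145746 = 3561 scenes unaccounted for.
missing, percent = coverage(149307, 145746)
```

Comparing the set from `hosted_scene_ids` against a listing of the bucket itself would show whether a scene is absent from the bucket or only from the list file.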
>>>>>>>>>
>>>>>>>>> I’ll update the list once the reacquisition is complete.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Jed Sundwall – Open Data – Amazon Web Services
>>>>>>>>>
>>>>>>>>> cell: 801-949-1482
>>>>>>>>> office: 206-435-3104
>>>>>>>>>
>>>>>>>>> https://aws.amazon.com/opendata/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Landsat-pds mailing list
>>>>>>>>> Landsat-pds at lists.osgeo.org
>>>>>>>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/landsat-pds
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>