[Landsat-pds] Upcoming updates to Landsat on AWS data

Amit Kapadia amit at planet.com
Wed Nov 4 12:11:09 PST 2015


Jed,

Looks like we're missing 1001 scenes, or about 0.5% of scenes made
available in 2015. List of these scenes at:

https://gist.github.com/kapadia/f0562641d4fb01509d24

I'll get us going on ingesting these scenes.

Cheers,
Amit
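
A list like the one in the gist above can be produced as a set difference between the scene IDs USGS reports and the IDs already hosted. A minimal sketch, assuming both inventories are available as plain collections of scene IDs; `find_missing_scenes`, `missing_percentage`, and the sample IDs are hypothetical and are not the script that generated the gist.

```python
def find_missing_scenes(usgs_ids, hosted_ids):
    """Return the sorted scene IDs present in usgs_ids but absent from hosted_ids."""
    return sorted(set(usgs_ids) - set(hosted_ids))


def missing_percentage(missing_count, total_count):
    """Share of missing scenes as a percentage of the total."""
    return 100.0 * missing_count / total_count


# Hypothetical sample IDs, for illustration only.
usgs = ["LC80010012015001LGN00", "LC80010022015001LGN00", "LC80010032015001LGN00"]
hosted = ["LC80010012015001LGN00", "LC80010032015001LGN00"]
missing = find_missing_scenes(usgs, hosted)
```

Using sets keeps the comparison linear in the number of scenes, which matters little at this scale but avoids a nested scan.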

On Fri, Oct 30, 2015 at 5:51 PM, Sundwall, Jed <jed at amazon.com> wrote:

> Amit, I don't have a list, but as I mentioned earlier on this thread, we
> have a customer who estimates that we're missing about 2.4%. I can reach out
> to him to see if he has determined the IDs of those scenes.
>
> On Oct 29, 2015, at 5:16 PM, Amit Kapadia <amit at planet.com> wrote:
>
> Jed,
>
> Do you have a list of scenes that we're missing? If not, I can figure this
> out.
>
> Regarding errors, it's ups and downs. It's clear that the API is in
> constant development. Last week we hit more errors than usual. The service
> seems to have stabilized this week. It would be nice to chat with one of
> their developers to get a better understanding of the API and have someone
> to contact directly when we encounter bugs.
>
> Cheers,
> Amit
>
> On Thu, Oct 29, 2015 at 6:10 PM, Sundwall, Jed <jed at amazon.com> wrote:
>
>> Amit, this is great news! Before I reach out to USGS, should we use the
>> extra account to acquire any other scenes that we missed throughout the
>> year?
>>
>> Also, are we still getting a lot of errors?
>>
>>
>> On Oct 26, 2015, at 6:04 AM, Amit Kapadia <amit at planet.com> wrote:
>>
>> Good news. We've finished ingesting the ~53,000 reprocessed scenes.
>>
>> Jed - you can follow up with USGS to revoke the extra account.
>>
>> Cheers,
>> Amit
>>
>>
>>
>> On Fri, Oct 2, 2015 at 4:46 PM, Amit Kapadia <amit at planet.com> wrote:
>>
>>> Jed - I can't give a definitive answer, but I suspect we'll start to
>>> fall behind. I just checked our ingestion from September, and we're doing
>>> well. All images released in September were uploaded to S3 by Oct 1. To
>>> keep this pace, we do have a machine running all the time. Our ingestion
>>> job has started to fail about 1/3 of the time due to the new rate limiting.
>>> It would be nice to understand the full scope of these constraints. Ideally
>>> we'd be able to talk to one of the developers to better understand how best
>>> to operate.
>>>
>>> On Wed, Sep 30, 2015 at 12:50 PM, Sundwall, Jed <jed at amazon.com> wrote:
>>>
>>>> Thanks for the update, Amit. Is it possible that this new limit could
>>>> cause us to fall behind in acquiring all new scenes as they’re produced
>>>> each day?
>>>>
>>>> On Sep 30, 2015, at 12:00 PM, Amit Kapadia <amit at planet.com> wrote:
>>>>
>>>> Hey Jed,
>>>>
>>>> Thanks for reaching out to them. Looks like we have another
>>>> rate-limiting error to handle:
>>>>
>>>> usgs.USGSError: RATE_LIMIT: Rate limit exceeded - cannot support
>>>> simultaneous requests.
>>>>
>>>> According to the changelog of the USGS inventory service:
>>>>
>>>> August 2015
>>>>
>>>>  * Implemented single-stream rate limiting
>>>>  * Added FGDC Metadata URL to search and metadata responses
>>>>  * API Key is now required for all requests
>>>>
>>>> Despite the change being made in August, we're only now starting to see
>>>> this error. Previously, we were allowed 2 simultaneous downloads per
>>>> machine. This has been cut in half. To keep up with the flow of Landsat
>>>> scenes, we need simultaneous requests. This error is cropping up
>>>> periodically in our re-ingestion of the ~53,000 scenes, as well as our
>>>> daily ingestion.
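One common way for a client to cope with an intermittent error like this is to retry with exponential backoff. A minimal sketch, assuming the failure surfaces as an exception; `RateLimitError` and `call_with_backoff` are hypothetical names, not part of the usgs library, and the delays are illustrative only.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the RATE_LIMIT error quoted above."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(); on RateLimitError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

Backoff only smooths over transient rejections. It does not restore the throughput lost to a single-stream limit.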
>>>>
>>>> Enforcing single-stream per machine is a terrible waste of computing
>>>> resources.
>>>>
>>>> Also note the need for an API key for all requests. Previously, anyone
>>>> was able to programmatically access metadata. This is no longer possible.
>>>>
>>>> Any help would be appreciated.
>>>>
>>>> Cheers,
>>>> Amit
>>>>
>>>> On Mon, Sep 28, 2015 at 3:38 PM, Sundwall, Jed <jed at amazon.com> wrote:
>>>>
>>>>> I’ve reached out to USGS to ask if we can increase the limit.
>>>>>
>>>>> Thanks for the update, Amit!
>>>>>
>>>>>
>>>>> On Sep 28, 2015, at 12:16 PM, Amit Kapadia <amit at planet.com> wrote:
>>>>>
>>>>> Another update on the reingestion of these ~53,000 scenes. We're
>>>>> moving along faster than the initial few weeks. Currently we have ~28,500
>>>>> scenes left to reprocess. This is taking a bit of time, mostly because USGS
>>>>> rate limits the number of scenes that can be simultaneously downloaded.
>>>>>
>>>>> Jed - we often hit an error of this sort:
>>>>>
>>>>> DOWNLOAD_RATE_LIMIT: User currently has more than 10 downloads that
>>>>> have not been attempted in the past 10 minutes.
>>>>>
>>>>> If there's a way we can work with USGS on getting this type of
>>>>> rate-limiting lifted, I'll be able to spin up additional workers, breaking
>>>>> through this 10 scene limit. No big deal if that's not possible.
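A worker can stay under a fixed concurrency ceiling by capping the number of downloads in flight. A minimal sketch using a bounded thread pool; `download_all` and the `download` callable are hypothetical, and whether a client-side cap like this is enough to avoid DOWNLOAD_RATE_LIMIT is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(scene_ids, download, max_in_flight=2):
    """Run download(scene_id) for each ID with at most max_in_flight at once.

    Results come back in the same order as scene_ids.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(download, scene_ids))
```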
>>>>>
>>>>> Cheers,
>>>>> Amit
>>>>>
>>>>> On Tue, Sep 15, 2015 at 10:58 AM, Amit Kapadia <amit at planet.com>
>>>>> wrote:
>>>>>
>>>>>> We're ingesting about 1.35 scenes per minute (~2000 scenes per day).
>>>>>> With 44,200 scenes remaining, this work should be complete in 22 - 23 days.
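The estimate above follows from simple arithmetic on the figures quoted in this message. A short sketch for checking it; the function names are hypothetical.

```python
def scenes_per_day(scenes_per_minute):
    """Convert a per-minute ingestion rate to a per-day rate."""
    return scenes_per_minute * 60 * 24


def days_remaining(scenes_left, scenes_per_minute):
    """Days needed to ingest scenes_left at a constant rate."""
    return scenes_left / scenes_per_day(scenes_per_minute)


rate = scenes_per_day(1.35)        # roughly 1944 scenes per day
eta = days_remaining(44200, 1.35)  # roughly 22.7 days
```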
>>>>>>
>>>>>> The additional worker has kicked up the rate. I'm learning more about
>>>>>> the rate-limiting that USGS imposes, and it seems that a single machine is
>>>>>> limited to 2 concurrent downloads (we already knew this). However, we have
>>>>>> 3 machines running, so the rate-limiting appears to be based on a
>>>>>> combination of IP address and EROS account.
>>>>>>
>>>>>> Cheers,
>>>>>> Amit
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 14, 2015 at 3:58 PM, Sundwall, Jed <jed at amazon.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the update, Amit. Could you please let us know whether the
>>>>>>> extra workers have upped our rate? Also, can you estimate when this will
>>>>>>> be done?
>>>>>>>
>>>>>>> Thank you very much for your work on this!
>>>>>>>
>>>>>>> Jed.
>>>>>>>
>>>>>>> On Sep 14, 2015, at 2:16 PM, Amit Kapadia <amit at planet.com> wrote:
>>>>>>>
>>>>>>> Hi all - an update to the ingestion of these reprocessed Landsat
>>>>>>> scenes. Using the additional bandwidth that Jed locked down, we've ingested
>>>>>>> ~8,000 of the 52,877 scenes. This has been moving a little slow, so I've
>>>>>>> bumped up the number of workers.
>>>>>>>
>>>>>>> In the past we've been restricted to 2 concurrent downloads from
>>>>>>> USGS servers, but it now seems that we're able to get 4 concurrent
>>>>>>> downloads. I'll try our luck with one more worker (2 more downloads) to see
>>>>>>> if we're allowed this luxury.
>>>>>>>
>>>>>>> Ingestion of new Landsat scenes continues as normal.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Amit
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 26, 2015 at 12:08 PM, Sundwall, Jed <jed at amazon.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Quick update:
>>>>>>>>
>>>>>>>> We have been granted additional bandwidth to acquire Landsat data
>>>>>>>> from EROS and will use it to reacquire 53,206 scenes that have been
>>>>>>>> reprocessed with updated TIRS data as described at
>>>>>>>> http://landsat.usgs.gov/calibration_notices.php
>>>>>>>>
>>>>>>>> We will also use this opportunity to check for any scenes that we
>>>>>>>> may have failed to acquire throughout 2015. Another user of the data
>>>>>>>> recently pointed out that "USGS states that there are 149307 scenes so far
>>>>>>>> in 2015, but AWS claims to host only 145746 of them. As a percentage, that
>>>>>>>> is 97.6% - IOW 2.4% are missing." These scenes may be missing from the
>>>>>>>> bucket or they may merely be missing from the scene_list.gz file.
>>>>>>>>
>>>>>>>> I’ll update the list once the reacquisition is complete.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Jed Sundwall – Open Data – Amazon Web Services
>>>>>>>>
>>>>>>>> cell: 801-949-1482
>>>>>>>> office: 206-435-3104
>>>>>>>>
>>>>>>>> https://aws.amazon.com/opendata/
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Landsat-pds mailing list
>>>>>>>> Landsat-pds at lists.osgeo.org
>>>>>>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/landsat-pds
>>>>>>>>
>>>>>>>>