I could have told you…


Short addition to the Sentinel-2 packaging and ESA API matter – three weeks ago i mentioned that:

[…] Instead of browsing 200-300 packages per day you now have to deal with many thousands. This means the only feasible way to efficiently access Sentinel-2 data on a larger scale is now through automated tools.

And a few days ago:

Since ESA does not provide a bulk metadata download option you have to scrape the API for [obtaining bulk metadata].

Today ESA announced that they think the recent API instability is largely caused by users attempting to retrieve extremely large numbers of results from queries, and as a result they are limiting the maximum number of results returned by each query to 100. Three comments on that:

  • could have told them three weeks ago (maybe i should go into business as an oracle – although it really does not take a genius to predict that).
  • i doubt it will help much. It will create additional work for programmers around the world, who now have to implement automatic follow-up queries to deal with the 100-row limit, but in the end you will still have to serve out all the results. After all, hardly anyone runs these queries for fun, and there are now more than 160k entries to query, growing by about 4000-5000 every day (even though publication of Sentinel-2 imagery now lags by a week). Systems like this, which obviously do not scale to the level they are used at, fail at the weakest link. But fixing or protecting this point of failure does not mean the system will not fail somewhere else. If the whole download framework goes down just because of a few unexpected but perfectly valid queries, that is an indication of a much more fundamental design problem.
  • even if it is so last century, since today we all live in the cloud – offering bulk metadata download with some kind of daily/hourly diffs for updates might not be such a bad idea.
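
To illustrate the kind of automatic follow-up queries a 100-row limit forces on every client, here is a minimal Python sketch. It is not ESA's client code or API: `fetch_page` is a hypothetical callable standing in for whatever actually performs one query with a start offset and a row count, and `fake_fetch_page` is an invented stand-in so the example runs without network access.

```python
from typing import Callable, Iterator, List

PAGE_SIZE = 100  # the per-query cap from the announcement


def iter_all_results(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield every result by issuing follow-up queries page by page.

    fetch_page(start, rows) is a hypothetical function that returns at
    most `rows` entries beginning at offset `start`.
    """
    start = 0
    while True:
        page = fetch_page(start, page_size)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        start += len(page)


def fake_fetch_page(start: int, rows: int) -> List[dict]:
    """Invented stand-in for a real query: pretends 250 entries exist."""
    total = 250
    return [{"id": i} for i in range(start, min(start + rows, total))]


if __name__ == "__main__":
    results = list(iter_all_results(fake_fetch_page))
    print(len(results))
```

The server still ends up delivering every row, just in more requests: at 100 rows per query, the 160k entries mentioned above take about 1600 queries for a full listing, plus roughly 40-50 more per day to keep up with new entries.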

By the way (in case you wonder) – it was not me. When writing my scripts for the coverage analysis i was already working with a hundred-row limit. This was already the documented limit for the OData API anyway, so it seemed appropriate to use it in general as well.
