Clouds belong to optical satellite images like manure belongs to horses – they are inevitably connected but still we generally prefer the latter without the former and put a lot of effort into separating the two things.
So even if clouds in satellite images can have their own beauty, or in some cases can even be the reason why the image is taken (e.g. by weather satellites), in most cases the ideal image is without clouds and any presence of clouds is an undesirable flaw. So when satellite images are assembled, especially for visualization purposes, the absence of visual traces of clouds is generally the aim and even a relatively rare occurrence of clouds tends to be a prominent deficit. A lot of satellite image products you see in various places are made with either an explicit claim of being cloud-free or at least with an implied statement along these lines.
But how successful are these attempts at cloud elimination and avoidance in reality? Interestingly i have never seen any hard numbers being specified regarding the lack of clouds so let's have a closer look.
At any given point in time more than half of the Earth's surface is covered with clouds. Locally this varies significantly over the course of a day, with a minimum over land in the morning. Since most earth observation satellites take pictures in the morning and clouds are less frequent over land than over ocean, the average cloud cover in satellite images taken without cloud cover based scheduling is probably about 40 percent. With optimized scheduling this improves further – the average automatically assessed cloud cover in Landsat 8 scenes for example is about 34 percent – although to be fair it needs to be pointed out that these scenes are not uniformly distributed globally.
The most widely used optical satellite image product is probably the Blue Marble Next Generation (BMNG), which i also mentioned several times here in the past. This is produced with the clear aim of being cloud free but the producers also acknowledged that this was not 100 percent successful. Looking at the July version and doing some pixel counting i concluded that there are probably around 75000 pixels in the image that can be considered to be severely affected by clouds – due to the way these images are produced these are either bright pixels or extremely noisy areas with hardly any color information being discernible below the noise. This means that about one in 50000 pixels in these images is severely affected by clouds.
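The arithmetic behind that ratio can be sketched in a few lines. Note that the grid size of 86400 × 43200 pixels for the 500 m version of the image is an assumption that is not stated above, and the function name is purely illustrative.

```python
# Assumed grid size of the 500 m global image (not stated in the text above).
BMNG_WIDTH = 86_400
BMNG_HEIGHT = 43_200


def inverse_cloud_fraction(cloudy_pixels: int, total_pixels: int) -> float:
    """Return N such that about one in N pixels is cloud affected."""
    if cloudy_pixels <= 0:
        raise ValueError("cloudy_pixels must be positive")
    return total_pixels / cloudy_pixels


total = BMNG_WIDTH * BMNG_HEIGHT  # total number of pixels in the assumed grid
ratio = inverse_cloud_fraction(75_000, total)
print(f"about one in {ratio:,.0f} pixels")
```

Under that assumption the result lands very close to the one in 50000 quoted above. The count of 75000 is itself only a rough estimate, so the rounding is well within its uncertainty.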
My own global color mosaic, the Green Marble, is advertised as 100% cloud free. At most you can observe a slight increase in noise levels in some areas as a result of clouds. Conservatively estimated i would say less than one in a million pixels is severely affected by clouds.
Comparing these with other satellite image mosaics is somewhat problematic since most of them are quite far from offering truly global coverage and the difficulty of cloud avoidance and elimination varies a lot depending on where on Earth you look. But let's have a look at images with near global coverage of more than 2/3 of the Earth's land surface.
At the other end of the scale in terms of cloud-freeness you have the legacy Landsat mosaics based on 1999-2003 data where clouds are minimized purely by scene selection. These mosaics are still used at the intermediate zoom levels in many web map services, such as those from Bing, Mapbox, Esri and Yandex. How many clouds are present in these mosaics varies a bit depending on scene selection but overall i would estimate the abundance of severely cloud affected pixels to be no better than about one in 2000 pixels. This is quite hard to measure precisely because it varies so much depending on where you look. In the tropics it can be as bad as one in 10 pixels.
The only other near-global Landsat mosaic currently in widespread use is the Google mosaic i reviewed when it was introduced. This is fairly good in cloud elimination although artefacts due to clouds are often difficult to distinguish from other errors. I would rate this around the same level as the Blue Marble Next Generation.
Then there is the Mapbox cloudless atlas image which has the aim of cloudlessness right in the name. This is fairly good but clearly somewhat worse than the BMNG – i would put it at around one in 20000 pixels.
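For comparison here are all these assessments together in a table. The inverse cloud fraction is the number N in "one in N pixels is severely affected by clouds", so higher is better:
Image | Date | Resolution [m] | Inverse cloud fraction |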
---|---|---|---|
Green Marble | 2011-2013 | 250 | >1M |
BMNG | 2004 | 500 | ~50k |
Google Landsat Mosaic | 2004-2013 | 15 | ~50k |
Mapbox Cloudless Atlas | 2012 | 250 | ~20k |
Legacy Landsat Mosaics | 1999-2003 | 15 | <2000 |
Raw satellite images | any | any | ~2.5-3 |
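To put the last row on the same scale as the others, a cloud cover percentage can be converted into an inverse cloud fraction, and an inverse fraction into parts per million. The following is just a small sketch of that conversion. The function names are made up for illustration and the input values are the ones quoted in the text.

```python
def inverse_from_percent(cloud_percent: float) -> float:
    """Convert a cloud cover percentage to 'one in N pixels'."""
    return 100.0 / cloud_percent


def ppm_from_inverse(n: float) -> float:
    """Convert 'one in N pixels' to cloud affected pixels per million."""
    return 1_000_000 / n


# Raw imagery: roughly 34 to 40 percent average cloud cover.
for pct in (40, 34):
    print(f"{pct}% cloud cover -> one in {inverse_from_percent(pct):.1f} pixels")

# Mosaics from the table, expressed as cloud affected pixels per million.
for n in (2_000, 20_000, 50_000, 1_000_000):
    print(f"one in {n:,} -> {ppm_from_inverse(n):g} per million")
```

The 34 to 40 percent range for raw images corresponds to roughly one in 2.5 to 3 pixels, which is where the last table row comes from.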
Now where do my own local Landsat mosaics stand in that regard? My aim is to have no more than about one in 100k pixels with significant cloud influence in these. The actual occurrence of clouds is probably often significantly lower but this is difficult to measure properly since counting cloud pixels at this rate would be about as much work as eliminating them. The small fraction of clouds that is still present in these is usually either clouds above snow in mountain areas or very small convective clouds over relatively flat land. Both these cases are hard to identify reliably in images, whether by automatic cloud classification or by manual inspection.
But of course clouds are only one of many things that influence the quality of a satellite image mosaic, so it is not a good idea to get too hung up on these numbers. But it is good to know the approximate relations.