New paper! An active learning cloud detection tool to generate reference cloud masks for Sentinel-2. Application to the validation of MAJA, Sen2Cor and FMask cloud masks

Example of a reference cloud mask generated by ALCD, and comparison with the cloud masks generated by three operational processors (Sen2Cor, FMask and MAJA). True positive invalid pixels appear in blue, true negatives in green, false negatives in red and false positives in purple.

It is not often that a trainee's work ends up as a peer-reviewed publication, but Louis Baetens was a brilliant trainee. During a six-month training period at CESBIO, funded by CNES, here is what Louis did:

  • developed an active learning method to generate reference cloud masks for Sentinel-2, using multi-temporal data as input
  • validated the quality of the produced masks (around 99% overall accuracy)
  • generated cloud and shadow masks covering 32 entire Sentinel-2 images
  • produced these same scenes with Sen2Cor 2.5.5, FMask 4.0 and MAJA 3.3
  • evaluated the results using ALCD masks
  • wrote a report and a user manual for ALCD
  • released the masks and tools on open access platforms
  • wrote (with Camille and myself) a scientific publication
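
To make the evaluation step more concrete, here is a minimal Python sketch of how a processor's mask can be compared with a reference mask, using the four categories of the figure caption (true positive, true negative, false negative, false positive) and the overall accuracy. The function names and the tiny example masks are hypothetical and only serve as an illustration; this is not the ALCD code.

```python
def compare_masks(reference, candidate):
    """Count TP/TN/FN/FP, where True means an invalid (cloud or shadow) pixel.

    Both masks are flat sequences of booleans of the same length.
    """
    if len(reference) != len(candidate):
        raise ValueError("masks must have the same number of pixels")
    counts = {"TP": 0, "TN": 0, "FN": 0, "FP": 0}
    for ref, cand in zip(reference, candidate):
        if ref and cand:
            counts["TP"] += 1  # invalid in both masks
        elif not ref and not cand:
            counts["TN"] += 1  # valid in both masks
        elif ref and not cand:
            counts["FN"] += 1  # invalid pixel missed by the processor
        else:
            counts["FP"] += 1  # valid pixel wrongly flagged as invalid
    return counts


def overall_accuracy(counts):
    """Fraction of pixels on which both masks agree."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total


# Hypothetical 8-pixel example (illustration only).
reference = [True, True, False, False, True, False, False, False]
candidate = [True, False, False, False, True, True, False, False]

counts = compare_masks(reference, candidate)
accuracy = overall_accuracy(counts)
print(counts, accuracy)
```

On real images, the same counts would be computed over all the pixels labelled in the reference mask.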


The publication was just released by Remote Sensing:

Baetens, L.; Desjardins, C.; Hagolle, O. Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. Remote Sens. 2019, 11, 433.


The remainder of the post provides a plain-language summary (but it's better to read the paper!)

A pause in MUSCATE production, end of February

The MUSCATE production centre will be offline for a week, from February 25th to March 4th (or maybe the week after; please check Theia's RSS feed for the exact date). This pause is necessary to upgrade the processing centre to V2.5. As a result, we will not be able to process Sentinel-2 data in real time for a week, and we hope this will not be too inconvenient for you. If you need some data urgently, you can ask PEPS to produce MAJA L2A data for you during that period.


The main change concerns the internal format used by MUSCATE for Sentinel-2 Level 2A products. This internal format is different from the external format that we distribute, which results in unnecessary product format conversions and makes it necessary to develop drivers for both formats (internal and external) in the processors that use Sentinel-2 L2A data within MUSCATE. To enable this modification, it will be necessary to convert all the L2A data from the internal format to the external format, which will take a whole week.


MUSCATE V2.5 will bring other improvements, such as the integration of MAJA 3.1, with the possibility to use Copernicus Atmosphere data, or a new version of LIS, the processor that delivers the snow maps.

MUSCATE V2.6 is also ready and queuing to be installed, with MAJA 3.2, WASP and the possibility to process Venµs L2A data within MUSCATE and not externally on the Venµs ground segment.




Release of a first batch of Sentinel-2A Level 2A data over the Sahel

A few weeks ago, we announced the selection of a new area for Theia's production of Sentinel-2 Level 2A data, in the Sahel. Production has started, and Theia has already produced the tiles of the UTM28 zone (in the west). The dark green tiles already existed, but we have added the light green ones, which cover the whole of Senegal, The Gambia, part of Guinea-Bissau and of Guinea, and the north of Sierra Leone.


The available data have been processed from January 1st, 2017 to yesterday, i.e. more than two years of data. New data will now be processed in real time as they arrive.

We will proceed in the same way with the other zones, from west to east: UTM29, UTM30...


S1Tiling: on-demand ortho-rectification of Sentinel-1 data on the Sentinel-2 grid


Sentinel-1 is currently the only system that regularly provides SAR images over all the land surfaces of the planet. Access to these image time series opens up an extraordinary range of applications.

To meet the needs of a large number of users, including our own, we have created an automatic processing chain that generates "ready-to-use" time series for a very large number of applications. Sentinel-1 data are ortho-rectified on the Sentinel-2 grid to make it easier to use the two missions together.

Improvement of water vapour retrieval in MAJA

As with the aerosol retrieval, the retrieval of water vapour in MAJA atmospheric correction has also been improved, thanks to the work of Elsa Bourgeois (Cap Gemini) and Camille Desjardins (CNES). An accurate estimate of water vapour is not necessary to perform an accurate atmospheric correction, because water vapour absorption in most Sentinel-2 bands is much lower than 5%. But the Sentinel-2 water vapour product could also prove useful, and when we plot validation results, showing a large bias for high water vapour contents is not nice.




Here is the kind of result we have had with MAJA from the beginning, with a large bias when the water vapour content is high:

Our very simple method uses the ratio between the Sentinel-2 B9 and B8A bands to estimate the water vapour. B9 is located within a water vapour absorption band at 940 nm, while B8A serves as reference and is only moderately affected by water vapour. The ratio is converted to a water vapour content using a look-up table (LUT), which is obtained from radiative transfer calculations. Our method assumes that the water vapour is above the scattering layer, which is obviously not true. The errors due to this assumption increase with the amount of water vapour.
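
The principle of the band-ratio method can be sketched in a few lines of Python. This is only an illustration: the LUT values below are invented, the function name is hypothetical, and the table is indexed by the ratio alone, which is a simplifying assumption of this sketch. It is not the MAJA implementation, whose table comes from radiative transfer calculations.

```python
from bisect import bisect_right

# Hypothetical LUT (illustrative values only): B9/B8A reflectance ratio
# versus water vapour content in g/cm². The ratio decreases when the
# water vapour content increases, since B9 lies in an absorption band.
LUT_RATIO = [0.20, 0.35, 0.50, 0.70, 0.95]  # sorted in ascending order
LUT_WATER_VAPOUR = [5.0, 3.0, 2.0, 1.0, 0.0]


def water_vapour_from_bands(b9, b8a):
    """Estimate water vapour (g/cm²) from B9 and B8A reflectances."""
    if b8a <= 0:
        raise ValueError("B8A reflectance must be positive")
    ratio = b9 / b8a
    # Outside the table, clamp to the nearest tabulated value.
    if ratio <= LUT_RATIO[0]:
        return LUT_WATER_VAPOUR[0]
    if ratio >= LUT_RATIO[-1]:
        return LUT_WATER_VAPOUR[-1]
    # Linear interpolation between the two neighbouring LUT nodes.
    i = bisect_right(LUT_RATIO, ratio)
    r0, r1 = LUT_RATIO[i - 1], LUT_RATIO[i]
    w0, w1 = LUT_WATER_VAPOUR[i - 1], LUT_WATER_VAPOUR[i]
    return w0 + (w1 - w0) * (ratio - r0) / (r1 - r0)


# A ratio of 0.6 falls halfway between the 0.50 and 0.70 nodes.
estimate = water_vapour_from_bands(0.12, 0.20)
print(round(estimate, 3))
```

With such a scheme, cancelling a bias empirically amounts to replacing the tabulated water vapour values while keeping the same interpolation.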


Elsa and Camille just empirically computed a new water vapour LUT to cancel this bias, and it works! As you can see, the RMS errors have been halved, from 0.2 g/cm² to 0.1 g/cm².

We will put this new parameter set in production in January within Theia, and make it available to the users of the MAJA processor.





MAJA 3.1.2 with CAMS option finally validated

We had announced quite a long time ago the coming availability of MAJA 3.1 to correct for the atmospheric effects on Sentinel-2, Landsat 8 or Venµs images. This version brings a significant improvement in the estimation of Aerosol Optical Thickness, thanks to the use of Copernicus Atmosphere Monitoring Service (CAMS) data to constrain the aerosol type. The details of the methods can be found here. Bastien Rouquié obtained them with our Python prototype of MAJA.


We then implemented them in the operational, fast version of MAJA. Although the validation tests of MAJA 3.1 were correct on the two test products we had defined, a large-scale validation using two-year time series over 10 sites showed that, instead of improving the results, the CAMS option was degrading them. We had to find the cause (a bad interpolation of CAMS data in space and time), correct the errors, and run a large validation again.


This time, the validation results improve a lot, as can be seen in the figures below.

Without CAMS option (left) / With CAMS option (right)

In the left column, we provide the results without the CAMS option, while in the right column it is activated. The top row compares the MAJA AOT with the AERONET AOT, used as reference, for eight sites in diverse landscapes. The bottom row provides an example for the well-known validation site in Mongu, Zambia. The blue dots correspond to good-quality aerosol measurements (no clouds, level 2.0 AERONET values), while the red dots correspond to degraded conditions (either clouds, or AERONET data that are not quality assured, i.e. level 1.5 data).


Using CAMS to constrain the aerosol type improves the results by 25%, compared to the use of a continental aerosol model everywhere. Errors for the quality-assured validation pixels decrease from 0.085 to 0.065 on the 8 sites, and from 0.143 to 0.094 at the Mongu site in Zambia. This site has various types of aerosols depending on the season, including dust, biomass burning and continental aerosols. The results are still far from perfect, and we have work for the next 5 years, but it is still good to have them improved!
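
As a quick check of the figures quoted above, the relative error reduction can be computed directly from the before and after values. The helper name is hypothetical and the snippet is plain arithmetic on the numbers given in this post.

```python
def relative_reduction(before, after):
    """Relative decrease of an error, as a fraction of its initial value."""
    return (before - after) / before


eight_sites = relative_reduction(0.085, 0.065)  # the 8 sites together
mongu = relative_reduction(0.143, 0.094)        # Mongu, Zambia

print(f"8 sites: {eight_sites:.1%}, Mongu: {mongu:.1%}")
```

This gives a reduction of roughly 24% for the 8 sites, consistent with the rounded 25% quoted above, and roughly 34% for Mongu.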


MAJA 3.1.2 is available starting from this link on GitHub, as an executable program for Linux. To be allowed to use it, you will have to sign the licence first, from this site. If you want to use it for commercial applications, you should ask for a special licence (still free of charge) by sending me an email. In January, I will provide the parameters needed to activate the CAMS option.


Regarding the production of Theia, our ground segment has been adapted to use MAJA version 3.1.2, and will soon be able to fetch the CAMS products from Copernicus Atmosphere. Then we will have an operational qualification phase, to check that we can download CAMS products in time for real-time production. We should be able to start using it in February or March. And after a few months, if the results are good, yoohoo, we will reprocess everything!


Many thanks to Bastien Rouquié (CESBIO), who did the scientific work, to Camille Desjardins, who helped with the validation, to Aurelien Bricier and Benjamin Esquis, at CS-SI, for coding the operational version, and to Peter Kettig (CNES), Bruno Angeniol (Cap Gemini) and Bastien, for checking the consistency between the prototype and operational versions.




THEIA: a new production area for Sentinel-2 L2A data in the Sahel


The production of Sentinel-2 Level 2A data with the MAJA chain by Theia's MUSCATE centre is now fast and efficient, so we have some margin to add new areas (the current areas can be seen here). French organisations interested in adding new areas are welcome to contact me; we should have about fifty tiles left to allocate, given the requests already received.


Some well-informed colleagues anticipated this call, among them Santiago Pena Luque, from CNES, who works for the SWOT project. As part of the preparation of the project, two large African river basins were chosen to host the first experiments: the Niger and Senegal basins. The applications will of course concern the monitoring of rivers, but also land cover and vegetation dynamics. Several laboratories will be involved, including GET, CESBIO and LEGOS in Toulouse, and the TETIS laboratory in Montpellier.

Here is the area accepted by Theia:


New Sentinel-2 processing area over the Sahel. In green, the tiles already processed; in blue, the tiles that we are going to add.

The area was defined under the following constraints:
- do not exceed 300 tiles
- propose a contiguous area
- avoid the cloudiest areas, such as the coasts of the Gulf of Guinea
- cover almost all of the Senegal and Niger basins (except for areas that are systematically cloudy or completely desert)
- where possible, complete adjacent administrative areas: for example, we were able to cover the whole of Senegal, The Gambia and Burkina Faso, all of western and southern Mali, the north of the Guineas, of Côte d'Ivoire, of Benin and of Nigeria, southern Niger and western Chad.


Production will start very soon, with the data acquired in December 2016. It will probably take us a few months to catch up with real-time processing. This will mean a lot of work for our production teams, but it is for a good cause.


Sentinel-2 Level 3A time series (July, August, September 2018)

If you are not afraid of spending too much time on it while you have urgent things to do, you may have a look at the mosaic of Sentinel-2 monthly syntheses for September over France. You may access each monthly synthesis using the following links:

Or you may also use the viewer below to compare with the previous months and see how France became brown in September:

The monthly syntheses are produced using the WASP processor, which is described here.

By comparing the various syntheses, you will see the evolution of the landscape, generally much browner in September, but this representation will also help you spot the composite artefacts. These are not very numerous, but you will see them:

  • on some web browsers (Firefox V58), geometrical differences appear even at a low resolution. Other browsers and versions do not have this defect. It is really not due to Sentinel-2 or Theia products
  • above water and snow (we must work on this defect)
  • where clouds have covered a place during the whole month of July or August. These pixels are flagged as invalid in the products (but not on the mosaic).
  • where clouds or shadows were not properly detected by MAJA
  • at the edges of the Sentinel-2 swath. For the first time, in October, a swath edge is clearly visible near Cambrai. The area must have been quite cloudy, and we observe here a greener part on the right, observed later in October than the browner part on the left. The only way to correct this kind of artefact while keeping a physical meaning to the reflectances would be to improve the Sentinel-2 revisit time
  • some tile edges in July, due to the fact that Level 3A products were not all generated for the 15th of July, but for dates between the 8th and the 26th. This has been corrected for the following months


[MUSCATE news] A very productive summer


Updated on October 4th, 2018

This summer, while most of us were getting sunburnt on the beach, being devoured by mosquitoes, getting sore muscles from hiking, or taking long naps to recover from tiring nights, MUSCATE's production and distribution teams were lucky enough to enjoy efficient air conditioning, comfortable chairs, fast computers and access to the best canteen in the world, at CNES. It is therefore not surprising that production has made great progress, but we can nevertheless thank them warmly, because the results are impressive.


Improved production performance

Theia's L2A counter reached 100 000 images on August 12th


First of all, the anomaly that had disrupted production at the end of spring has been solved. It was a memory capacity problem on the machine that hosts MUSCATE's internal catalogue. Little by little, the memory requirement exceeded the planned volume, and performance degraded until the system crashed, almost every day. A few adjustments and a doubling of the memory were enough to make the problem disappear.

As a result, since early July, MUSCATE has had only two short interruptions, due to a maintenance phase of the CNES computing centre. As can be seen in the figure below, the orange curve, which averages production over one month, has never been so high since production started.


Number of L2A products produced each day by MUSCATE (after removing the products with more than 90% cloud cover)
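
For readers who want to build a similar curve from their own production logs, here is a minimal Python sketch: it drops the products with more than 90% of clouds, counts the remaining ones per day, and computes a trailing average. The input layout (a list of date and cloud percentage pairs), the function names and the example values are assumptions made for this illustration; this is not the script used for the figure.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(products, max_cloud_percent=90.0):
    """Count products per day, dropping those above the cloud threshold."""
    return Counter(day for day, cloud in products if cloud <= max_cloud_percent)


def daily_series(counts, start, end):
    """Gap-free list with one count per calendar day from start to end."""
    n_days = (end - start).days + 1
    return [counts.get(start + timedelta(days=i), 0) for i in range(n_days)]


def trailing_mean(series, window):
    """Mean over the last `window` days (fewer at the start of the series)."""
    means = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        means.append(sum(chunk) / len(chunk))
    return means


# Hypothetical log entries: (production date, cloud percentage).
products = [
    (date(2018, 7, 1), 10.0),
    (date(2018, 7, 1), 95.0),  # dropped: more than 90% of clouds
    (date(2018, 7, 2), 40.0),
    (date(2018, 7, 2), 5.0),
    (date(2018, 7, 4), 90.0),  # kept: not more than 90%
]

series = daily_series(daily_counts(products), date(2018, 7, 1), date(2018, 7, 4))
means = trailing_mean(series, window=3)
print(series, means)
```

For a monthly average such as the orange curve, the window would be set to about 30 days.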