The THEIA land data center just started processing LANDSAT 8 Level 2A products

Mosaic of LANDSAT (here, 5 & 7) data produced at CESBIO, from both ESA and USGS data. These data are cut into 110 x 110 km² tiles, and each tile has a 10 km overlap with its neighbors. For each tile, every LANDSAT acquisition with at least a small patch of clear sky is provided.

At the beginning of the week, the MUSCATE prototype processing center of THEIA started processing the LANDSAT 8 data available in France. The processing started with the 2013 data, which will be transformed into Level 2A products. As for SPOT4 (Take5), the Level 2A products are expressed in surface reflectance after atmospheric correction, and are provided with a cloud mask, a cloud shadow mask, and a water and snow mask. In the case of LANDSAT 8, however, the products are split into tiles on a 100 x 100 km² grid, and each tile covers 110 x 110 km² to allow an overlap of 10 km between tiles.
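The tiling arithmetic can be sketched in a few lines. This is only an illustration: the 100 km grid step and the 110 km tile size come from the text, but the assumption that each tile starts on a grid node and extends 110 km from it, as well as the function names, are ours and not the MUSCATE convention.

```python
GRID_STEP_KM = 100  # spacing of the tiling grid (from the text)
TILE_SIZE_KM = 110  # size of each tile (from the text)


def tile_bounds(col, row):
    """Return (x_min, y_min, x_max, y_max) in km for the tile at grid position (col, row).

    Assumption: the tile starts on the grid node and extends 110 km from it.
    """
    x_min = col * GRID_STEP_KM
    y_min = row * GRID_STEP_KM
    return (x_min, y_min, x_min + TILE_SIZE_KM, y_min + TILE_SIZE_KM)


def overlap_km(bounds_a, bounds_b):
    """Width in km of the overlap of two tiles along the x axis (0 if they do not overlap)."""
    return max(0, min(bounds_a[2], bounds_b[2]) - max(bounds_a[0], bounds_b[0]))


# Two neighbouring tiles share a 10 km wide strip.
print(overlap_km(tile_bounds(0, 0), tile_bounds(1, 0)))
```

Under this assumption, any two adjacent tiles overlap by the difference between the tile size and the grid step.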

 

Landsat 8 data should progressively appear in THEIA's catalog within a month (but this is a risky assertion, as this is the first time we run this production and surprises may arise, although we spent a lot of time on validation). More details are available here.

 

 

Comparison of Level 3A compositing methods


As said in a previous post, we are testing various methods of Level 3A production, using SPOT4 (Take 5) data. The Theia Land Data Center will then use these methods to process Sentinel-2 data. In case you did not click on the link above, let's recall that the Level 3A products are monthly composites of cloud free reflectances. For each pixel, our method computes the weighted average of the reflectances of the dates when the pixel is cloud free. For more details, you will need to follow this link.

 

The work of Mohamed Kadiri at CESBIO, which is funded by the CNES budget for Theia, first addressed the definition of quality indexes for composite products (for more details, may I suggest that you follow this link?). This work showed that our product performs well, but we knew someone would ask us to compare it to the classical methods for Level 3A products.

 

Therefore, we compared our product with the famous NDVI Maximum Value Composite (NDVI MVC), developed by our remote sensing ancestors, and used since the most remote antiquity to process AVHRR time series. This method consists in using, for each pixel of the Level 3A product, the reflectances of the date which has the greatest NDVI. Why? Mostly because the NDVI of a cloud is very low, often negative, and therefore this method tends to select cloud free pixels. The NDVI MVC comes from a time when the cloud masks were not very accurate.
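For readers who have never met the NDVI MVC, here is a minimal sketch of the selection rule for a single pixel. The reflectance values and the dictionary keys ("date", "red", "nir") are invented for the illustration; they are not SPOT4 (Take5) measurements.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index of one observation."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def ndvi_mvc(observations):
    """Return the observation with the greatest NDVI among those of one pixel."""
    return max(observations, key=lambda obs: ndvi(obs["red"], obs["nir"]))


# Illustrative values only: one cloudy date and two clear dates for the same pixel.
pixel_series = [
    {"date": "2013-04-13", "red": 0.45, "nir": 0.40},  # cloud: bright, NDVI below zero
    {"date": "2013-04-18", "red": 0.06, "nir": 0.30},
    {"date": "2013-04-23", "red": 0.05, "nir": 0.35},
]
best = ndvi_mvc(pixel_series)
print(best["date"])
```

Note that the rule works without any cloud mask, which is its historical appeal, but it also means that two neighbouring pixels may well pick different dates.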

 

Examples of a monthly synthesis obtained with the NDVI MVC method (left) and with the weighted average method (right).

This post uses the SPOT4-Take5 data to compare the performances obtained on the Versailles site, with the NDVI MVC method on the left, and the weighted average on the right. One can clearly see, on the left, the presence of artefacts made of whiter and darker dots, which are not seen on the image on the right. These artefacts appear when the selected date changes from one pixel to the next. They are much less visible on the vegetation covered plots because, for this composite obtained in spring, the vegetation grows quickly, and all the pixels come from the last cloud free date of the synthesis.

 

If we have a look at our quality indicators, which were described in our previous post about composite products, it is obvious that the performances obtained by the weighted average method are much better than those of the NDVI MVC method, both as regards the similarity to the central date image of the Level 3A product (in yellow for the 70% best pixels, and in green for the 95% best pixels), and even more so as regards the amplitude of artefacts (in blue). The abscissa of the plot is half the number of days used in the synthesis, and our recommended value is 21.

 

 

Left: NDVI Maximum Value Composite. Right: Weighted Average Composite.

Land cover maps quickly obtained using SPOT4 (Take5) data for the Sudmipy site


At CESBIO, we are developing land cover map production techniques for high resolution image time series, similar to those which will soon be provided by Venµs and Sentinel-2. As soon as the SPOT4 (Take5) data were available over our study area (Sudmipy site in South West France), we decided to assess our processing chains on those data sets. The first results were quickly presented during the Take5 users' meeting which was held last October.

1. Experiments

In this post we describe the work carried out in order to produce these first land cover classifications with the SPOT4 (Take5) Sudmipy images (East and West areas), and we compare the results obtained over the region common to these two areas.

 

Prior to the work presented here, we organized a field data collection campaign which was synchronous with the satellite acquisitions. These data are needed to train the classifier and validate the classification. The field work was conducted in 3 study areas (figure 1) which were visited 6 times between February and September 2013, and covered a total of 2000 agricultural plots. This allowed us to monitor the crop cycle of Winter crops, Summer crops and their irrigation attribute, grasslands, forests and built-up areas. The final nomenclature consists of 16 land cover classes.

 

The goal was to assess the results of a classification using field data that are limited in quantity but also in spatial spread. We also wanted to check whether the East and West SPOT4 (Take5) tracks could be merged. To this end, we used the field data collected on the common area of the two tracks (in pink on the figure) and 5 Level 2A images for each track, acquired with a one day shift.

 

West        East
2013-02-16  2013-02-17
2013-02-21  2013-02-22
2013-03-03  2013-03-04
2013-04-17  2013-04-13
2013-06-06  2013-06-07

2. Results

The first results of supervised SVM classification (using the ORFEO Toolbox) can be considered as very promising, since more than 90% of the pixels are correctly classified for both the East and the West tracks, and since the continuity between the two swaths is excellent. Some confusions can be observed between bare soils or mineral surfaces and Summer crops, but these errors should be reduced by using LANDSAT 8 images acquired during the Summer, when Summer crops will develop.

Merging of the land cover maps obtained on the East and West Sudmipy tracks (the cloudy areas were cropped out). The comparison against the ground truth (the black dots on the map to the South-West of Toulouse) results in a kappa coefficient of 0.89 for the West and 0.92 for the East.
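As a reminder of what the kappa coefficient measures, here is a small sketch that computes Cohen's kappa from a confusion matrix, that is the agreement between the classification and the ground truth beyond what chance alone would give. The 2 x 2 matrix is a toy example, not the Sudmipy confusion matrix.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Toy example: 90% of the samples are correctly classified, two balanced classes.
toy_matrix = [[45, 5], [5, 45]]
kappa = cohen_kappa(toy_matrix)
print(round(kappa, 3))
```

With two balanced classes, chance agreement is 50%, which is why 90% of correctly classified samples only gives a kappa of 0.8 in this toy case.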

 

Left: West. Right: East.

This zoom compares the results obtained on the common area of the two tracks (West to the left and East to the right). The two classifications were obtained independently, using the same method and the same training data, but with images acquired at different dates and with different viewing angles. The main errors are maize plots labeled as bare soil, which is not surprising, since this crop was just emerging when the last image was acquired. There are also confusions between wheat and barley, but even in the field, one has to be a specialist to tell them apart.


3. Feedback and retrospective

After performing these experiments, we were very satisfied with how operational our tools are. Given the data volume to be processed (about 10 GB of images), we could have expected very long computation times or memory limitations in the software used (after all, we are just scientists in a lab!). You will not be surprised to know that our processing chains are based on the Orfeo Toolbox. More precisely, the core of the chain uses the applications provided with OTB for supervised training and image classification. One just has to build a multi-channel image where each channel is a classification feature (reflectances, NDVI, etc.) and provide vector data (a shapefile, for instance) containing the training (and validation) data. Then, a command line for the training (see the end of this post) and another one for the classification (idem) are enough.

Computation times are quite reasonable: several minutes for the training and several tens of minutes for the classification. One big advantage of OTB applications is that they automatically use all the available processors (our server has 24 cores, but any off the shelf PC has between 4 and 12 cores nowadays!).

We are going to continue using these data, since we have other field data which are better spread over the area. This should allow us to obtain even better results. We will also use the Summer LANDSAT 8 images in order to avoid the above-mentioned errors on Summer crops.

4. Command line examples

We start by building a multi-channel image with the SPOT4 (Take5) data, not accounting for the cloud masks in this example :

otbcli_ConcatenateImages -il SPOT4_HRVIR_XS_20130217_N1_TUILE_CSudmipyE.TIF \
  SPOT4_HRVIR_XS_20130222_N1_TUILE_CSudmipyE.TIF \
  SPOT4_HRVIR_XS_20130304_N1_TUILE_CSudmipyE.TIF \
  SPOT4_HRVIR_XS_20130413_N1_TUILE_CSudmipyE.TIF \
  SPOT4_HRVIR_XS_20130607_N1_TUILE_CSudmipyE.TIF \
  -out otbConcatImg_Spot4_Take5_5dat2013.tif

We compute the statistics of the images in order to normalize the features :

otbcli_ComputeImagesStatistics -il otbConcatImg_Spot4_Take5_5dat2013.tif \
  -out EstimateImageStatistics_Take5_5dat2013.xml

We train a SVM with an RBF (Gaussian) kernel :

otbcli_TrainSVMImagesClassifier -io.il otbConcatImg_Spot4_Take5_5dat2013.tif \
  -io.vd DT2013_Take5_CNES_1002_Erod_Perm_Dissolve16cl.shp -sample.vfn "Class" \
  -io.imstat EstimateImageStatistics_Take5_5dat2013.xml -svm.opt 1 -svm.k rbf \
  -io.out svmModel_Take5Est_5dat2013_train6.svm

And voilà, we perform the classification, using the model file written by the training step :

otbcli_ImageSVMClassifier -in otbConcatImg_Spot4_Take5_5dat2013.tif \
  -mask EmpriseTake5_CnesAll.tif -imstat EstimateImageStatistics_Take5_5dat2013.xml \
  -svm svmModel_Take5Est_5dat2013_train6.svm -out ClasSVMTake5_5dat_16cl_6.tif

Systematic or On-Demand acquisitions ?

Example of Pleiades (CNES) acquisition plan. Among the sites requested, only those linked to the track by a yellow line are observed from this overpass.


This post is an old one (last year), but I had not translated it.

 

The satellites observing the Earth at a high resolution may be divided in two categories according to their programming mode :

  • Satellites with On Demand Acquisition (SODA) :

Users ask the provider to program an image above their site. The provider collects all demands and optimises the acquisition plan so that a maximum of user requests are satisfied. The provider often charges an extra cost if the user needs an image at a precise date, and in zones where satellite image demand is high, a user is never sure to get the image he requested, unless he pays for a higher priority.

SPOT, Pleiades, Ikonos, Quickbird, Formosat-2, Rapid Eye and most radar systems are of  "SODA" type.

 

  • Systematic Acquisition Satellites (SAS) :

The image provider defines the zones to observe at the beginning of the satellite mission, and these zones are observed at each overpass of the satellite. In some cases (LANDSAT, Sentinel-2), the acquisition zones cover all lands, while in other cases (Venµs, SPOT4 (Take5)), the acquisitions may only cover a few preselected sites.

 

Usually, the SODA provide a better spatial resolution, while the SAS provide a better temporal resolution. The SODA images must generally be purchased, since the resource is limited, while the SAS images are usually free of charge. There were periods when LANDSAT images were sold, but they encountered little commercial success, while their success is huge now that they are free of charge. Finally, the SODA are best suited to applications for which the acquisition date is not very important and for which a high resolution is essential, for instance urban studies or monitoring of ecological corridors, while the SAS are better suited to surfaces which evolve quickly, such as natural surfaces or farm lands, and to the automatic production of detailed land cover maps.

Unlike users in the US who, thanks to LANDSAT, have long been working with SAS images, users in Europe are much more used to SODA images such as the ones provided by SPOT. This situation should change radically, first with LANDSAT 8, which is much easier to access in Europe than LANDSAT 5, but above all with Sentinel-2. The adaptation to this kind of data will however require a lot of work and some time. New processing methods and new applications must be developed, which was one of the aims of the SPOT4 (Take5) data set.

 

The V2.0 of SPOT4 (Take5) data set is available.

Voilà! The new version (V2.0) of the SPOT4-Take5 data set is available, for the 45 sites. I would like to thank the development and processing teams of the MUSCATE center at CNES, who work for THEIA, the image quality teams at CNES (SI/QI and SI/MO), and of course Mireille Huc at CESBIO, for the production of this new version, which in the end required a lot of work.

The product version number is not included in the filenames, but you can recognise a V2.0 product by looking into the xml metadata file :

<METADATA>
  <HEADER>
    <VERSION>2.0</VERSION>
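If you have many products to check, the version can be read automatically. The sketch below assumes that METADATA is the root element of a well-formed file, with the HEADER/VERSION structure of the excerpt above; the sample string is a closed-up copy of that excerpt, not a real product file, and the function name is ours.

```python
import xml.etree.ElementTree as ET


def product_version(xml_text):
    """Return the text of METADATA/HEADER/VERSION, or None if the element is missing."""
    root = ET.fromstring(xml_text)
    node = root.find("./HEADER/VERSION")
    if node is None or node.text is None:
        return None
    return node.text.strip()


# Closed-up copy of the excerpt above, for illustration only.
sample = "<METADATA><HEADER><VERSION>2.0</VERSION></HEADER></METADATA>"
print(product_version(sample) == "2.0")
```

For a file on disk, the same function can be fed with the content of the xml metadata file read as text.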

 

This reprocessing brings the following new features :

    Quicklooks are now provided with the images. The clouds are circled in green, the shadows in black, water in blue and snow in pink.

  • We provide quicklooks on which you can see the cloud and shadow masks
  • We enhanced the quality of the ortho-rectification :
    • by changing the reference ortho-image for the sites in France (GEOSUD, processing done by the French institute for geography, IGN)
    • by replacing the LANDSAT 5 ortho-images by LANDSAT 8 images for most other sites outside France. LANDSAT 8 geometric performances are enhanced compared to LANDSAT 5.
    • however, for a few sites (Borneo, Gabon, Congo (1,CNES), CCRS, Cameroon), no clear LANDSAT 8 image was available yet and we had to keep the LANDSAT 5 reference.
      • It's not too bad for Congo, CCRS and Cameroon, as the LANDSAT 5 references were quite good. For Gabon, we used a reference made with the cloud free image obtained with SPOT4-Take5. Finally, only the Borneo site is left with quite bad Level 1C products, with large registration errors (I am sorry Jukka)
      • A large enhancement of the performances has been observed for Sumatra, Gabon and Congo (2,ESA), for which the first version was quite bad.
  • SPOT4 radiometric calibration updated
    • At the end of SPOT4's life, my CNES colleagues updated its absolute calibration. SPOT calibration is obtained using desert sites, using another satellite as reference. Up to now, it was POLDER, but now it is MERIS/ENVISAT. Moreover, the calibration coefficients we used in the first version had been extrapolated from older measurements, while recent measurements have now been used. The differences are not too big, except for the near infra-red band, which varied by 4%.
  • The level 2A have been reprocessed with a new version of the aerosol model, with larger aerosols. The previous model had been tuned for sites in France, but we found that  the larger particles fitted better the in situ data on all the sites.
  • For users of mountain sites, we added a few flags about the correction of terrain effects. If the slope is in the shade, or nearly in the shade, the correction we have to do is infinite ! We limited the value of the correction and flagged the pixels for which we had to limit it in the .DIV files.
  • And at last, the Maricopa site was finally processed. This site was acquired under two angles, one from the East, one from the West. It has therefore been observed twice every 5 days under different viewing angles. Such a case was not anticipated in our prototype, and we had to correct it. The site has been divided into 2 sites, Maricopa_J1 for observations from the West, and Maricopa_J5 for observations from the East. This site, which benefits from Arizona blue skies, is a very interesting one for remote sensing geeks, as it combines multi-angular and multi-temporal observations at constant angles!

SPOT4 (Take5) reprocessing

The SPOT4 (Take5) reprocessing nears its end at THEIA. All the Level 2A products have been produced, except for 4 sites (Provence, Alpes, Sudmipy E and W), for which the processing is on-going. We are now transferring the data to the distribution server. It should take just a few days.

 


 

 

The directional effects, how they work

Riddle : from which of these two balloons was the picture taken ? The solution is at the end of the post.

Among Sentinel-2, LANDSAT, Venµs or SPOT4 (Take 5) features, there is one which is frequently forgotten: it is the possibility to observe all lands every 5th day under constant viewing angles. This way of observing limits the directional effects which are one of the most perturbing effects for reflectance time series. Yet, these effects are not always known by the users of time series of remote sensing images.

The way directional effects modify the reflectances is highly visible in the pictures below, which were taken from a helicopter with the same parameters except for the viewing angles. The image on the left was taken with the back to the sun, in the backscattering direction, while the picture on the right was taken at 90 degrees from that direction.

 

Left: conifer forest observed from a helicopter, in the backscattering direction (the helicopter shadow is visible). Note that the nearer to the helicopter shadow, the higher the reflectance, as the tree shadows are no longer visible. Right: conifer forest observed from a helicopter, at 90 degrees from the backscattering direction. Reflectance is much lower since the shadows cast by the trees are visible, as well as the shadows cast by the needles on the needles below (Pictures F.M. Bréon).

 

Depending on the observation angles and the solar angles, the reflectances measured by a satellite will change a lot, and we can therefore talk of "reflectance anisotropy", even if "directional effects" is the most frequently used term. The way they change depends on the surface type : a flat sand desert will have little anisotropy (see next figure on the left), and the surface is said to be "quasi-Lambertian". On the contrary, a calm water surface will behave as a mirror, and will exhibit a very strong reflectance peak in the specular direction, which is symmetric to the sun direction with respect to the vertical. Finally, vegetation always exhibits a reflectance peak in the backscattering direction, for which the solar and viewing angles are quasi identical (see the plot below, on the right). On this plot, a reflectance variation greater than 30% can be observed within a couple of degrees. This phenomenon is called the Hot Spot, and it is due to the fact that from this direction, one can only see the sunlit faces. Finally, the plot shows that for an angle variation of 40 degrees, the surface reflectance may change by a factor of two. The directional effects should thus not be neglected.

 

Reflectances observed by the POLDER instrument, as a function of the phase angle (angular distance to the backscattering direction), for a desert (left) and for a cropland (right). In red, the Near Infrared band; in green, the red band.
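The phase angle used as abscissa in these plots can be computed from the solar and viewing angles with the usual spherical trigonometry formula. In the sketch below, we assume that both azimuths are given as the direction, seen from the target, toward the sun and toward the sensor, so that the phase angle is zero in the backscattering direction; the function name and this convention are ours.

```python
import math


def phase_angle(sun_zenith, sun_azimuth, view_zenith, view_azimuth):
    """Angle in degrees between the direction to the sun and the direction to the sensor.

    All inputs are in degrees. Azimuths point from the target toward the sun
    and toward the sensor (assumed convention).
    """
    theta_s = math.radians(sun_zenith)
    theta_v = math.radians(view_zenith)
    delta_phi = math.radians(sun_azimuth - view_azimuth)
    cos_xi = (math.cos(theta_s) * math.cos(theta_v)
              + math.sin(theta_s) * math.sin(theta_v) * math.cos(delta_phi))
    # Clamp to [-1, 1] to protect acos from rounding errors.
    cos_xi = max(-1.0, min(1.0, cos_xi))
    return math.degrees(math.acos(cos_xi))


# Nadir viewing with the sun at 30 degrees from the vertical: phase angle of 30 degrees.
print(round(phase_angle(30.0, 150.0, 0.0, 0.0), 3))
```

When the sensor looks from exactly the same direction as the sun, the function returns zero, which is the hot spot geometry.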

 

The wide field of view instruments, such as MODIS, SPOT/VEGETATION, MERIS, VIIRS or Proba-V, and the high resolution ones with a pointing capability, such as SPOT, Rapid-Eye or Pleiades, deliver time series acquired under changing angles. Their reflectance time series are thus very noisy if no correction is attempted. NDVI time series are less noisy, because the red and Near Infrared spectral bands exhibit similar variations. Several correction methods were implemented, but their results are far from perfect.
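To see why the NDVI is less sensitive to directional effects, here is a toy check. It relies on an idealization which is ours: the directional effect is modelled as the same multiplicative factor applied to both bands, while in reality the variations of the two bands are only similar, so the cancellation is partial. The reflectance values are invented.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


# Invented reflectances for a vegetated pixel, and a hypothetical 30% directional increase.
red, nir = 0.05, 0.40
factor = 1.3

ndvi_reference = ndvi(red, nir)
ndvi_perturbed = ndvi(factor * red, factor * nir)

# The reflectances change by 30%, but the ratio cancels the common factor.
print(abs(ndvi_perturbed - ndvi_reference) < 1e-12)
```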

 

In order to avoid all these troubles, my CESBIO colleagues F. Cabot and G. Dedieu invented the RHEA concept, which consists in putting the satellite on an orbit with a short repeat cycle (1 to 5 days), in order to observe a given site under constant angles. The VENµS satellite stems from this concept, and Sentinel-2 and SPOT4 (Take5) as well. Formosat-2 also has a repeat cycle of one day, but this feature is mainly due to the fact that the island of Taiwan can be observed every day from that orbit. Regarding LANDSAT, I do not know if its designers wanted to minimize directional effects, but of course their choice was a good one.

Thanks to the satellites that observe under constant viewing angles, the noise on time series is really decreased, as shown on the plot below, which gives the surface reflectances  of a wheat pixel (24*24 m²) in Morocco, observed by Formosat-2 during a whole growing season.

Surface reflectances as a function of time for a wheat pixel in Morocco.

Finally, it is the hot spot phenomenon which gives the solution to the riddle above, since the balloon on the left is surrounded by a brighter halo. It means that the direction around the left balloon is the backscattering direction, and therefore that the observer was on this balloon. This is also proven by the complete photograph (taken by A. Deramecourt, a CNES colleague). I think my colleague saw some poetry in the two balloons hugging, which I hope you can still appreciate, while, because of my professional bias, I only see a mere hot spot.

 

 

SPOT4 (Take5) reprocessing on its way

The SPOT4 (Take5) reprocessing is on its way in the MUSCATE processing center of THEIA, at CNES. The Level 1C products have been produced, and the Level 2A processing starts today. We lost a week because of a last-minute bug in the Level 2 processing, found when processing some Formosat-2 data; although it was not likely to occur with SPOT4 (Take5) data, we decided to correct it before launching the reprocessing. The Level 2A processing should end within a week, and distribution will start the week after.

Sentinel-2 Agriculture

We are very proud to tell that our consortium was selected by ESA for the S2-Agri call for tender.

 

Our consortium is built from the following partners :

 

The S2-Agri project, whose website was just created, aims at showing, through a large scale project, the capabilities of the Sentinel-2 mission for agriculture monitoring. After consulting several "champion" users, it will deliver an open source processing software that will provide the following types of products :

 

  • periodic syntheses of surface reflectances (Level 3A products)
  • a crop mask
  • a map of the main crops (see the image below, and the post on land cover maps)
  • some vegetation indices or biophysical variables

Example of a land cover map automatically generated by a software developed by Isabel Rodes (CESBIO), from LANDSAT 5 and 7 data in 2010. This land cover map was produced by I. Rodes in the framework of a methodological PhD thesis, and it is not as specialized for agriculture as the ones that will be produced for the S2-Agri project. It nevertheless already provides 3 agriculture classes : winter crops, summer crops, and meadows.

 

This project, which started on January 31st, 2014, will be carried out in three phases, each with an approximate duration of 1 year.

  1. A test phase, to develop, tune and validate methods and products, on 13 sites scattered around the world. This phase will mainly rely on SPOT4-Take5 data, complemented by LANDSAT 8 or RapidEye images. Several sites will be selected within the JECAM network.
  2. A development phase, during which the production system will be built, and prototype products will be issued and tested.
  3. A demonstration phase, based on the first year of Sentinel-2 acquisitions, which will cover 3 entire countries (> 500 000 km²) plus 5 sites of 300x300 km². At least 2 of the selected countries are in Africa.

At the end of the project, the production system will be released as open source software by ESA, and given the amount of work, we will have earned dark circles around our eyes!

 

The level 3A products

Among the products prepared to be processed by the THEIA land data center, the level 3A product was not yet described in this blog. The level 3A products provide a monthly synthesis of the level 2A. These products should be very useful for the following reasons :

  • The level 3A, produced once a month, uses up to six times less volume than the level 2A products acquired during a month.
  • The level 3A provides a regular time sampling of the reflectance variations, while the level 2A sampling is dependent on the cloud cover.
  • Several processing methods and applications are hindered by the data gaps due to cloud cover. The level 3A product aims at minimizing the residual gaps.

 

Thanks to the SPOT4 (Take5) data set, we were able to try and test several methods to produce level 3A products on various types of landscapes and climates. This work, supervised by Mireille Huc and myself, is performed by Mohamed Kadiri, at CESBIO, and is funded by the CNES budget of the THEIA Land Data Center. Our method consists in computing, for each pixel, a weighted average of the surface reflectances of the cloud free observations obtained within N days of the central date T0 of the level 3A product. For instance, the example below was obtained with N=21, for the 15th of each month. As a result, the level 2A products used in the average for the level 3A product dated March 15th were acquired from February 24th to April 4th.

 

 

The weighted average gives more weight to

  • the cloud free images
  • the pixels which are far from clouds
  • the images with a low aerosol content
  • the images acquired near the level 3A product date.
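To fix ideas, here is a minimal sketch of such a weighted average for one pixel. Only the principle comes from the description above: the four weight formulas, the dictionary keys and the numerical values are hypothetical choices made for the illustration, and they are not the weights tuned at CESBIO.

```python
from datetime import date


def composite_pixel(observations, central_date, half_period_days):
    """Weighted average of the cloud free reflectances within N days of the central date.

    Returns None when no usable observation exists (residual data gap).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for obs in observations:
        offset = abs((obs["date"] - central_date).days)
        if obs["cloudy"] or offset > half_period_days:
            continue
        # Hypothetical weight factors, one per criterion of the list above.
        w_image = obs["clear_fraction"]  # cloud free images
        w_distance = obs["cloud_distance_km"] / (obs["cloud_distance_km"] + 1.0)  # far from clouds
        w_aerosol = 1.0 / (1.0 + obs["aot"])  # low aerosol content
        w_date = 1.0 - offset / (half_period_days + 1.0)  # close to the central date
        weight = w_image * w_distance * w_aerosol * w_date
        weighted_sum += weight * obs["reflectance"]
        weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Invented observations of one pixel around T0 = March 15th, with N = 21 days.
t0 = date(2013, 3, 15)
clear = {"clear_fraction": 0.9, "cloud_distance_km": 5.0, "aot": 0.1}
series = [
    dict(clear, date=date(2013, 3, 10), cloudy=False, reflectance=0.10),
    dict(clear, date=date(2013, 3, 15), cloudy=True, reflectance=0.80),   # cloudy: ignored
    dict(clear, date=date(2013, 3, 20), cloudy=False, reflectance=0.20),
    dict(clear, date=date(2013, 1, 1), cloudy=False, reflectance=0.50),   # outside the window
]
value = composite_pixel(series, t0, 21)
print(round(value, 3))
```

In this toy case, the two usable dates are symmetric around T0 and share the same quality, so they get the same weight and the result is their plain mean.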

The values of the weights and of the duration N have a large influence on the product data quality. To tune their values, we set up three quality criteria :

  • The percentage of residual data gaps for which all the observations were cloudy
  • The difference of the level 3A reflectances with the values of a selected level 2A product acquired near the central date T0.
  • A measurement of the artefacts standard deviation. The artefacts appear near the borders of data gaps that affect one of the dates used in the level 3A synthesis.

 

For instance, here are the performances obtained on the Versailles site, which was heavily clouded in the spring of 2013. For this site, one can note that the residual gap percentage is very low despite the bad weather, confirming that Sentinel-2 should be able to provide cloud free Level 3A products each month. For this site, the optimal duration of the synthesis is somewhere between 2*21 and 2*28 days.

 

Performances obtained for the Versailles SPOT4 (Take5) site, for several values of the half-period N. In red, the residual percentage of data gaps (scale on the right); in yellow and green, the maximum value of the difference between the level 3A and the central level 2A, for the best 70% and 95% of pixels respectively. In blue, the residual error standard deviation.
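The first two criteria are easy to sketch. The code below is our reading of the caption, not the CESBIO implementation: the gap percentage counts the pixels left empty, and the "maximum difference for the best X% of pixels" is taken as the corresponding percentile of the absolute differences to the central level 2A image. Images are flattened into plain lists, with None marking a gap, and the values are invented.

```python
import math


def residual_gap_percentage(composite):
    """Percentage of pixels of the composite that are still gaps (None)."""
    return 100.0 * sum(value is None for value in composite) / len(composite)


def max_difference_for_best_fraction(composite, reference, fraction):
    """Largest absolute difference among the given fraction of best matching pixels."""
    differences = sorted(
        abs(c - r) for c, r in zip(composite, reference) if c is not None and r is not None
    )
    if not differences:
        return None
    count = max(1, math.ceil(fraction * len(differences)))
    return differences[count - 1]


# Invented values: a 4 pixel composite with one gap, and the central level 2A image.
level3a = [0.10, 0.20, None, 0.30]
central_level2a = [0.10, 0.25, 0.50, 0.20]

print(residual_gap_percentage(level3a))
print(max_difference_for_best_fraction(level3a, central_level2a, 0.5))
```

The third criterion, the standard deviation of the artefacts near the borders of data gaps, needs the gap geometry and is not sketched here.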

 

 

 

For Sentinel-2, the level 3A will have to include a correction for directional effects, in order to use in the same level 3A product, the data acquired from different satellite tracks, from different viewing angles. Finally, as an option, we might include a gap-filling method to fill the residual gaps.

In short, we still have work to do. A comparison with the classical NDVI Maximum Value Composite is provided in this post.