The iota2 Land cover processor has processed some Sentinel-2 data


You have already heard about the iota2 processor, and you probably know that it can process LANDSAT 8 time series and deliver land cover maps for whole countries. Over the last few days, Arthur Vincent completed the code that allows processing Sentinel-2 time series. Even if atmospherically corrected Sentinel-2 data are not yet available over the whole of France, we used the demonstration products delivered by Theia to test our processor.


Everything seems to work fine, and the 10 m resolution of Sentinel-2 reveals many more details. The attached images show two extracts near Avignon, in Provence, which illustrate the differences between Landsat 8 and Sentinel-2. Please look only at the level of detail, and not at the differences in terms of classes. The two maps were produced using different time periods (limited to winter and the beginning of spring for Sentinel-2), and the learning databases are also different. Please don't draw conclusions too quickly about the thematic quality of the maps.


The first extract shows a natural vegetation zone, with some farmland (top: LANDSAT 8, bottom: Sentinel-2).



New version of fully automatic land cover map of France for 2014 from LANDSAT8


Over the last months, we worked a lot on our method for land cover map production. Three main topics (1) were studied with Arthur Vincent and David Morin at CESBIO:

  1. porting and validating the iota2 processor on the CNES High Performance Computing facilities (HPC);
  2. enhancing the method for reference data preparation. Reference data are used both for training and validation;
  3. developing a stratification method which allows classifiers to be trained and applied per eco-climatic area, for instance.

Using all these new features, we produced a lot (really a lot!) of maps for continental France. We just released the following 4 examples, produced using all the available LANDSAT8 data in 2014:

  • regarding reference data :
    1. including 4 classes of artificial surfaces: continuous urban, discontinuous urban, road surfaces, and commercial and industrial areas (2);
    2. only one artificial class that gathers the 4 above (3);
  • regarding the stratification method :
    1. using eco-climatic areas (4);
    2. without stratification, but using a fusion of several classifiers trained over different sets of tiles.
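The post does not say how the classifiers trained over different sets of tiles are fused. As an illustration only, here is a minimal Python sketch that assumes a simple per-pixel majority vote. The function name and the tiny label maps are hypothetical and do not come from iota2.

```python
from collections import Counter


def fuse_by_majority_vote(label_maps):
    """Fuse several classification maps (lists of rows of labels), pixel by pixel.

    Assumption: the fusion is a plain majority vote. On a tie, the label
    that appears first in the order of the input maps is kept.
    """
    fused = []
    for rows in zip(*label_maps):            # the same row in every map
        fused_row = []
        for votes in zip(*rows):             # the same pixel in every map
            winner, _count = Counter(votes).most_common(1)[0]
            fused_row.append(winner)
        fused.append(fused_row)
    return fused


# Three hypothetical 2 x 2 maps produced by three classifiers.
maps = [
    [["vineyard", "urban"], ["forest", "water"]],
    [["vineyard", "vineyard"], ["forest", "water"]],
    [["orchard", "urban"], ["forest", "crop"]],
]
fused = fuse_by_majority_vote(maps)
print(fused)
```

A vote weighted by the confidence of each classifier would be another possible design. The plain vote is shown here only because it is the simplest to read.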
The pink urban spot, in the center of the brown zone, is the village of Châteauneuf-du-Pape, which is famous for its wine, and the brown color is the vineyard class. Validated!


3 years of Landsat L2A data above France available at THEIA

The MUSCATE processing centre at CNES, which belongs to the French THEIA land data centre, produces and distributes all the Landsat 8 data acquired above France at Level 2A. Level 2A products contain surface reflectances after atmospheric correction, and a very good cloud mask. The first routine LANDSAT 8 data were acquired on April 13th 2013, and a few days ago THEIA released the data acquired in the first fortnight of April 2015. Three full years of data are therefore now available, and even 6 years if we account for the LANDSAT 5 and 7 data acquired from 2009 to 2011. And very soon, we will start delivering data above the French overseas regions and communities.

The atmospheric correction and the cloud detection are performed by the MACCS processor, developed at CESBIO. You have probably already seen several examples of cloud masks on this blog. Regarding atmospheric correction, Camille Desjardins, from CNES, validated the aerosol optical thickness (AOT) estimates over all the Aeronet sites in France, over the last 3 years. The results are summarized in the plot on the right. For non-specialists: it is a very good result, similar to the state of the art. And as the AOT estimates condition the quality of the atmospheric correction, you may be confident in its quality.


The diversity of the users of this product shows that it can be used for a wide range of themes and applications. Here is a good example of what can be done automatically with this data set (here, all the data from 2014). Please click on the image for more details.


This land cover map, still in validation, was created by the iota2 software, whose development is led by Jordi Inglada at CESBIO. For more details, see Jordi's presentation at the Living Planet Symposium.


The LANDSAT 8 L2A products have been used by 127 different users since July last year, when the new distribution server was put online. Before that, 98 other users were already using the previous version. Some users probably belong to both lists, but I did not count them.

Number of downloaded products since July 2015: 13074
Number of users: 127


As already observed with SPOT (Take5), few users tell us what they do with the data. This is actually a good sign, because when there is an issue, we receive questions very quickly. But if you are one of the users, please let us know how you use the data and what results you get. Your results would even be welcome on this blog!

Fully automatic land cover map generation at country scale over France


Up to now, no land cover map has been generated annually over France at a decametric resolution. The Corine Land Cover map, which is widely used, is only produced every 5 years, and the 2012 version was issued in 2015. This map is mainly produced using photo-interpretation, and therefore requires a very large amount of work. The very accurate land cover layer from IGN (the French cartographic institute) is updated regularly, region by region, over a cycle of 3 to 4 years, and therefore only provides the perennial land cover information. Two other products exist, the Global Land Cover 30 m produced at LANDSAT resolution and the Copernicus HR layers, but their quality is quite low, for instance over the Landes forest in France.


Thanks to its high resolution observations, Sentinel-2 should enable the automatic generation of land cover maps at country scale. Based on several years of research at CESBIO, our project to automatically produce land cover maps over the whole of France is gaining momentum. Research efforts are being organised within the THEIA Expertise Center on Operational Land Cover.


The first prototype products were computed using the LANDSAT 8 Level 2A data from Theia, pending the availability of a whole year of Sentinel-2 data. The first products cover one third of France, and have 15 to 20 classes depending on the version.


The land cover map processor is based on Orfeo Toolbox applications, orchestrated by Marcela Arias under Jordi Inglada's direction, with large contributions from several CESBIO colleagues for reference data collection or for the development of processors.


Extract of version 1 of the land cover product, computed using LANDSAT8 data from 2013. Click on the image for an interactive display.

Warning :

These prototype products were not created in ideal conditions. The LANDSAT-8 2013 data set only starts in April, as the satellite was not operational before then, so the start of the vegetation cycle was missed. The future operational products will use a complete year of data. Moreover, LANDSAT 8 data do not have the same revisit frequency and resolution as Sentinel-2, and therefore the map quality is not what we expect from Sentinel-2.


However, it is still the same type of data, and their processing needs to overcome the same difficulties. It is therefore a full-scale test of our methodology. And finally, although not as accurate, the maps have the same nature as our final product and should allow users to get a first idea of the products Theia will deliver.


These products contain errors and must only be considered as a draft. We release them in order to get feedback on their quality and usefulness. Please tell us how they might be useful to you. Tell us also if you find them too inaccurate, or if there is something missing.


Prototype product description and download

These products are delivered under the Open Data Commons Attribution License. This license allows you to:

  • share, copy, distribute and use the data
  • create other products based on the data
  • adapt, change and transform the data

with the following constraint: you have to cite the data source (CESBIO) for any use or distribution of the data.


These maps were processed with Landsat-8 Level 2A data (30 m resolution and 7 spectral bands) obtained with a 16-day revisit. The images were acquired between the 1st of April and the 30th of December 2013. Due to cloud cover, every point on the surface was observed between 8 and 25 times, and 16 times on average. Some zones in the Pyrenees were not observed often because of cloud and snow cover, and this causes artefacts on the maps.
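As a rough check of these figures, the short Python sketch below counts how many times a single Landsat 8 track can pass over a point during the period, given the 16-day revisit. The function name is hypothetical, and the best case (a first pass on the very first day) is an assumption.

```python
from datetime import date

REVISIT_DAYS = 16  # Landsat 8 revisit period quoted in the post


def max_passes(first_day, last_day, revisit_days=REVISIT_DAYS):
    """Upper bound on the number of passes of one track over a point.

    Assumes the best case: the first pass happens on first_day.
    """
    period_days = (last_day - first_day).days
    return period_days // revisit_days + 1


passes = max_passes(date(2013, 4, 1), date(2013, 12, 30))
print(passes)  # 18
```

With at most 18 passes per track, the average of 16 observations and the maximum of 25 cannot come from a single track alone. Presumably the most observed points lie where two adjacent tracks overlap, but the post does not say so explicitly.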


Sentinel-2 images, with a better resolution and revisit frequency, should allow the production of far better quality maps.

These maps are made using machine learning based on reference databases, which provide the land cover for a large set of places over France. The following databases were used:

  • The European Common Agriculture Policy data base for the following classes :
    • annual crops (winter and summer)
    • woody crops (orchards, vineyards, olive groves)
    • permanent meadows
    • summer pastures (estives) and moors
  • Corine Land Cover 2012 for the following classes :
    • Dense habitat
    • Industrial or commercial zones
    • Grassland
    • Beaches and dunes
    • Sea and oceans
    • Mineral surfaces
    • Glaciers and permanent snow
  • IGN BD TOPO for the following classes :
    • Water
    • Evergreen forest
    • Deciduous Forest
    • Mixed Forest
    • Woody moor

These databases may be based on various time periods and may be older than the satellite acquisition period. Several versions were released to test slightly different nomenclatures.


The following classes were merged or removed:

  • estives-moors and woody moors were merged
  • mixed forests were removed
  • all classes of orchards and vineyards were merged
  • inland waters and oceans were merged
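Such merging rules amount to a lookup table applied to the reference labels. Here is a minimal Python sketch of that idea. The source class names follow the lists above, but the merged class names ("moors", "woody crops", "water") and the function names are hypothetical.

```python
# Merged class names on the right are illustrative, not the product's own.
# None marks a class that is removed from the nomenclature.
MERGE_RULES = {
    "estives and moors": "moors",
    "Woody moor": "moors",
    "Orchards": "woody crops",
    "Vineyards": "woody crops",
    "Water": "water",
    "Sea and oceans": "water",
    "Mixed Forest": None,
}


def merge_label(label):
    """Return the merged class, the unchanged label, or None if removed."""
    return MERGE_RULES.get(label, label)


def merge_samples(labels):
    """Apply the merging rules and drop samples of removed classes."""
    merged = (merge_label(label) for label in labels)
    return [label for label in merged if label is not None]


samples = ["Water", "Sea and oceans", "Mixed Forest",
           "estives and moors", "Woody moor", "Deciduous Forest"]
print(merge_samples(samples))
```

Keeping the rules in one table makes it easy to release several versions with slightly different nomenclatures, since only the table changes.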

Product statistics and display are available here.

The full resolution product can be downloaded here.

Real-time production of land cover maps without field data from the current period


With the new availability of repetitive, atmospherically corrected image time series over France from the Theia Land Data Centre, it is now possible to consider producing land cover maps automatically and continuously, as new images become available.


In the framework of the SIRHYUS project, a prototype was developed at CESBIO to assess the results of this kind of classification method at the scale of a medium-sized catchment. The study zone is the Fresquel catchment (937 km²), close to the famous medieval city of Carcassonne. The main crops in this catchment are cereals, sunflower and vineyards, along with some corn and rapeseed.
A supervised classification based on Support Vector Machines is used. However, unlike classical supervised methods, the learning database is not derived from field surveys carried out during the period to process. Instead, it is created from observations of previous years and from field data acquired in the past. Such a method has the advantage of needing no field data for the current period, since these data often come too late to allow real-time processing, but it requires a very large data volume covering several years. In a period with an exceptional climate, errors might arise if the training database does not contain the information needed to recognise the crops.

View of the real time land cover processor


To test this approach, we used the Common Agricultural Policy plot database for years 2011 and 2012 over the Fresquel catchment, along with LANDSAT 5/7 time series, which provide the time evolution of reflectances for the plots in the database. Both data sources were used to create the learning database, which was then used to classify the data of 2013, 2014 and 2015 for the Fresquel catchment.


THEIA LANDSAT8 Level 2A products (corrected for atmospheric effects and provided with a cloud mask) are used as input to the processor. Due to the late availability of the Common Agricultural Policy database, we are not able to provide validation figures, but previous campaigns provided Kappa values in the 0.65-0.7 range for the Midi-Pyrénées region.
Of course, at the beginning of the crop season, the available information is not complete and the accuracy might be reduced. For that reason, the nomenclature and the number of classes evolve with the number of available LANDSAT dates. Three key dates are used: end of March, end of July and end of year. For each of these dates, a new land cover map is computed with an increased level of detail, as shown in the next figure.


Three land cover maps are produced along the year, with an increasing number of classes: the first one (S1) in March, the second one (S2) in July, and the last one at the end of the year.

We must however stress that regular observations are necessary, and that in certain years the cloud cover might degrade the quality of the results, as in the case of spring 2013, for which the LANDSAT observations only started in April. In 2013, some parts of the area were only observed 3 times during the whole year. The results at the beginning of the season are quite bad, but they improve along the year. For the subsequent years, results are better, and they should improve further with the availability of Sentinel-2 and its far better observation frequency.

The SIRHYUS project

The SIRHYUS project aims at developing and setting up operational services for water resource management, thanks to the integration, assimilation and exploitation of satellite Earth observation data. Its partners are: Veolia Environnement Recherche&Innovations, Veolia Eau, EDF, G2C environnement, Acri ST, UMR TETIS-IRSTEA, CNES, VERI and CESBIO. It was funded by the 12th Fonds Unique Interministériel, by the ministry in charge of water, by the Provence and Languedoc-Roussillon regions, and by the aeronautics and space foundation.

The aim is to provide new services, based on the know-how of experienced companies. In this framework, CESBIO implemented or enhanced methods for 4 products: snow cover, land cover, evapotranspiration and soil water content. In the future, these methods will be applied to Sentinel-2. Two posts will be published on this blog on this topic: this one, and a second one related to evapotranspiration estimates in this same catchment.




Yoann Moreau and Isabelle Soleihavoup

The good things with Sunday work


I am not the kind of person to give lessons or to blame my colleagues, but I have to say that some of them do not work on Sundays. Their excuses are many and varied: family, errands, work in the flat/house or in the garden, the need to rest after stressful weeks, doing some sports, seeing friends...


Without meaning to boast, I know how to help you solve these issues that could prevent you from working on Sundays:

  • A reason to delay the errands or the garden work to another day ("I have no choice, that's for work")
  • A way to do sports and to relax
  • A reason to see friends or family (to teach them how to work on Sundays)


OK, I'll give you my secret :

Moorland with broom shrubs

Mountain meadow

Woody crop (vineyard)

Thanks to the nice autumn in France, and thanks to the ODK* Collect Android application, which you may download on Google Play, it was my pleasure to work on the last two weekends. More accurately, I worked at sampling land cover.


15 days ago, I spent my Sunday sampling moorlands in the Pyrenees, thanks to a very nice hike near Tarascon-sur-Ariège. All types of mountain moors were present, with ferns, brooms, rhododendrons, blueberries or juniper.

The next weekend allowed me to sample Mediterranean vegetation, and its transition to mountain vegetation, in the Fenouillèdes region, at the foot of the Eastern Pyrenees, thanks to a lovely bike tour. A bike is a very efficient way of collecting samples, and sampling is also an excuse to stop every half mile when the slope is too steep. I tend to sample many more places when going up than when going down.


At the end of the hike, the data are transferred to a website that comes with ODK, and they will soon be transferred to the CESBIO PostGIS database (thanks to Jérôme Cros), to finally be used, along with other data sources, to train and validate land cover classifications.

Alfalfa meadow


As I did work a lot on Sundays, I have collected about 2000 samples in 18 months that I'd like to release as open data.


A start for a European database for land cover

If you'd like to enjoy working on Sundays, please do. Adding more users would be a way to provide data to our project to produce annual land cover maps over France, or to help our Sen2Agri project, with Sentinel-2, starting next year.

If you want to see the data, or to start collecting data, you may use this site. I created a guest account, "invite", whose password is composed of the account name followed by the name of my lab, without any space or capital letter. The most recent samples are in OS V2.3 form. If you want to collect data, ask me for a personal account. There is a user guide here.

* ODK : Open Data Kit

Land cover maps quickly obtained using SPOT4 (Take5) data for the Sudmipy site


At CESBIO, we are developing land cover map production techniques for high resolution image time series, similar to those which will soon be provided by Venµs and Sentinel-2. As soon as the SPOT4 (Take5) data were available over our study area (Sudmipy site in South West France), we decided to assess our processing chains on those data sets. The first results were quickly presented during the Take5 users' meeting which was held last October.

1. Experiments

In this post we describe the work carried out to produce these first land cover classifications with the SPOT4 (Take5) Sudmipy images (East and West areas), and we compare the results obtained over the region common to these two areas.


Prior to the work presented here, we organized a field data collection campaign which was synchronous with the satellite acquisitions. These data are needed to train the classifier and to validate the classification. The field work was conducted in 3 study areas (figure 1), which were visited 6 times between February and September 2013 and covered a total of 2000 agricultural plots. This allowed us to monitor the crop cycle of winter crops, summer crops and their irrigation attribute, grasslands, forests and built-up areas. The final nomenclature consists of 16 land cover classes.


The goal was to assess the results of a classification using field data that are limited in quantity, but also in spatial spread. We also wanted to check whether the East and West SPOT4 (Take5) tracks could be merged. To this end, we used the field data collected on the common area of the two tracks (in pink on the figure) and 5 Level 2A images for each track, acquired with a one-day shift.


2. Results

The first results of supervised SVM classification (using the ORFEO Toolbox) can be considered very promising, since more than 90% of pixels are correctly classified for both the East and the West tracks, and since the continuity between the two swaths is excellent. Some confusions can be observed between bare soils or mineral surfaces and summer crops, but these errors should be reduced by using LANDSAT 8 images acquired during the summer, when summer crops develop.

Merging of the land cover maps obtained on the East and West Sudmipy tracks (the cloudy areas were cropped out). The comparison against the ground truth (the black dots on the map to the South-West of Toulouse) results in a kappa coefficient of 0.89 for the West and 0.92 on the East.
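For readers who are not familiar with these quality figures, the following Python sketch shows how the percentage of correctly classified pixels and the kappa coefficient are both derived from a confusion matrix. The function name and the small 2-class matrix are illustrative only and are not the results of this study.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of validation samples of reference
    class i that the classifier labeled as class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Illustrative matrix: 100 samples, 2 classes, 90 correctly classified.
accuracy, kappa = accuracy_and_kappa([[45, 5], [5, 45]])
print(round(accuracy, 3), round(kappa, 3))  # 0.9 0.8
```

Kappa is lower than the overall accuracy because it discounts the agreement that a random labeling with the same class proportions would reach.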



This zoom compares the results obtained on the common area of the two tracks (West to the left and East to the right). The two classifications were obtained independently, using the same method and the same training data, but with images acquired at different dates and with different viewing angles. The main errors are maize plots labeled as bare soil, which is not surprising, since this crop was just emerging when the last image was acquired. There are also confusions between wheat and barley, but even on the field, one has to be a specialist to tell them apart.

3. Feedback and retrospective

After performing these experiments, we were very satisfied with the operational readiness of our tools. Given the data volume to be processed (about 10 GB of images), we could have expected very long computation times or memory limitations in the software used (after all, we are just scientists in a lab!). You will not be surprised to learn that our processing chains are based on the Orfeo Toolbox. More precisely, the core of the chain uses the applications provided with OTB for supervised training and image classification. One just has to build a multi-channel image where each channel is a classification feature (reflectances, NDVI, etc.) and provide vector data (a shapefile, for instance) containing the training (and validation) data. Then, one command line for the training and another one for the classification (both shown at the end of this post) are enough.
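To illustrate what "each channel is a classification feature" means, here is a minimal Python sketch that builds the feature vector of one pixel from several dates. It assumes the four SPOT4 HRVIR bands (green, red, near infrared, short-wave infrared) plus the NDVI for each date. The function names, the dictionary layout and the reflectance values are hypothetical, and this is not the way OTB stores the data.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index, (NIR - red) / (NIR + red)."""
    total = nir + red
    # Guard against a division by zero on pixels with no signal.
    return (nir - red) / total if total != 0 else 0.0


def pixel_features(dates):
    """Concatenate the reflectances and the NDVI of every date for one pixel."""
    features = []
    for bands in dates:
        features.extend([bands["green"], bands["red"], bands["nir"], bands["swir"]])
        features.append(ndvi(bands["red"], bands["nir"]))
    return features


# One pixel seen at two dates (made-up surface reflectances).
pixel = [
    {"green": 0.08, "red": 0.10, "nir": 0.20, "swir": 0.25},  # bare soil
    {"green": 0.06, "red": 0.04, "nir": 0.45, "swir": 0.15},  # growing crop
]
print(len(pixel_features(pixel)))  # 10 features: 5 per date
```

With 5 dates and this choice of features, each pixel would be described by 25 channels.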

Computation times are very reasonable: several minutes for the training and several tens of minutes for the classification. One big advantage of OTB applications is that they automatically use all the available processors (our server has 24 cores, but any off-the-shelf PC has between 4 and 12 cores nowadays!).

We are going to continue using these data, since we have other field data which are better spread over the area. This should allow us to obtain even better results. We will also use the Summer LANDSAT 8 images in order to avoid the above-mentioned errors on Summer crops.

4. Command line examples

We start by building a multi-channel image with the SPOT4 (Take5) data, not accounting for the cloud masks in this example :

otbcli_ConcatenateImages -il SPOT4_HRVIR_XS_20130217_N1_TUILE_CSudmipyE.TIF
SPOT4_HRVIR_XS_20130607_N1_TUILE_CSudmipyE.TIF -out

We compute the statistics of the images in order to normalize the features :

otbcli_ComputeImagesStatistics -il otbConcatImg_Spot4_Take5_5dat2013.tif -out
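To make explicit what these statistics are used for, here is a minimal Python sketch of feature normalization with a mean and a standard deviation per channel. It is not the OTB implementation: the function names are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def feature_statistics(samples):
    """Mean and standard deviation of each feature over all samples."""
    columns = list(zip(*samples))          # one tuple per feature
    means = [mean(column) for column in columns]
    stddevs = [pstdev(column) for column in columns]
    return means, stddevs


def normalize(sample, means, stddevs):
    """Center and scale one sample; constant features are set to 0."""
    return [(value - m) / s if s > 0 else 0.0
            for value, m, s in zip(sample, means, stddevs)]


# Two samples with two features of very different ranges.
samples = [[1.0, 10.0], [3.0, 30.0]]
means, stddevs = feature_statistics(samples)
print(normalize(samples[1], means, stddevs))  # [1.0, 1.0]
```

After normalization, all features have comparable ranges, which matters for an SVM with an RBF kernel because the kernel depends on distances between feature vectors.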

We train a SVM with an RBF (Gaussian) kernel :

otbcli_TrainSVMImagesClassifier -io.il otbConcatImg_Spot4_Take5_5dat2013.tif \
-io.vd DT2013_Take5_CNES_1002_Erod_Perm_Dissolve16cl.shp -sample.vfn "Class" \
-io.imstat EstimateImageStatistics_Take5_5dat2013.xml -svm.opt 1 -svm.k rbf \
-io.out svmModel_Take5Est_5dat2013_train_6.svm

And voilà! We perform the classification:

otbcli_ImageSVMClassifier -in otbConcatImg_Spot4_Take5_5dat2013.tif -mask \
EmpriseTake5_CnesAll.tif -imstat EstimateImageStatistics_Take5_5dat2013.xml \
-svm svmModel_Take5Est_5dat2013_train_6.svm -out ClasSVMTake5_5dat_16cl_6.tif

Land cover map production: how it works


Land cover and land use maps

Although different, the terms land use and land cover are often used as synonyms. From Wikipedia: "Land cover is the physical material at the surface of the earth. Land covers include grass, asphalt, trees, bare ground, water, etc. There are two primary methods for capturing information on land cover: field survey and analysis of remotely sensed imagery." and "Land use is the human use of land. Land use involves the management and modification of natural environment or wilderness into built environment such as fields, pastures, and settlements. It also has been defined as 'the arrangements, activities and inputs people undertake in a certain land cover type to produce, change or maintain it' (FAO, 1997a; FAO/UNEP, 1999)."

A precise knowledge of land use and land cover is crucial for many scientific studies and for many operational applications. This accurate knowledge needs frequent information updates, but may also need to be able to go back in time in order to perform trend analysis and to suggest evolution scenarios.


Satellite remote sensing offers the possibility to have a global point of view over large regions with frequent updates, and therefore it is a very valuable tool for land cover map production.


However, for those maps to be available in a timely manner and with a good quality, robust, reliable and automatic methods are needed for the exploitation of the available data.




Classical production approaches

The automatic approaches to land cover map production using remote sensing imagery are often based on image classification methods.


This classification can be:

  • supervised: areas for which the land cover is known are used as learning examples;
  • unsupervised: the image pixels are grouped by similarity and the classes are identified afterwards.

Supervised classification often yields better results, but it needs reference data which are difficult or costly to obtain (field campaigns, photo-interpretation, etc.).




What time series bring

Until recently, fine scale land cover maps have been nearly exclusively produced using a small number of acquisition dates due to the fact that dense image time series were not available.


The focus was therefore on the use of spectral richness in order to distinguish the different land cover classes. However, this approach is not able to differentiate classes which may have a similar spectral signature at the acquisition time, but that would have a different spectral behaviour at another point in time (bare soils which will become different crops, for instance). In order to overcome this problem, several acquisition dates can be used, but this needs a specific date selection depending on the map nomenclature.


For instance, in the left image, which is acquired in May, it is very difficult to tell where the rapeseed fields are since they are very similar to the wheat ones. On the right image, acquired in April, blooming rapeseed fields are very easy to spot.


May image. Light green fields are winter crops, mainly wheat and rapeseed. But which are the rapeseed ones?

April image. Blooming rapeseed fields are easily distinguished in yellow while wheat is in dark green.


If one wants to build generic (independent from the geographic sites and therefore also from the target nomenclatures) and operational systems, regular and frequent image acquisitions have to be ensured. This will soon be made possible by the Sentinel-2 mission, and it is already the case with the demonstration data provided by Formosat-2 and SPOT4 (Take 5). Furthermore, it can be shown that having a high temporal resolution is more valuable than having a high spectral diversity. For instance, the following figure shows the classification performance (in terms of the κ index, the higher the better) as a function of the number of images used. Formosat-2 images (4 spectral bands) and simulated Sentinel-2 (13 bands) and Venµs (12 bands) data have been used. It can be seen that, once enough acquisitions are available, a fine description of the temporal evolution makes up for the lower spectral richness.




What we can expect from Sentinel-2

Sentinel-2 has unique capabilities in the Earth observation systems landscape:

  • 290 km swath;
  • 10 to 60 m spatial resolution depending on the bands;
  • 5-day revisit cycle with 2 satellites;
  • 13 spectral bands.

Systems with similar spatial resolution (SPOT or Landsat) have longer revisit periods and fewer, broader spectral bands. Systems with similar temporal revisit have either a lower spatial resolution (MODIS) or narrower swaths (Formosat-2).


The kind of data provided by Sentinel-2 makes it possible to foresee land cover map production systems able to update the information monthly at a global scale. The temporal dimension will make it possible to distinguish classes whose spectral signatures are very similar during long periods of the year. The increased spatial resolution will make it possible to work with smaller minimum mapping units.


However, the operational implementation of such systems will require particular attention to the validation procedures of the produced maps and also to the huge data volumes. Indeed, the land cover maps will have to be validated at the regional or even at the global scale. Also, since the reference data (i.e. ground truth) will only be available in limited amounts, supervised methods will have to be avoided as much as possible. One possibility consists of integrating prior knowledge (about the physics of the observed processes, or via expert rules) into the processing chains.


Last but not least, even if the acquisition capabilities of these new systems are increased, there will always be temporal and spatial data holes (clouds, for instance). Processing chains will have to be robust to this kind of artefact.



Ongoing work at CESBIO


Danielle Ducrot, Antoine Masse and a few CESBIO interns have recently produced a large land cover map over the Pyrenees using 30 m resolution multi-temporal Landsat images. This map, which is real craftsmanship, contains 70 different classes. It is made of 3 different parts using nearly cloud-free images acquired in 2010.


70-class land cover map obtained from multi-temporal Landsat data.

In his PhD work, Antoine works on methods for selecting the best dates to perform a classification. At the same time, Isabel Rodes is looking into techniques enabling the use of all available acquisitions over very large areas, by dealing with both missing data (clouds, shadows) and the fact that not all pixels are acquired at the same dates.
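The post does not describe the techniques under study. Purely as an illustration of the missing data problem, the Python sketch below fills cloudy dates in the time series of one pixel by linear interpolation between the nearest valid dates. The choice of linear interpolation, the function name and the values are all assumptions.

```python
def fill_gaps(days, values):
    """Fill missing observations (None) in the time series of one pixel.

    days must be sorted in ascending order. Gaps between two valid dates
    are linearly interpolated; gaps at either end of the series take the
    nearest valid value.
    """
    valid = [(d, v) for d, v in zip(days, values) if v is not None]
    if not valid:
        raise ValueError("no valid observation for this pixel")
    filled = []
    for day, value in zip(days, values):
        if value is not None:
            filled.append(value)
            continue
        before = [(d, v) for d, v in valid if d < day]
        after = [(d, v) for d, v in valid if d > day]
        if before and after:
            (d0, v0), (d1, v1) = before[-1], after[0]
            filled.append(v0 + (v1 - v0) * (day - d0) / (d1 - d0))
        else:
            nearest = before[-1] if before else after[0]
            filled.append(nearest[1])
    return filled


# NDVI of one pixel; None marks dates masked by clouds.
series = fill_gaps([0, 10, 20, 30], [0.2, None, 0.4, None])
print(series)
```

Resampling every pixel on the same set of dates in this way also gives all pixels the same feature vector length, even when they were not observed on the same days.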


These 2 approaches are complementary: one makes it possible to target very detailed nomenclatures, but needs some human intervention, and the other is fully automatic, but less ambitious in terms of nomenclature.


A third approach is being investigated at CESBIO in the PhD work of Julien Osman: the use of prior knowledge both quantitative (from historical records) and qualitative (expert knowledge) in order to guide the automatic classification systems.


We will give you more detailed information about all those approaches in coming posts on this blog.