Deep learning approach for land use image classification

Convolutional neural networks (CNNs) are a category of neural networks mainly used for image classification and recognition. This Deep Learning (DL) technique is used to solve complex problems, particularly in environmental protection. Its approaches have affected many domains, and the geospatial world is one of them. In this paper we aim to classify aerial images of the region of Tangier, a city located in the north of Morocco, using pixel-based image classification with convolutional neural networks. The Flickr API is used to get our test image dataset. These images are used as input to a pre-trained network, Resnet18, a small convolutional neural network architecture that is able to recognize 21 land use classes. Our methodology is based on the following steps: first we set up the data, then we re-train the cited Deep Learning model (Transfer Learning), and finally we perform a quick visual verification by generating a labeled map from the geotagged images, where the labels correspond to the classes provided by the CNN.


Introduction
Scientific approaches to the environment and development stem mainly from awareness of the environmental damage inflicted on societies. Today, as changes accelerate, the question is how to keep ensuring development for the societies of our planet, for us and for future generations, while taking into account the sustainability or extension of resources and the sustainability of our environment.
Image classification is an important concept in the field of computer vision; it involves assigning images to one of many predefined classes. Computer vision now covers several tasks, for example object detection, localization and segmentation. Application examples include Earth System Science [1], remote sensing applications [2], and urban water flow and water level prediction [3]. In particular, DL methods have been successfully used to extract patterns and insights from the exploding volume of geospatial data, which allows a better characterization and exploitation of the Earth's surface. However, the task aims to simulate human decision making as closely as possible, and automating this process for remote sensing images is still a challenge for researchers [4] [5] [6]. The combination of data sources, computational power and recent advances in statistical modeling and machine learning offers exciting new opportunities for expanding our knowledge of geospatial data, and many tools are available from the field of machine learning. Geospatial data applications are also used in self-driving cars, smart cities and smartphones because of their impressive performance, in particular in spatiotemporal contexts.
The purpose of this study is to use aerial geotagged photos crawled from the Flickr platform [7] to analyze the landscape characteristics of photos posted by users. This can be done by exploiting the capabilities of deep learning models through re-training. In the following sections we touch on the use of deep learning models in the context of geospatial applications, and we proceed to extract abstract features.

Land cover and land use changes
The study of changes in land cover and land use has emerged as a fundamental component of research and has transformed the ability to observe and monitor how humans have changed land use over the past few centuries. Image processing methods have continued to evolve, in particular by offering new algorithms that allow these changes to be characterized, detected and tracked [8].
Changes in land cover and land use are generally due to multiple factors that interact with each other and vary in time and space. Using probabilistic approaches, artificial intelligence (neural networks), statistics (logistic regression, multi-criteria analysis, etc.) or spatial analysis (GIS), the scientific community has, for more than 20 years, made great progress in observing, monitoring and modeling changes in land cover and use, whether at the global, regional or local scale [9].

Deep Convolutional Neural Networks
Convolutional neural networks (CNNs) are to date the most powerful models for classifying images. They can be found at the core of everything from Facebook's photo tagging to self-driving cars, and they work behind the scenes in fields ranging from healthcare to security. Although designs differ, they generally consist of convolutional and pooling layers, and the following architecture is the most popular found in the literature.

Convolutional Layers
A convolution product is computed between a filter (a two-dimensional array whose weights are adjusted during training) and the input image. This filter is usually 3 × 3 or 5 × 5. Stacking many such layers within the network makes it possible to extract increasingly complex features, which ultimately make it possible to predict a class for the item present in the image. This is why we talk about deep learning.
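The sliding product described above can be illustrated with a minimal, library-free sketch. The function name `conv2d_valid`, the toy image and the hand-picked edge filter are assumptions made for illustration; in a real CNN the filter weights are learned during training, and the operation is implemented by the deep learning framework.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning frameworks, the kernel is not flipped, so this is
    technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 3x3 vertical-edge filter (illustrative; a CNN would learn these weights).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[3, 3], [3, 3]]
```

A 3 × 3 filter applied to a 4 × 4 input without padding yields a 2 × 2 feature map; every position responds strongly here because each 3 × 3 patch contains the edge.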

Pooling Layers
They perform subsampling, which reduces the size of the image and the computational cost of subsequent layers. In general, a maximum or average function is used.
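As a small sketch of this subsampling step, the following pure-Python function applies a maximum or an average over non-overlapping windows. The names `pool2d`, `reducer` and `mean`, and the 2 × 2 window size, are assumptions chosen for illustration rather than details taken from the model used in this paper.

```python
def pool2d(feature_map, size=2, reducer=max):
    """Non-overlapping pooling: apply `reducer` to each size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


def mean(values):
    """Arithmetic mean, used for average pooling."""
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [0, 2, 4, 4],
    [1, 1, 8, 0],
]
max_pooled = pool2d(feature_map, size=2, reducer=max)
avg_pooled = pool2d(feature_map, size=2, reducer=mean)
print(max_pooled)  # [[6, 2], [2, 8]]
print(avg_pooled)  # [[3.75, 1.25], [1.0, 4.0]]
```

With a 2 × 2 window, the 4 × 4 input shrinks to 2 × 2, so the next layer processes a quarter of the values.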

Learning step
Our strategy in this paper is to take an existing network that classifies a large collection of aerial images well and apply Transfer Learning to it [10]. Transfer Learning is a technique that is widely used in practice and easy to implement. It requires a neural network that has already been trained on another dataset, which may be unrelated to ours. We chose the Resnet18 [11] Deep Learning model, a small convolutional neural network architecture that works well in most cases. It has 11,714,624 trainable parameters, which are already tuned for this transfer learning. It must be mentioned that this network was already used on aerial images that were manually extracted from the USGS National Map Urban Area Imagery collection for various urban areas around the USA [12]. This dataset is referred to as UCMerced Data, one of the most valuable land use imagery datasets in the machine learning domain. It consists of 21 land use classes with 100 images per class. The following figure shows a sample of the images, with class names, drawn from the UCMerced dataset.
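For readers who want to try a comparable re-training step, here is a minimal sketch using the fastai library. It is not the authors' code: the function name `retrain_resnet18`, the folder layout (one sub-folder per class under `data_dir`), the 20% validation split, the 224-pixel resize and the use of `fine_tune` are all assumptions. By default, `vision_learner` starts from pretrained weights and replaces the final layers with a head sized for the classes found in the data.

```python
def retrain_resnet18(data_dir, epochs=10, valid_pct=0.2, seed=42):
    """Fine-tune a pretrained ResNet18 on images stored one folder per class.

    `data_dir` is a hypothetical path such as "UCMerced/Images", containing
    sub-folders named after the land use classes.
    """
    # Imported here so this sketch can be loaded without fastai installed.
    from fastai.vision.all import (
        ImageDataLoaders,
        Resize,
        error_rate,
        resnet18,
        vision_learner,
    )

    # Labels come from the folder names; a random share is held out for validation.
    dls = ImageDataLoaders.from_folder(
        data_dir,
        valid_pct=valid_pct,
        seed=seed,
        item_tfms=Resize(224),
    )
    # Pretrained body plus a new classification head, monitored with the error rate.
    learn = vision_learner(dls, resnet18, metrics=error_rate)
    # Train the new head first, then unfreeze and train the whole network.
    learn.fine_tune(epochs)
    return learn
```

Calling `retrain_resnet18("path/to/images")` would print the training loss, validation loss and error rate for each epoch; the path is a placeholder.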

The dataset
To classify new aerial images using pixel-based image classification and the Resnet18 deep learning model, the Flickr platform is used as the data collection source, specifically the PHOTOS GEO library [4]; this API returns a list of photos related to a given geographical area. For this study, the geographical area of interest is delimited by four points given by their latitude and longitude coordinates: P1(35.800637, -5.925502), P2(35.802191, -5.737578), P3(35.703828, -5.719663) and P4(35.708875, -5.950881). The rectangle delimited by these points corresponds to the bounding box of the Tangier metropolitan area, where we look for aerial images to be used for classification.
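Since the four points do not form an exact axis-aligned rectangle, a query by bounding box needs the smallest box that encloses them. The sketch below computes it and formats it as `min_lon,min_lat,max_lon,max_lat`, the order that Flickr's `flickr.photos.search` method documents for its `bbox` parameter. Whether the authors queried the API this way is an assumption, and the helper names `bounding_box` and `contains` are hypothetical.

```python
# Corner points from the text, written as (latitude, longitude).
corners = [
    (35.800637, -5.925502),  # P1
    (35.802191, -5.737578),  # P2
    (35.703828, -5.719663),  # P3
    (35.708875, -5.950881),  # P4
]


def bounding_box(points):
    """Return (min_lon, min_lat, max_lon, max_lat) enclosing all (lat, lon) points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return min(lons), min(lats), max(lons), max(lats)


def contains(bbox, lat, lon):
    """True if the (lat, lon) point lies inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


bbox = bounding_box(corners)
# Comma-separated string in the order min_lon, min_lat, max_lon, max_lat.
bbox_param = ",".join(f"{value:.6f}" for value in bbox)
print(bbox_param)  # -5.950881,35.703828,-5.719663,35.802191

# Illustrative points: one inside the box, one well to the south of it.
print(contains(bbox, 35.76, -5.83))  # True
print(contains(bbox, 34.02, -6.84))  # False
```

The `contains` helper can also serve as a sanity check that each returned photo's geotag really falls inside the region of interest.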

Results interpretation
Training the Resnet18 model takes several minutes, depending on machine power and data size. The following table shows the training results. The training loss and validation loss have decreased after 10 epochs. We calculate the error rate as the fraction of misclassified images, that is, error rate = 1 − accuracy. The final error rate is around 0.060, so the model trained on geospatial data reaches an accuracy of 94%. The highest confusion is detected between the following three classes: dense residential, medium residential and sparse residential, a distinction that gives an idea of the impact on the environment and human health. It is obviously a question of where to place the dividing line between these three types of classes. Overall, the model works well, with only 1 or 2 badly labelled images.
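The link between an error rate of about 0.060 and an accuracy of 94% can be checked with a few lines of Python. The labels below are made up for illustration (3 mistakes out of 50 images) and are not the paper's validation data; the function name `error_rate` is likewise an assumption.

```python
def error_rate(predicted, actual):
    """Fraction of images whose predicted class differs from the true class."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Hypothetical validation labels: 50 images, 3 of them misclassified.
actual = ["forest"] * 25 + ["beach"] * 25
predicted = list(actual)
for index in (0, 10, 30):
    predicted[index] = "river"

rate = error_rate(predicted, actual)
accuracy = 1 - rate
print(f"error rate = {rate:.3f}, accuracy = {accuracy:.0%}")
```

With these toy labels the error rate is 3/50 = 0.06, which matches the order of magnitude reported in the text.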
We also generate a map showing the labels of all the Flickr aerial images used for image classification, which we use for a quick visual verification.
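The paper does not say which tool produced the map. One simple option, sketched here under that caveat, is to export the predicted labels as GeoJSON points, which most GIS tools and web map viewers can display. The record fields (`photo_id`, `lat`, `lon`, `label`) and the two sample records are hypothetical.

```python
import json


def to_geojson(records):
    """Build a GeoJSON FeatureCollection with one labeled point per photo."""
    features = []
    for record in records:
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {
                "type": "Point",
                "coordinates": [record["lon"], record["lat"]],
            },
            "properties": {
                "photo_id": record["photo_id"],
                "label": record["label"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up predictions for two geotagged photos inside the study area.
records = [
    {"photo_id": "photo-1", "lat": 35.78, "lon": -5.81, "label": "beach"},
    {"photo_id": "photo-2", "lat": 35.74, "lon": -5.90, "label": "forest"},
]
geojson_text = json.dumps(to_geojson(records), indent=2)
print(geojson_text)
```

Saving `geojson_text` to a `.geojson` file and opening it in a map viewer gives one marker per photo, with the predicted class available as a property.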

Conclusion
In general, machine learning, and deep learning in particular, represents a promising research direction and provides powerful tools for creating interesting models for geospatial data applications. In this paper, we have used a pretrained deep learning model to classify land use from satellite images for environmental protection. In the future, we would like to use deep neural networks with a training model that takes social geotagged images into account in order to automatically detect frequently visited places.

Fig. 3. Sample images with class names from UCMerced dataset.

To adapt the model to our context, the images used for training the model were collected from the Flickr platform. We set up the data using the Fastai library [13] to set the name of each image file, create a DataBlock and load the data into PyTorch [14]. Our dataset images are organized into folders, each named after the corresponding class. Images are then retrieved from each folder, split into training and test datasets, and mapped to the class labels given by the folder names. In the next part, we train a state-of-the-art deep learning model for geospatial data, classifying satellite imagery into 21 classes.
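The folder-based labeling and the train/test split described above can be sketched without any deep learning library. This is only an illustration of the idea, not the Fastai DataBlock code used by the authors: the helper names, the 20% test fraction, the fixed seed and the file paths are all assumptions.

```python
import random
from pathlib import PurePosixPath


def label_from_path(path):
    """The class label is the name of the folder that contains the image."""
    return PurePosixPath(path).parent.name


def split_train_test(paths, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and hold out `test_fraction` of the images for testing."""
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical layout: data/<class name>/<image file>.
paths = [
    f"data/{class_name}/img_{index}.jpg"
    for class_name in ("beach", "forest")
    for index in range(5)
]
train_paths, test_paths = split_train_test(paths, test_fraction=0.2, seed=0)

# Pair every training image with the label derived from its folder.
train_items = [(path, label_from_path(path)) for path in train_paths]
print(len(train_paths), len(test_paths))  # 8 2
```

Fixing the seed keeps the split identical between runs, which makes training results easier to compare.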

Fig. 4. The geographical area, region of interest.

The following figure shows examples of images extracted from the geographical area of Tangier.

Fig. 5. Examples of images extracted, with class names in order: beach, forest, freeway and golf course.

For the purpose of classifying Flickr images, we use our trained model to predict the class of each aerial image presented to it. A confusion matrix gives a better visual interpretation of the performance of the re-trained model. It is used to compare the predicted class with the real one. The following diagram shows the confusion matrix of the dataset.
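To make the idea of a confusion matrix concrete, here is a minimal pure-Python sketch. Rows correspond to the real class and columns to the predicted class, so off-diagonal cells count confusions. The labels are invented for illustration, using the three residential classes mentioned in the results, and do not reproduce the paper's figures.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are real classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(real, pred)] for pred in classes] for real in classes]


classes = ["dense residential", "medium residential", "sparse residential"]
dense, medium, sparse = classes

# Made-up labels: four images per class, with some confusion between neighbors.
actual = [dense] * 4 + [medium] * 4 + [sparse] * 4
predicted = [
    dense, dense, dense, medium,
    medium, medium, dense, sparse,
    sparse, sparse, sparse, medium,
]

matrix = confusion_matrix(actual, predicted, classes)
for class_name, row in zip(classes, matrix):
    print(f"{class_name:>20}: {row}")

# Correct predictions sit on the diagonal; everything else is a confusion.
correct = sum(matrix[i][i] for i in range(len(classes)))
print(f"{correct} of {len(actual)} images on the diagonal")
```

In this toy example 8 of the 12 images fall on the diagonal, and all errors occur between adjacent density levels.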

Fig. 7. Map generated with all Flickr images used in deep learning image classification.

Table 1. Training results