Intelligent analysis of Earth remote sensing data on the distribution of phytoplankton and pollutants in coastal systems

Currently, one of the topical applications of artificial intelligence methods in the environmental monitoring of water resources is the analysis of Earth remote sensing images in order to control and prevent potentially dangerous changes in the environment. In the future, algorithms with elements of artificial intelligence will form the basis of forecasting and decision-making systems. Environmental monitoring systems can be improved using artificial intelligence methods, in particular through the development and application of special algorithms to prevent emergencies. The aim of the study is to develop an algorithm that uses artificial intelligence to detect spots of substances of various origins on the water surface. It has been established that the YOLOv4 convolutional neural network is applicable for high-quality detection of oil slicks and phytoplankton bloom spots. The developed algorithm was tested on real satellite images and showed an accuracy of 84-94%.


Introduction
Today, issues related to the environmental safety of aquatic ecosystems arise everywhere and require urgent solutions. Since control and supervision activities are currently carried out mainly with traditional methods of monitoring water resources, violations are recorded only after an emergency has occurred. The main tasks at the moment are to develop methods for calculating and predicting the occurrence of emergencies, which would make it possible to reach a new level of compliance with international standards regulating violations of environmental safety, and to improve existing methods for monitoring the environmental situation. This directly affects the health and safety of the ecosystem of the city, the region, the country and the world as a whole.
The most detrimental effect on the quality of water resources comes from pollutants that disrupt the natural balance of the aquatic ecosystem. Such substances include oil and oil products, raw sewage, foaming detergents and suspended matter. Their negative effect consists in the deterioration of the physicochemical properties of water and often in the formation of a surface film and of sediment on the bottom of the reservoir, which lowers the oxygen content and causes light attenuation [1]. Violation of the natural biological balance entails negative consequences for the ecosystem which, in the absence of a timely response, can become irreversible, so the introduction of modern technologies into environmental monitoring comes to the fore [6].
Over the past two decades, against the backdrop of environmental degradation, the methods of monitoring and surveying coastal systems have changed qualitatively through the use of advanced information technologies, including artificial intelligence, to process diverse, unstructured data sets. Universal predictive regression models, classification algorithms and computer vision contribute to obtaining reliable analytics in the study of water resources. Models with elements of artificial intelligence make it possible to take into account the features of water pollution processes when weather data and information about pollution sources are incomplete and an analytical description is difficult, and they also make it possible to visualize the distribution of substances.
At the moment, a number of intelligent models have been proposed to detect concentrated accumulations of phytoplankton populations, as well as pollutants, in coastal systems. Often, such models rely on additional equipment, such as unmanned aerial vehicles [2], which allow real-time forecasting, or marine radars [17]. When analyzing remote sensing data, deep convolutional neural networks such as OSCNet [15] are used to automatically detect oil spills on the sea surface. The VGG-16-based OSCNet is obtained by tuning the architecture and hyperparameters on a dataset of dark regions from spaceborne synthetic aperture radar (SAR) images. The accuracy, recall and precision of the model in the study [18] are 94.01%, 83.51% and 85.70%, respectively. Experiments confirm that the classification accuracy of the CNN architecture is higher than that of other models (e.g., LeNet-5, D-DBN and RBF-SVM) [7,9].
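For reference, quality measures of this kind are conventionally derived from confusion-matrix counts. The following is a minimal sketch in Python, assuming the three quoted figures correspond to accuracy, recall and precision; the function name and the counts are hypothetical and are not taken from the cited studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, recall, precision) from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives. Empty denominators yield 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


# Hypothetical counts, chosen only to illustrate the arithmetic.
accuracy, recall, precision = classification_metrics(tp=80, fp=10, tn=100, fn=10)
```

Accuracy alone can look high when oil-free regions dominate a dataset, which is why recall and precision are usually reported alongside it.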
A number of works have considered intelligent algorithms for segmenting SAR dark spots and algorithms for classifying oil spills and their look-alikes [4,5,8,16,19].
Neural network methods combined with the Local Binary Patterns (LBP) method have proven successful for processing multispectral images of coastal waters based on satellite remote sensing data in order to identify patchy phytoplankton populations [10][11][12].
The results of the localization of pollutants and phytoplankton populations using artificial intelligence methods make it possible to further apply the resulting information to predict the distribution of substances in various hydrodynamic conditions.
The purpose of this study is to develop an intelligent algorithm for detecting spots of various origins on the water surface in satellite images. The method is based on a convolutional neural network of the YOLOv4 architecture. The scientific novelty lies in extending the understanding of how intelligent methods can be used to detect concentrated accumulations of substances of various origins, and in obtaining an intelligent model that demonstrates high detection accuracy.

Materials and methods
The approach to the problem comes down to analyzing satellite images (i.e., the actual state of the environment in a certain period of time) and, on that basis, developing an operational forecast to prevent violations of environmental safety standards using modern neural network technologies.
For processing images received from space satellites, convolutional neural networks are most successfully used, since they provide partial resistance to scale changes, displacements, rotations, angle changes and other distortions.
A Convolutional Neural Network (CNN) is a feed-forward neural network with excellent performance in image recognition [3]. The architecture takes its name from the convolution operation: the convolution kernel is moved over the image step by step, each image fragment is multiplied element-wise by the kernel, and the products are summed and written to the corresponding position in the output image.
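The convolution operation described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, assuming a single-channel image stored as nested lists, stride 1, no padding and no kernel flip (the usual convention in CNN layers); the image and kernel values are made up.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            # Multiply the image fragment under the kernel element by element.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)  # the sum goes to the matching output position
        out.append(row)
    return out


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = convolve2d(image, kernel)
# feature_map == [[-4, -4, 2], [-4, -4, 4], [7, 7, 6]]
```

A 4x4 input and a 2x2 kernel give a 3x3 feature map, because the kernel fits in three positions along each axis.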
One of the most popular architectures for solving the detection problem is YOLO (You Only Look Once). It is a real-time, one-stage object detection network that consists of three parts: backbone, neck and head (Fig. 1). The backbone of YOLOv4 can be a pre-trained convolutional neural network such as VGG16 or CSPDarkNet53 trained on the COCO or ImageNet datasets. The backbone acts as a feature extraction network that computes feature maps from the input images. The neck connects the backbone and the head. It consists of a Spatial Pyramid Pooling (SPP) module and a Path Aggregation Network (PAN). The neck combines feature maps from different layers of the backbone network and sends them as input to the head. The head processes the aggregated features and predicts bounding boxes, objectness scores, and classification scores in case there are multiple classes. The YOLOv4 network uses single-stage object detectors such as YOLOv3 as detection heads. At the output, this neural network reports:
1. Whether there is an object in a particular grid cell.
2. The class of this object.
3. The predicted bounding box for this object (location).
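The three outputs listed above can be illustrated with a simplified decoding step. The sketch below is a minimal illustration in Python and not the YOLOv4 head itself: the per-cell dictionary layout, the key names, the class names and the threshold are all assumptions, and anchor offsets and activation functions are deliberately left out.

```python
def decode_grid(predictions, class_names, threshold=0.5):
    """Turn per-cell predictions into a list of detections.

    predictions[row][col] is a dict with hypothetical keys:
      'objectness'  - probability that the cell contains an object,
      'box'         - (x, y, width, height) in image pixels,
      'class_probs' - one probability per entry of class_names.
    """
    detections = []
    for row in predictions:
        for cell in row:
            probs = cell["class_probs"]
            best = max(range(len(class_names)), key=lambda k: probs[k])
            # Combined confidence: object present AND it belongs to the best class.
            score = cell["objectness"] * probs[best]
            if score >= threshold:
                detections.append(
                    {"label": class_names[best], "score": score, "box": cell["box"]}
                )
    return detections


classes = ["oil", "phytoplankton"]  # hypothetical class names
grid = [
    [
        {"objectness": 0.95, "box": (10, 12, 40, 30), "class_probs": [0.9, 0.1]},
        {"objectness": 0.10, "box": (70, 15, 20, 20), "class_probs": [0.5, 0.5]},
    ],
    [
        {"objectness": 0.80, "box": (20, 80, 50, 35), "class_probs": [0.2, 0.8]},
        {"objectness": 0.30, "box": (75, 85, 15, 10), "class_probs": [0.6, 0.4]},
    ],
]
detections = decode_grid(grid, classes)
```

In this made-up 2x2 grid only the two cells whose combined score reaches the threshold are kept, one for each class.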

Software implementation of the YOLOv4 convolutional neural network in the MATLAB environment
An intelligent model based on the YOLOv4 convolutional neural network was developed in the MATLAB environment. The algorithm is supplemented with software scripts for artificially expanding the training set in order to increase the accuracy of the neural network.

Preparing the training set
A data set designed to train and test a neural network plays a key role in the application of artificial intelligence methods. The initial data set for this study comprises 16 images of oil spills in the Gulf of Mexico (April-June 2010) obtained from NASA and Aqua satellites [13,15] and 12 images of phytoplankton concentration from Canopus-B images [14] for the Azov and Black Seas (August 2021), as well as satellite images of water blooms off the coast of Argentina and in the Bay of Biscay off the coast of France. Training a convolutional neural network requires a large number of images, so the existing dataset was expanded by augmentation. Generating new data from the existing data makes it possible to solve some of the problems with the training sample using readily available means. In engineering and scientific tasks there is often a lack of data because of the difficulty of obtaining, processing or storing them, which is why the missing data have to be replaced with modifications of the existing ones. For example, in the case of photographic data, as in this study, the weather conditions of the survey, satellite features and many other distortions are successfully simulated by image processing methods. Modeling deformations in this way makes it possible to improve the quality of the model and to increase its resistance to various kinds of noise in the input data.
To expand the data set, each photograph was processed as follows (Fig. 2): rotation in 45° increments, contrast change, and mirroring (vertical/horizontal), in four variants.
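The augmentation steps above can be sketched as follows. This is a minimal sketch in plain Python, not the authors' MATLAB scripts: it assumes a grayscale image stored as nested lists with values 0-255, and it reads "four variants" as the rotated image plus its contrast-adjusted, vertically mirrored and horizontally mirrored copies, which is only one possible interpretation. All function names and the contrast factor are hypothetical.

```python
import math


def rotate(image, angle_deg, fill=0):
    """Rotate about the image centre using nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def adjust_contrast(image, factor):
    """Scale each pixel's deviation from the mean, clamped to 0..255."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(255, max(0, round(mean + factor * (p - mean)))) for p in row]
        for row in image
    ]


def flip_vertical(image):
    return [row[:] for row in image[::-1]]


def flip_horizontal(image):
    return [row[::-1] for row in image]


def augment(image, contrast_factor=1.5):
    """Return 8 rotations x 4 variants = 32 images for one source image."""
    results = []
    for angle in range(0, 360, 45):
        rotated = rotate(image, angle)
        results.append(rotated)
        results.append(adjust_contrast(rotated, contrast_factor))
        results.append(flip_vertical(rotated))
        results.append(flip_horizontal(rotated))
    return results


sample = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
augmented = augment(sample)
```

Under this reading, every source photograph yields 32 training images, the first of which is the unchanged original. Rotations that are not multiples of 90° leave the corners filled with a constant value.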

Fig. 2. Increasing the dataset by augmentation.
After the formation of the training set is completed, it remains to label the objects on the images.
Image labeling is an essential part of AI development. Annotated images are needed as input for neural network training. In this study, labeling was performed with the Image Labeler application, which makes it possible to interactively draw shapes of various kinds around regions of interest (Fig. 3).

Fig. 3. Data labeling
The result of this step is an exported labeled dataset in the form of a MAT-file, which is used as input for training the computer vision algorithm based on the YOLOv4 network in the next step.

Training a convolutional neural network
The YOLOv4 Custom Object Detector is based on a pre-trained deep learning network created using CSP-DarkNet-53 as the base network and trained on the COCO dataset. To obtain optimal results, the detector must be trained on the training images before detection is performed on the test set.
Anchor boxes were estimated using built-in functions based on the sizes of the objects in the training data. The training images were resized to the input size of the network using a helper function, which improves the speed and efficiency of training the deep neural network.
An important step in implementing a convolutional neural network of the YOLOv4 architecture is the selection of its parameters. The main network parameters are presented in Table 1. In most deep learning projects, training and validation loss are usually visualized together in a graph (Fig. 4).
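A common way to derive anchor sizes for YOLO-family detectors is k-means clustering of the labeled box sizes with an IoU-based similarity. Whether the built-in functions mentioned above work exactly this way is not stated here, so the following Python sketch is an assumption-based illustration; the function names, the deterministic initialization and the box sizes are hypothetical.

```python
def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def estimate_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchors, using IoU as similarity."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    # Deterministic start: k boxes spread evenly across the sorted areas.
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        updated = []
        for anchor, members in zip(anchors, clusters):
            if members:
                updated.append(
                    (
                        sum(b[0] for b in members) / len(members),
                        sum(b[1] for b in members) / len(members),
                    )
                )
            else:
                updated.append(anchor)  # keep an anchor that attracted no boxes
        if updated == anchors:
            break  # assignments are stable
        anchors = updated
    return anchors


# Hypothetical labeled box sizes (width, height) in pixels.
box_sizes = [(10, 12), (12, 10), (11, 11), (100, 80), (90, 110), (110, 95)]
anchors = estimate_anchors(box_sizes, k=2)
# anchors == [(11.0, 11.0), (100.0, 95.0)]
```

IoU is used instead of Euclidean distance so that large boxes do not dominate the clustering simply because their absolute size differences are larger.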

Fig. 4. Neural network training
The convolutional neural network was trained on a Supermicro computer (Intel Xeon E5-2640 v4, 2.4 GHz); the total training time was 1 hour 27 minutes.

Results and discussion
As a result of training, an intelligent model was obtained that is capable of detecting spots of various origins in satellite images; the probability of detecting a spot on the test set ranged from 84% to 94%. For example, Figure 5 shows the result of detecting an oil slick on test images. The main concentrated mass of the oil product fell within the boundaries determined by the intelligent algorithm. The probability that a spot was found was 0.94 for the first image and 0.89 for the second.

Fig. 5. Result of oil slick detection
Figure 6 shows a demonstration of the algorithm when detecting the concentration of phytoplankton populations in coastal systems.

Fig. 6. Result of detection of phytoplankton population
The developed model is a scalable algorithm that can be trained on spots of any kind; when several types of spots are present in one image, the model can both localize and classify them.
Combining the two directions of this study, remote sensing and artificial intelligence methods, provides an effective method for image analysis, which will speed up information processing and help track changes in the state of water resources in almost real time.

Conclusions
As a result of human intervention in the ecosystem, emergencies have become more frequent. Human activity, which leads to oil spills and excessive phytoplankton blooms, causes a number of problems: drinking water pollution; the death of fish and aquatic mammals in rivers, seas and fish farms as a result of a disturbed physicochemical balance; and economic losses due to the reduced attractiveness of tourist regions.
Improving the methods of monitoring coastal systems in order to identify spots of various origins in real time will make it possible to quickly respond in the event of environmentally hazardous situations.
Automating the search and localization of substances of various origins in coastal systems using artificial intelligence methods will improve the quality of data collection and processing, which will entail economic benefits. Thus, the developed algorithm using the YOLOv4 convolutional network can be considered as the first stage in the creation of an automated information system for environmental monitoring in coastal systems based on Earth remote sensing data.
The methodology proposed in this study can also be an additional tool for ongoing monitoring programs, and the information obtained with its help can be used to track the recovery dynamics of the affected areas.