CocoSense: Coconut Tree Detection and Localization using YOLOv7

Coconut farming in the Philippines faces persistent challenges in efficient tree monitoring, which directly affect its productivity and sustainability. Although prevalent, traditional methodologies such as field surveys are labor-intensive and prone to data inaccuracy. This study leveraged the YOLOv7 object detection algorithm to enhance coconut tree monitoring. Our objectives centered on (1) precise detection of coconut trees using orthophotos, (2) their enumeration, and (3) generating accurate coordinates for each tree. A DJI Phantom 4 RTK unmanned aerial vehicle (UAV) was used to capture high-resolution images of the study area in Tiaong, Quezon. Post-acquisition, these images were processed and annotated to generate datasets for training the YOLOv7 model. The trained model achieved a remarkable 98% accuracy rate in tree detection, with an average localization accuracy of 86.30%. The results demonstrate the potential of YOLOv7 in accurately detecting and localizing coconut trees under diverse environmental conditions.


Introduction
The agricultural sector, especially coconut farming, plays an essential role in the socioeconomic framework of countries like the Philippines. Despite its significance, this sector encounters numerous challenges, particularly in accurate and efficient coconut tree monitoring, which substantially impacts its overall productivity and sustainability [1]. Although prevalent, traditional monitoring mechanisms, such as field surveys and visual inspections, are marked by shortcomings, including their labor-intensive nature, the substantial time investments required, and an inherent subjectivity that potentially compromises the accuracy and reliability of the data obtained [2].
The emergent utilization of remote sensing technologies has been marked as an innovative solution, providing a comprehensive view of different forest conditions and supporting precise farm management [3]. By generating accurate, timely maps, imagery, and localization data, these technologies facilitate adept land-use planning, allowing for meticulous spatial analysis of the land and helping identify suitable zones for replantation or expansion [4]. This technological adoption has led to more informed decision-making in farm management, resource allocation, and sustainable land use.
In light of this, the advancement of object detection algorithms like YOLOv7 introduces promising functionality and accuracy for refining agricultural remote sensing technologies. In particular, YOLOv7 has been used in many remote sensing applications, such as vegetation detection and landscape feature identification [5][6][7][8]. Its unique design and effective methods for identifying and classifying objects make it ideal for tasks requiring accuracy and speed [9].
The objective of this study encompasses the development of a system that leans on the capabilities of YOLOv7, specifically aimed at (1) accurate detection of coconut trees utilizing orthophotos, (2) enumeration of the coconut trees, and (3) generation of precise coordinates for each identified tree. This methodology aims to augment the efficiency and precision of coconut tree identification in agricultural settings. It also underscores the importance of high-quality training data, meticulous model configuration, and continuous monitoring in securing the achieved accuracy.

Unmanned Aerial Vehicle (UAV)
As shown in Fig. 1, the DJI Phantom 4 RTK used in this study is a popular unmanned aerial vehicle (UAV) designed for precision mapping. The UAV has a takeoff weight of approximately 1,391 g and a 20-megapixel built-in camera for high-resolution image capture. It has a typical flight duration of 25 to 30 minutes, allowing comprehensive coverage of the designated study areas. Its Real-Time Kinematic (RTK) module provides precise positioning data for accurate aerial surveys. Additional batteries were employed for extended flight sessions and subsequent validation to ensure exhaustive data collection across the research terrain.

Study Area
The study was conducted at the Quezon Agricultural Research and Experiment Station (QARES) in Tiaong, Quezon. This station, covering an area of approximately 100,000 square meters, is a hub for agricultural research where a diverse range of trees, including coconut, mango, and cacao, are cultivated and subjected to scientific study (see Fig. 2).
QARES leads research and innovation in agricultural applications in Quezon Province. QARES also facilitates various trainings to disseminate relevant and practical technologies to farmers and other stakeholders, helping them continue what they have started and work together to improve their lives.

Flight Path
For mission planning, the UGCS software was employed, as illustrated in Fig. 3. An altitude of 100 meters was designated, with a cruising speed of 7 m/s and an overlap of 80%. These parameters were fine-tuned to ensure meticulous data collection, leading to a combined flight time of 48 minutes to cover the entire study region. Four distinct flights were undertaken to cover the specified research area comprehensively.
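To illustrate how these flight parameters translate into ground coverage, the sketch below computes the ground sample distance (GSD), image footprint, and forward shot spacing for a nadir-pointing camera. The camera constants used in the example (a 1-inch sensor of roughly 13.2 × 8.8 mm, an 8.8 mm focal length, and a 5472-pixel image width) are nominal Phantom 4 RTK values assumed for illustration, not figures taken from the study.

```python
def flight_footprint(altitude_m, focal_mm, sensor_w_mm, sensor_h_mm,
                     img_w_px, overlap):
    """Ground sample distance (m/px), image footprint (m), and the
    along-track spacing between shots implied by the overlap setting."""
    # GSD: ground width covered by one pixel at the given altitude
    gsd_m = (sensor_w_mm / 1000) * altitude_m / ((focal_mm / 1000) * img_w_px)
    footprint_w = sensor_w_mm / focal_mm * altitude_m  # across-track extent
    footprint_h = sensor_h_mm / focal_mm * altitude_m  # along-track extent
    spacing = footprint_h * (1 - overlap)              # distance between shots
    return gsd_m, footprint_w, footprint_h, spacing
```

At 100 m altitude these nominal values give a GSD of about 2.7 cm/px and, with 80% overlap, a shot roughly every 20 m along track (about every 3 s at 7 m/s).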
To establish control points, the team utilized concrete nails and marked locations on the ground with spray paint. Furthermore, a base setup was initiated using the Emlid Reach (RS2), calibrated to the coordinates with latitude 13.94632306, longitude 121.37139658, and a height of 100.2460 meters.

Orthomosaic Generation
The orthomosaic, as shown in Fig. 4, was generated using the Agisoft Metashape software. The process was initiated with photo alignment, which was subsequently optimized. A dense cloud was constructed, followed by creating a mesh and developing a Digital Elevation Model (DEM). The final step was the assembly of the orthomosaic. The software processed 431 images, linked by 560,044 tie points. Additionally, 431 medium-quality depth maps were produced to aid the reconstruction.

Dataset Preparation
The orthomosaic was randomly partitioned during dataset preparation to produce training and test datasets. Specifically, two-thirds of the images were allocated for training, while the remaining one-third was reserved for testing and model validation.
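A minimal sketch of this random two-thirds/one-third partition; the function name and fixed seed are illustrative, not details from the study:

```python
import random

def split_dataset(image_paths, train_frac=2 / 3, seed=42):
    """Randomly partition orthomosaic image tiles into training and
    test sets at the given fraction."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for reproducibility
    cut = round(len(paths) * train_frac)
    return paths[:cut], paths[cut:]
```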

Dataset Annotation
For dataset annotation from the orthomosaic, the LabelMe software was employed (see Fig. 5). LabelMe, a web-based annotation tool, facilitated the manual annotation of coconut trees within the orthomosaic. Each tree was marked with a bounding box and a corresponding label indicating it as a coconut tree. This activity produced a dataset tailored for identification and localization in subsequent analysis stages.
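LabelMe stores each bounding box as corner points in a JSON file, while YOLOv7 expects one normalized `class x_center y_center width height` line per object. A sketch of that conversion is shown below; the field names follow LabelMe's rectangle format, and the single class id 0 for coconut trees is an assumption:

```python
import json

def labelme_to_yolo(labelme_json, img_w, img_h, class_id=0):
    """Convert LabelMe rectangle annotations (a JSON string) into
    YOLO-format label lines: class x_center y_center width height,
    all normalized to [0, 1]."""
    lines = []
    for shape in json.loads(labelme_json)["shapes"]:
        (x1, y1), (x2, y2) = shape["points"]
        xmin, xmax = sorted((x1, x2))
        ymin, ymax = sorted((y1, y2))
        xc = (xmin + xmax) / 2 / img_w   # normalized box center
        yc = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w        # normalized box size
        h = (ymax - ymin) / img_h
        lines.append(f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```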

YOLOv7 Implementation
This study uses the architectural design of the YOLOv7 model, as detailed by its main developers [10, 11]. In implementing YOLOv7 in CocoSense, the coconut orthomosaic images annotated in LabelMe are first processed by the model's backbone, which is composed of layers that extract distinct features from the imagery. These features help in identifying specific attributes of coconut trees, as shown in Fig. 6. The model then applies detection layers that rely on preset reference boxes, known as anchor boxes, to predict where the coconut trees are located and how likely they are to be coconut trees. The output of this method is a set of marked regions of coconut trees on the image.
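As an illustration of the anchor-box mechanism, the sketch below decodes one raw network output into a pixel-space box using the YOLOv5/v7-style parameterization; the grid cell, anchor size, and stride values in the usage example are arbitrary, not values from the study:

```python
import math

def decode_prediction(tx, ty, tw, th, cell_x, cell_y,
                      anchor_w, anchor_h, stride):
    """Decode one raw YOLOv7 output into a pixel-space box center and size.
    Offsets are squashed with a sigmoid, scaled, and added to the grid
    cell; width and height scale the matching anchor box."""
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = (2 * sig(tx) - 0.5 + cell_x) * stride  # box center x (pixels)
    by = (2 * sig(ty) - 0.5 + cell_y) * stride  # box center y (pixels)
    bw = (2 * sig(tw)) ** 2 * anchor_w          # box width (pixels)
    bh = (2 * sig(th)) ** 2 * anchor_h          # box height (pixels)
    return bx, by, bw, bh
```

With zero raw outputs the decoded box sits at the center of its grid cell with exactly the anchor's size, which makes the role of the anchors easy to see.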

YOLOv7 Model Training
The YOLOv7 model was trained using datasets derived from the LabelMe software. The annotated coconut tree images, generated through manual annotation within the LabelMe interface, served as the training input. The training and testing results are summarized in the confusion matrix in Fig. 7. Of the actual coconut trees, 98% were correctly identified by the model. Moreover, the model predicted the background with 95% accuracy, while a minor error was observed in which 5% of the background was mistakenly recognized as coconut trees. This indicates that while the model is proficient at detecting coconut trees, a slight improvement in distinguishing trees from the background is still achievable. Despite the limited datasets, YOLOv7 showed a remarkable result on tree detection, with an average F1 score of 0.9655 and an accuracy of 0.9650.
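The reported scores relate to the confusion matrix in the usual way; the sketch below derives precision, recall, F1, and accuracy from raw counts. The counts in the example are illustrative only, chosen to be consistent with the quoted 98%/95% rates under an assumed balanced split; they are not the study's actual tallies.

```python
def detection_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)            # fraction of detections that are trees
    recall = tp / (tp + fn)               # fraction of trees that were detected
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy
```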

Coconut Tree Detection and Localization Validation
To validate detection and localization, three distinct sample areas within the testing location were delineated and selected, as shown in Fig. 8. Coconut trees within these areas underwent ground survey validation to ensure the specified image corresponds to actual coconut trees in the area. Subsequently, these validated images were processed through the YOLOv7 detection system, and the detection output was compared with the ground truth data to assess the model's accuracy. The testing results are shown in Table 1. The spatial locations of the coconut trees were also extracted from the orthomosaic using the CocoSense localization technique, and the algorithm's output, which locates the trees' positions, was cross-referenced with ground truth data to ensure precision and accuracy. A sample result from a total of 60 testing locations is shown in Table 2. The YOLOv7 algorithm detected coconut trees across the three test areas with an average accuracy of 97.83%. Notably, the algorithm performed best in Area 1, where coconut trees are predominantly situated in clear regions with fewer surrounding trees. A slight dip in accuracy was observed in Area 2, where coconut trees are densely clumped with other tree species.
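The paper does not spell out the CocoSense localization step, but a standard approach is to map each detected box center through the orthomosaic's affine georeferencing transform. The sketch below assumes that approach; the function names, coefficient ordering, and the transform in the usage example are illustrative, not the study's actual implementation:

```python
def pixel_to_geo(col, row, transform):
    """Map a pixel (col, row) in a georeferenced orthomosaic to map
    coordinates using six affine coefficients (a, b, c, d, e, f):
    x = a*col + b*row + c,  y = d*col + e*row + f."""
    a, b, c, d, e, f = transform
    return a * col + b * row + c, d * col + e * row + f

def tree_location(box_center, img_w, img_h, transform):
    """Geographic location of a detection from its normalized box center."""
    xc, yc = box_center
    return pixel_to_geo(xc * img_w, yc * img_h, transform)
```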

Conclusion
The objectives of this study were to utilize YOLOv7 to accurately detect coconut trees from orthophotos and to locate the position of each tree. With the model's implementation, a detection accuracy of 98% was attained, and localization accuracy averaged 87.35%. YOLOv7's outstanding accuracy in validation testing and localization underlines its potential as a significant tool for coconut tree detection. With such high precision, farmers may make informed judgments about crop management, resource allocation, and intervention techniques. YOLOv7's performance in coconut detection demonstrates the progress of computer vision techniques and deep learning algorithms. It is crucial to highlight, however, that such high accuracy necessitates a well-annotated and diversified training dataset along with thorough model construction and optimization. This technology holds significant promise for the coconut industry, allowing farmers to manage plantations and optimize resources successfully.

E3S Web of Conferences 488, 03015 (2024). AMSET2023. https://doi.org/10.1051/e3sconf/202448803015488

Fig. 4.
Fig. 4. The orthomosaic of the Tiaong study area generated using Agisoft Metashape software, with snippets from the 431 raw images used.

Fig. 7.
Fig. 7. YOLOv7 training and testing results shown as a confusion matrix.

Table 1.
Testing Result of the YOLOv7 on the Testing Areas.

Table 2.
Sample Localization Testing of Coconut Trees from Orthomosaic.