| Issue | E3S Web Conf., Volume 645, 2025: The 1st International Conference on Green Engineering for Sustainable Future (ICoGESF 2025) | |
|---|---|---|
| Article Number | 01003 | |
| Number of page(s) | 10 | |
| Section | Green Engineering and Sustainable Technologies | |
| DOI | https://doi.org/10.1051/e3sconf/202564501003 | |
| Published online | 28 August 2025 | |
Multimodal AI framework for Indonesian butterfly classification using vision-language models and RAG-based reasoning in green engineering applications
1 Department of Information Technology Education, Faculty of Engineering, Universitas Negeri Surabaya, 60231, Surabaya, Indonesia
2 Graduate Institute of Network Learning Technology, National Central University, 320314, Taoyuan, Taiwan
3 Department of Electronics and Informatics Engineering Education, Faculty of Engineering, Universitas Negeri Yogyakarta, 55281, Yogyakarta, Indonesia
Abstract
Biodiversity loss in ecologically rich regions such as Indonesia underscores the need for sustainable, scalable species monitoring systems. While prior studies have explored deep learning and vision-language models for biological classification, most focus on generic benchmarks or high-resource environments and often lack structured, domain-specific output. To address this gap, this study proposes a lightweight multimodal AI framework that classifies Indonesian butterfly species using vision-language reasoning and retrieval-augmented generation. The motivation is to enable accurate and interpretable ecological monitoring in resource-constrained settings. The system accepts image input via a mobile-responsive interface, processes it through GPT-4 Vision, and outputs six structured attributes: English name, Indonesian name, scientific name, butterfly family, population location, and endangered level. A total of 120 classification sessions were conducted using curated images of both Indonesian and non-Indonesian butterflies. Results show an overall accuracy of 85%, with high field completeness (mean: 4.58 out of 6), consistent reasoning across image quality levels, a low hallucination rate, and low latency. These findings support the system's viability for near-real-time classification and ecological reporting. The framework supports sustainable AI deployment for biodiversity conservation and offers a replicable model for domain-specific species monitoring in developing regions.
Key words: Artificial Intelligence (AI) / Sustainable Technology / Green Engineering / Vision-Language Models / Biodiversity Monitoring
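The six-attribute output and the field completeness score described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `ButterflyRecord`, the field names, the helper functions, and the rule that an empty or whitespace-only value counts as missing are all assumptions made for illustration, and the sample values are placeholders rather than results from the study.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ButterflyRecord:
    """Hypothetical container for the six attributes named in the abstract."""
    english_name: Optional[str] = None
    indonesian_name: Optional[str] = None
    scientific_name: Optional[str] = None
    family: Optional[str] = None
    population_location: Optional[str] = None
    endangered_level: Optional[str] = None


def field_completeness(record: ButterflyRecord) -> int:
    """Count how many of the six fields hold a non-empty value (0 to 6).

    Assumption: None, empty, and whitespace-only strings count as missing.
    """
    count = 0
    for f in fields(record):
        value = getattr(record, f.name)
        if value is not None and value.strip() != "":
            count += 1
    return count


def mean_completeness(records: List[ButterflyRecord]) -> float:
    """Average completeness over a batch of classification sessions."""
    if not records:
        raise ValueError("need at least one record")
    return sum(field_completeness(r) for r in records) / len(records)


# Illustrative session output with three of six fields filled in.
partial = ButterflyRecord(
    english_name="Common Birdwing",
    scientific_name="Troides helena",
    family="Papilionidae",
)
empty = ButterflyRecord()  # e.g. a session where no field was returned

print(field_completeness(partial))
print(mean_completeness([partial, empty]))
```

Under this definition, a mean of 4.58 out of 6 would mean that a session returned between four and five populated fields on average.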
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

