Issue | E3S Web of Conf., Volume 531, 2024: Ural Environmental Science Forum “Sustainable Development of Industrial Region” (UESF-2024) | |
---|---|---|
Article Number | 03009 | |
Number of page(s) | 6 | |
Section | Mathematical Modelling of Energy Systems | |
DOI | https://doi.org/10.1051/e3sconf/202453103009 | |
Published online | 03 June 2024 |
Implementation of data parsing technology using neural network and web driver
1 Siberian University of Science and Technology, 660037, Krasnoyarsk, Russia
2 Siberian Federal University, 660041, Krasnoyarsk, Russia
Data parsing is typically used to quickly obtain information from web resources for further study and use. Both specialized online services and desktop applications can be used for parsing. However, existing parsing technologies have limitations: it is often difficult to parse dynamic web pages and to classify the information obtained. New approaches to data collection and analysis are needed, combining language models with software (a web driver) that simulates human actions when working with websites. The web driver provides access to data on dynamically updated sites, while artificial intelligence technologies help recognize and classify the data correctly. This technology can be used to create parsers for real estate agencies, employment services, university admission committees, advertising campaigns, and financial organizations.
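The two-stage idea described above (a web driver to reach dynamically rendered content, followed by a classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes Selenium with Chrome as the web driver, and the names `fetch_dynamic_texts`, `keyword_classifier`, `parse_and_classify`, and the `KEYWORDS` table are hypothetical. The keyword matcher is a deliberately simple stand-in for the neural network; any callable that maps text to a label could be passed in its place.

```python
from typing import Callable, Dict, List


def fetch_dynamic_texts(url: str, css_selector: str, timeout: float = 10.0) -> List[str]:
    """Load a JavaScript-rendered page in a browser and return matching element texts."""
    # Imported lazily so the classification part runs without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and a matching driver are available
    try:
        driver.get(url)
        # Wait until the dynamically inserted elements are present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, css_selector))
        )
        return [el.text for el in driver.find_elements(By.CSS_SELECTOR, css_selector)]
    finally:
        driver.quit()  # always release the browser


# Hypothetical keyword lists: a placeholder for a trained model, not a real one.
KEYWORDS: Dict[str, List[str]] = {
    "real_estate": ["apartment", "rent", "bedroom"],
    "vacancy": ["salary", "hiring", "full-time"],
}


def keyword_classifier(text: str) -> str:
    """Return the label whose keywords occur most often, or 'other' if none match."""
    lowered = text.lower()
    scores = {
        label: sum(word in lowered for word in words)
        for label, words in KEYWORDS.items()
    }
    best = max(scores, key=lambda label: scores[label])
    return best if scores[best] > 0 else "other"


def parse_and_classify(
    texts: List[str], classify: Callable[[str], str] = keyword_classifier
) -> List[Dict[str, str]]:
    """Attach a label to every scraped text fragment."""
    return [{"text": t, "label": classify(t)} for t in texts]
```

With a browser available, a call such as `parse_and_classify(fetch_dynamic_texts("https://example.com/listings", ".listing"))` would chain the two stages; the URL and selector here are placeholders. Keeping the classifier as a plain callable means a language model can replace the keyword matcher without changing the scraping code.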
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.