Issue | E3S Web Conf., Volume 522, 2024. 2023 9th International Symposium on Vehicle Emission Supervision and Environment Protection (VESEP2023) |
---|---|
Article Number | 01046 |
Number of page(s) | 8 |
DOI | https://doi.org/10.1051/e3sconf/202452201046 |
Published online | 07 May 2024 |
Annotation method of risk data in a certain field based on pattern matching
Military Science Information Research Center, Academy of Military Sciences, China
* Corresponding author: zhaoyingxiao123@126.com
With the development of information technology and the growing complexity of industrial technology, a certain field urgently needs big data and artificial intelligence to improve its management and decision-making. To classify the field's risk text data with intelligent algorithms, and thereby analyse the risk distribution and the major problems, this paper studies methods for annotating training data in this field. The proposed annotation method is based on pattern matching and addresses the particular difficulties of risk data annotation in this field: strong professionalism, small data volume, and high requirements for accuracy and timeliness. New matching patterns are generated through text segmentation, keyword extraction, preliminary pattern generation, pattern relation tree construction, pattern optimization, pattern generalization, and pattern verification; the final classification and annotation are performed after pattern matching. Performance tests on accuracy, recall, and annotation time show that the overall performance of the proposed method exceeds that of both traditional item-by-item manual annotation and semi-automatic annotation based on machine learning. The method has strong application value for risk data annotation in this field and can also serve as a reference for high-density, high-accuracy, and high-timeliness data annotation in other fields.
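The pipeline named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It covers only segmentation, keyword extraction, preliminary pattern generation, matching, and a recall check, and it omits the relation tree, optimization, generalization, and verification steps. The stopword list, the labels, the sample texts, and every function name (`segment`, `extract_keywords`, `build_patterns`, `annotate`, `recall`) are hypothetical, and a pattern is assumed here to be a plain set of keywords.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was", "at", "during"}

def segment(text):
    """Text segmentation: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def extract_keywords(texts, min_df=2):
    """Keyword extraction: keep non-stopword tokens found in at least min_df texts."""
    df = Counter(
        tok for t in texts for tok in set(segment(t)) if tok not in STOPWORDS
    )
    return {word for word, count in df.items() if count >= min_df}

def build_patterns(seed_examples, min_df=2):
    """Preliminary pattern generation: one keyword-set pattern per label."""
    return {
        label: frozenset(extract_keywords(texts, min_df))
        for label, texts in seed_examples.items()
    }

def annotate(text, patterns, default="unlabeled"):
    """Pattern matching: return the label whose pattern shares the most keywords."""
    tokens = set(segment(text))
    best_label, best_score = default, 0
    for label, keywords in patterns.items():
        score = len(keywords & tokens)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def recall(gold, predicted, label):
    """Recall for one label: true positives / (true positives + false negatives)."""
    relevant = [p for g, p in zip(gold, predicted) if g == label]
    if not relevant:
        return 0.0
    return sum(p == label for p in relevant) / len(relevant)

# Hypothetical seed data: a few manually labelled risk texts per category.
seed = {
    "equipment": ["Engine failure during test", "Engine overheating caused failure"],
    "supply": ["Delayed delivery of parts", "Parts delivery delayed again"],
}
patterns = build_patterns(seed)
labels = [annotate(t, patterns) for t in
          ["Engine failure reported at depot", "Parts delivery is late", "Weather was fine"]]
```

Texts that match no pattern fall back to `unlabeled` rather than being forced into a category, which leaves them for manual review; this is a design choice of the sketch, not something the abstract specifies.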
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.