E3S Web Conf.
Volume 284, 2021
Topical Problems of Green Architecture, Civil and Environmental Engineering (TPACEE-2021)
Number of page(s): 9
Section: IT and Environmental Risk Management
Published online: 12 July 2021
Analytical method for selection an informative set of features with limited resources in the pattern recognition problem
1 Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, 100200, 108 Amir Temur main street, Tashkent, Uzbekistan
2 Nukus branch of Tashkent state agrarian university, 230100, Nukus Abdambetov st., Nukus, Uzbekistan
3 Karakalpak State University named after Berdakh, 742000, Acad. Abdirov st. 1., Nukus, Uzbekistan
* Corresponding author: firstname.lastname@example.org
Feature selection is one of the most important issues in data mining and pattern recognition. A correctly selected feature or set of features ultimately determines the success of subsequent work, in particular the solution of classification and forecasting problems. This work is devoted to the development and study of an analytical method for determining informative attribute sets (IAS) under a resource constraint, for criteria based on a scatter measure of the classified objects. The domains in which a solution exists are determined. Statements and properties are proved for a Fisher-type informativeness criterion, under which the proposed analytical method for determining IAS guarantees results that are optimal in the sense of maximizing the selected functional. The relevance of choosing this type of informativeness criterion is substantiated. The universality of the method with respect to the type of features is shown. An algorithm implementing the method is presented. In addition, the paper discusses the growth dynamics of information worldwide, the problems associated with big data, and the problems and tasks of data preprocessing. The relevance of reducing the dimension of the attribute space, so that data can be processed and visualized without unnecessary difficulty, is substantiated. The disadvantages of existing methods and algorithms for choosing an informative set of attributes are shown.
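To make the idea of a Fisher-type informativeness criterion concrete, the following is a minimal Python sketch, not the authors' analytical method. The abstract does not state the functional, so the form used here is an assumption: for two classes, each feature j gets a between-class term a_j (squared difference of class means) and a within-class term b_j (sum of class variances), and a subset is scored by the ratio of the summed a_j to the summed b_j. The resource constraint is assumed to be a fixed number of features, ell. The function names `fisher_terms` and `best_subset` are hypothetical, and the exhaustive search stands in for the analytical selection the paper describes.

```python
from itertools import combinations

import numpy as np


def fisher_terms(X, y):
    """Per-feature between-class (a) and within-class (b) scatter, two classes.

    Assumed form: a_j = (mean0_j - mean1_j)^2, b_j = var0_j + var1_j.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    X0, X1 = X[y == classes[0]], X[y == classes[1]]
    a = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    b = X0.var(axis=0) + X1.var(axis=0)  # population variance (ddof=0)
    return a, b


def best_subset(a, b, ell):
    """Exhaustively find the ell-feature subset maximizing sum(a) / sum(b).

    Brute force is used only for illustration; it is exponential in the
    number of features and is NOT the paper's analytical procedure.
    """
    best_score, best_idx = -np.inf, None
    for idx in combinations(range(len(a)), ell):
        cols = list(idx)
        denom = b[cols].sum()
        if denom <= 0:
            continue  # ratio undefined when there is no within-class scatter
        score = a[cols].sum() / denom
        if score > best_score:
            best_score, best_idx = score, idx
    return best_idx, best_score


# Tiny hand-made data set: feature 0 separates the classes strongly,
# feature 1 carries no class information, feature 2 separates them weakly.
X_demo = [[0, 0, 0], [2, 1, 1], [10, 0, 1], [12, 1, 2]]
y_demo = [0, 0, 1, 1]
a_demo, b_demo = fisher_terms(X_demo, y_demo)
subset, score = best_subset(a_demo, b_demo, ell=2)
print(subset, score)
```

On this toy data the search keeps features 0 and 2 and drops the uninformative feature 1. An analytical method of the kind the abstract describes would aim to reach the maximizing subset without enumerating all combinations.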
© The Authors, published by EDP Sciences, 2021
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.