Indicative assessment method of the public perception of environmental marketing ideas

Natural Language Processing (NLP) is a machine learning approach grounded in mathematical linguistics that can identify trends in public opinion. The article analyzes the possibility of applying Latent Dirichlet Allocation (LDA) and NLP methods to identify growing public interest in the problems of ecology and environmental conservation. Such an analysis provides a basis for manufacturers' eco-marketing decisions. Reorienting production towards green goods, introducing new ecological products to the market, and promoting energy-saving technologies require significant investment. To earn a return, a manufacturer needs to capture steady demand for ecologization from contractors and consumers. The article offers a comparative analysis of obtaining information by classical methods (for example, surveys) and by machine learning methods. The most important data sources are selected on the basis of their popularity, public attention and the number of individuals participating in the discourse. The authors have developed key categories and keywords with which Russian society associates the perception of environmental marketing. The result of Natural Language Processing is presented as an assessment of public perception of ecological marketing ideas.


Introduction
Environmental problems around the world vary in nature: depletion of natural resources, environmental pollution as a result of industrial activities, storage of various types of waste, etc. WHO notes the connection between environmental pollution and deteriorating indicators of public health. Due to the negative environmental situation, morbidity and mortality from cardiovascular and oncological diseases are increasing.
Accordingly, there is public demand to protect and clean up the environment and to consume goods and services that meet environmental requirements. There is also a discourse about the individual's personal environmental responsibility. Environmental projects exist at various levels and in various directions, ranging from local exchange initiatives and regional water treatment projects to large all-Russian projects.
Ideas such as separate waste collection, rejection of disposable goods where possible, reuse of things and packaging, and responsible disposal of end-of-life goods create fertile ground for the development of environmental marketing, which is based on the principle of satisfying consumers' interests by promoting environmentally friendly goods and services.
Environmental marketing is a complex system. Therefore, to bring a product to the market, plan sales, develop proposals for product promotion, etc., it is necessary to take into account how society perceives environmental marketing. This perception may differ between target groups, and these differences also need to be taken into account.
The production of environmentally friendly products requires significant investments from the manufacturer: transition to environmentally friendly types of energy, installation of treatment facilities, control of harmful emissions, acceptance of discontinued products for disposal, etc. As for food products, "green" products require more expensive storage and transportation conditions and have a shorter shelf life.
Accordingly, environmentally friendly goods and eco products have a higher cost. Organic food products are good for health and, as a rule, have better taste characteristics, so the consumer feels satisfaction immediately. However, budget constraints also influence consumer choice.
Unlike food, environmentally friendly household goods do not have an immediate effect when consumed. Here a buyer must be sure that buying a more expensive eco-friendly product with similar properties contributes to the preservation of the environment. Therefore, a manufacturer should catch a stable trend towards eco-consumption once it has arisen and become entrenched in a certain group of society. As a rule, such a trend arises and finds a response among active Internet users, so it can be tracked through an increase in statements on the topic.
Identifying the factors of consumer readiness to accept environmentally friendly products makes it possible to predict the growth of demand, organize an advertising campaign in time, saturate the market with appropriate eco-friendly products, change packaging and conduct a recycling campaign, increasing not only the company's income but also customer loyalty to the brand.
Among the main approaches to measuring the socio-economic expectations of the population (including the factors of acceptance of marketing campaigns), two are currently used: analysis using surveys of socio-economic agents and analysis of high-frequency stock indicators, taking into account the totality of forecasts of financial market participants regarding various economic, social and political factors.
Indicators of public perception of environmental marketing based on opinion polls of households, organizations and experts are valuable both for making objective forecasts of the dynamics of key macroeconomic indicators in future periods and for predicting further social and economic decisions of the population. In analyzing survey results, two kinds of measures are of great importance: quantitative estimates of expectations (for example, the proportion of respondents expecting a change in the dynamics of the target parameter, or its median estimate) and estimates of the uncertainty of expectations (the proportion of respondents who are not ready to answer the question, and the variance of the respondents' answers). However, this method has several disadvantages:
- significant financial and organizational costs of conducting surveys;
- instability of respondents' answers, which depend on the wording of the question, the properties of the sample, and the methods used to identify outliers in the collected data;
- low frequency of obtaining survey results.
Methods for measuring public expectations based on financial market data do not share the main disadvantages of survey-based methods: data on transactions with financial instruments are available in real time, no significant interview costs are incurred, and the interpretation of indicator values, as a rule, does not vary significantly, because the algorithms for constructing them are unambiguous.
However, this method also has some negative features:
- it is focused on a small group of experts participating in financial markets, so the representativeness of the study is limited;
- when constructing perception indicators based on risk assessment in the trading of financial instruments, the researcher should take into account that changes in financial indicators only indirectly reflect specific areas of the socio-economic expectations of the population, and should add expert judgment coefficients to the calculation of the final indicator;
- the data obtained from the analysis of financial markets reflect trajectories of expectations that are close to the real ones only when trading volumes for the selected financial instruments are sufficiently large.
Thus, it is necessary to develop fundamentally new methods of identifying the needs of the population that do not have these key disadvantages, and to study how they can be applied. The authors of the article propose a method for constructing an indicator of public perception of environmental marketing ideas based on modern methods of machine learning and the processing of large amounts of text information.
Natural language processing (NLP) is a field of knowledge that emerged at the intersection of machine learning and mathematical linguistics and studies methods for analyzing natural language. NLP methods have recently found application in many areas of knowledge, including research related to the construction of economic indicators of various kinds.
Let us highlight the works on assessing the impact of news on the dynamics of economic, social and political expectations, which are based on natural language processing methods.
Bauer [2] shows the significant influence of macroeconomic news on the dynamics of inflationary expectations. Dräger, Lamla and Pfajfar [3] assess the impact of the information background on the aggregated results of monthly surveys of the population regarding their economic expectations. The information background is measured by the number of news items on monetary policy and key macroeconomic events and trends.
There are also a number of studies that use natural language processing and machine learning methods to create economic indicators.
One of the first articles in this area was the research of Baker, Bloom and Davis [4]. A frequency-based text processing method was applied on the basis of economic articles of a selected number of large media outlets in order to build uncertainty indices of the general economic situation and the political situation for different countries. The approach to constructing the index was based on calculating the frequency of occurrence of the word "uncertainty" in the database of economic news texts compiled by the authors. The final indicators turned out to be significantly correlated with the main macroeconomic indicators of the countries under consideration. For example, the uncertainty index increases during periods of economic crises and significant political events (approaching elections, etc.). Moreover, at the micro level, there is a significant relationship between the dynamics of the constructed indicators and the prices of shares of enterprises, their investment decisions and labor market policies.
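The frequency-based idea described above can be illustrated with a minimal sketch. It is not the exact construction used in [4]: here the index is simply assumed to be the share of articles in each period that mention the target word at least once, and the corpus, the period labels and the function name `term_frequency_index` are hypothetical.

```python
import re
from collections import defaultdict


def term_frequency_index(articles, term="uncertainty"):
    """Share of articles per period that mention `term` at least once.

    `articles` is an iterable of (period, text) pairs. This is an assumed,
    simplified reading of a frequency-based index, not the method of [4].
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    totals = defaultdict(int)  # number of articles per period
    hits = defaultdict(int)    # number of articles mentioning the term
    for period, text in articles:
        totals[period] += 1
        if pattern.search(text):
            hits[period] += 1
    return {period: hits[period] / totals[period] for period in sorted(totals)}


# Hypothetical toy corpus, for illustration only.
articles = [
    ("2020-01", "Markets face uncertainty ahead of the vote."),
    ("2020-01", "Retail sales grew steadily."),
    ("2020-02", "Policy uncertainty weighs on investment."),
    ("2020-02", "Uncertainty over tariffs persists."),
]
index = term_frequency_index(articles)
```

In practice such an index would be normalized and smoothed before being compared with macroeconomic series; those steps are omitted here.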
In order to predict the parameters of business cycles, Thorsrud [5] uses indices of the probability of occurrence of the main topics of economic news, identified with the help of a classical machine learning approach to probabilistic topic modeling of text information, Latent Dirichlet Allocation (LDA). Larsen and Thorsrud [6][7][8] show that the indices of the probability of occurrence of certain topics are statistically significant for improving the prediction of the economic environment.
Among Russian authors, it is necessary to point out Yakovleva [8], who applies the Thorsrud algorithm [5], with elements of the approach of Goloshchapova and Andreev [8] and her own modifications, to the news data of one of the major news media in order to predict the PMI index for Russia. In particular, similarly to the approach of Goloshchapova and Andreev [9], the author tries to determine the emotional coloring of news and uses supervised learning on a pre-labeled sample of texts, together with probability-based sampling, in order to identify the class of "neutral" news.

Material and methods
The proposed method is based on the hypothesis that the intensity of discussion by readers of articles and blogs that raise topics of ecology, pollution, respect for the environment will indicate the intensity of socio-economic expectations, concern about potential future events, the perception of environmental measures, possible participation in events aimed at environmental protection.
The main sources of information for the proposed method of indicative assessment in the Russian Federation are news and articles in the media, users' comments, and selections from news aggregators on topics related to the perception of environmental problems by the population. To identify the largest and most significant publications, we used the open ratings of the Medialogia company; to assess the most significant events in the media, the news aggregators Yandex.News and Google News; and to include a wide range of opinions, Yandex.Blogs.
When selecting articles, blogs and comments for further analysis, we kept only publications relevant to the research topic, that is, only those in which the object of perception is discussed. For this, we carried out an expert analysis of the content of target publications and chose 11 categories of factors and keywords associated with them (Table 1). The public perception of environmental marketing ideas is closely connected with them. In the study, we took a three-year period for constructing indicators, from January 2018 to December 2020, which made it possible to highlight both seasonal fluctuations and long-term trends. The indicators were updated weekly. At the same time, it should be noted that the method is not tied to a fixed study date: it implies regular updates to monitor the analyzed indicator, at a higher frequency if this is necessary for making management decisions.
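The relevance filter described above can be sketched as a keyword match against category lists. The two categories and their English keywords below are hypothetical placeholders, not the 11 categories of Table 1, and plain substring matching is an assumed simplification of the expert selection.

```python
# Hypothetical categories and keywords, for illustration only;
# the study itself uses 11 expert-defined categories (Table 1).
CATEGORY_KEYWORDS = {
    "waste": ("landfill", "garbage", "recycling"),
    "disasters": ("spill", "emissions", "pollution"),
}


def match_categories(text, category_keywords=CATEGORY_KEYWORDS):
    """Return the sorted list of categories whose keywords occur in the text."""
    lowered = text.lower()
    return sorted(
        category
        for category, keywords in category_keywords.items()
        if any(keyword in lowered for keyword in keywords)
    )


def filter_relevant(documents, category_keywords=CATEGORY_KEYWORDS):
    """Keep only documents that match at least one category."""
    relevant = []
    for doc in documents:
        categories = match_categories(doc, category_keywords)
        if categories:
            relevant.append((doc, categories))
    return relevant


documents = ["New landfill opens after fuel spill", "Football results"]
relevant = filter_relevant(documents)
```

Substring matching is crude (it also matches inside longer words), so a real pipeline would more likely match on preprocessed tokens.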
We have chosen topic modeling as a machine learning approach for natural language processing, which is one of the classic and well-developed ways for automated work with significant amounts of text data in order to highlight clusters of keywords and phrases in the analyzed data volume.
One of the most frequently used and best-developed topic modeling algorithms is Latent Dirichlet Allocation (LDA). As shown in [1], the LDA algorithm has the following key advantages:
- it can be applied to collections of documents so large that expert processing of them would be very difficult;
- researchers obtain a reliable and reproducible classification of topics for textual data; once trained on a certain set of texts, the model can be applied to any collection of documents, including new texts;
- topics and their probabilistic relationships with keywords are determined by searching for the optimal statistical model in a given corpus of text information.
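For reference, the generative model that LDA fits can be written as follows. This is the standard textbook formulation; the symbols (K topics, D documents, N_d words in document d, Dirichlet hyperparameters alpha and beta) are conventional notation assumed here, not notation taken from this article.

```latex
\begin{aligned}
\varphi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K
  && \text{(word distribution of topic } k\text{)} \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D
  && \text{(topic distribution of document } d\text{)} \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d
  && \text{(topic of the } n\text{-th word)} \\
w_{d,n} \mid z_{d,n}, \varphi &\sim \mathrm{Categorical}(\varphi_{z_{d,n}})
  &&&& \text{(observed word)} \\[4pt]
p(w \mid d) &= \sum_{k=1}^{K} \varphi_{k,w}\,\theta_{d,k}
\end{aligned}
```

The last line gives the probability of a word in a document as a mixture over topics, which is the "probabilistic relationship with keywords" referred to above.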
Before topic modeling is carried out, the text documents must be preprocessed. This process has the following steps:
- basic text filtering, that is, converting alphabetic characters to lower case and removing characters that are not important for determining topics. Typically, such extra characters include non-linguistic characters, non-alphanumeric characters, punctuation marks, repeated spaces and single-letter tokens;
- tokenization, or division of the continuous text of documents into separate words;
- removal of stop words, that is, words that do not carry a significant semantic load for determining topics. Lists of classic stop words for the Russian language are regularly updated by linguists and are publicly available; they include prepositions, introductory constructions, conjunctions, pronouns, evaluative definitions, adverbs, etc.;
- stemming, or reducing each word to its stem (base form).
Based on the results of topic modeling, the resulting topics are ranked, or the topics most important for the indicator of public perception of environmental marketing ideas are selected, on the basis of the final probability of each topic appearing in different periods of time.
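The preprocessing steps listed above (filtering, tokenization, stop-word removal, stemming) can be sketched as a small pipeline. The stop-word set and the suffix list are hypothetical and deliberately tiny, and the suffix stripper is only a stand-in for a real Russian stemmer; the sketch also drops digits, which is a simplification.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"и", "в", "на", "не", "что", "это", "как", "по", "для"}

# Toy suffix list standing in for a real stemmer (an assumption for illustration).
SUFFIXES = ("ами", "ями", "ах", "ов", "а", "ы", "и", "е")


def basic_filter(text):
    """Lowercase and replace every run of non-letter characters with a space."""
    return re.sub(r"[^a-zа-яё]+", " ", text.lower()).strip()


def tokenize(text):
    """Split filtered text into words on whitespace."""
    return text.split()


def remove_stop_words(tokens):
    """Drop single-letter tokens and tokens from the stop-word list."""
    return [t for t in tokens if len(t) > 1 and t not in STOP_WORDS]


def toy_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the four steps in order and return the list of stems."""
    return [toy_stem(t) for t in remove_stop_words(tokenize(basic_filter(text)))]


tokens = preprocess("Вывоз мусора в регионах!!!")
```

The resulting token lists are what a topic model would be trained on; a production pipeline would replace `toy_stem` with an established stemmer or lemmatizer.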
The calculation of the indicator of public perception of environmental marketing ideas was made as the ratio of the number of comments, filtered as showing readiness to accept, to the total number of text documents with the mention of environmental categories for each time period.
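A minimal sketch of this ratio, computed per week, is given below. The input format (a date plus two boolean flags per document) and the function name `weekly_indicator` are assumptions made for illustration; in the method itself the flags would come from the keyword filtering and topic modeling steps.

```python
from collections import defaultdict
from datetime import date


def weekly_indicator(documents):
    """Compute the weekly ratio described above.

    `documents` is an iterable of (day, mentions_environment, shows_readiness)
    tuples (an assumed input format). Returns {(iso_year, iso_week): ratio},
    where the ratio is the number of documents flagged as showing readiness
    to accept divided by the number of documents mentioning environmental
    categories in that week.
    """
    accepted = defaultdict(int)
    total = defaultdict(int)
    for day, mentions_environment, shows_readiness in documents:
        if not mentions_environment:
            continue  # only documents mentioning environmental categories count
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        total[week] += 1
        if shows_readiness:
            accepted[week] += 1
    return {week: accepted[week] / total[week] for week in sorted(total)}


# Hypothetical toy data, for illustration only.
docs = [
    (date(2018, 1, 1), True, True),
    (date(2018, 1, 3), True, False),
    (date(2018, 1, 8), True, True),
    (date(2018, 1, 9), False, False),
]
indicator = weekly_indicator(docs)
```

Weeks with no relevant documents are simply absent from the result, so a plotting step would need to decide how to treat such gaps.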

Results
The indicator of public perception of environmental marketing ideas was calculated as the ratio of the number of comments filtered as mentioning environmental issues to the total number of media articles mentioning the growth of interest in the environment on topics similar to those in the comments, for each time period. Media articles and Internet users' comments were selected using the rules described above. Weekly data were used as the main periodization to plot the indicator dynamics over time (Fig. 1).
Fig. 1. A weekly indicator of public acceptance of green marketing ideas from January 2018 to December 2020, based on text data from articles and blogs.
During the entire period under review, the Russian population has demonstrated a surge of environmental discussions mainly in connection with: environmental disasters in the Russian Federation (32% of messages from the total number of messages classified in the selected categories), garbage collapse (21%), an increase in utility tariffs for the removal of solid waste (17%), the opening of new landfills in the regions (13%), global environmental disasters (11%).
In addition, there has been a reaction to social environmental projects (10%), environmentally responsible behavior of public figures (8%), and information on climate change and global warming (5%).
At the same time, the Russian population has shown the least reaction (only 3%) to messages about saving water and energy resources and the successful processing of garbage in Western countries (less than 2%).

Discussion
It is necessary to draw conclusions about the indicators by which the Russian population demonstrates the growth of environmental responsibility and interest in environmentally responsible consumption:
- environmental disasters that occurred for various reasons: the growth of, and subsequent long-term interest in, the topic of ecology at the beginning of 2019 was associated with the fall of black snow in Kuzbass due to the increase in emissions from Kuzbass concentration plants and with the emergence of information about the consequences of the spill of radioactive waste in Yakutia at the uranium mine "Dalura"; public interest in ecology in the fall of 2020, despite the general focus of public interest on the Covid-19 pandemic, was associated with the pollution of ocean waters in Kamchatka. The leakage of toxic substances (diesel fuel) in Norilsk in May 2020 produced only an insignificant peak on the graph, since significant efforts were made to reduce public interest in this topic; besides, summer in the Russian Federation is a traditional period of informational lull;
- municipal waste processing and garbage disposal: the peaks on the graph reflect the reaction to the outbreak of protests against the construction of a landfill at Shies station in the Arkhangelsk region for storing solid household and industrial waste exported from Moscow. There was also a noticeable peak in response to publications about the "garbage riot" in February 2018 in Volokolamsk, Moscow region, in connection with rallies against the overflowing "Yadrovo" landfill;
- local environmental activity: it is impossible to distinguish peaks associated with this indicator on the general graph; its elaboration requires further research including a regional component.

Conclusion
The developed methodology for constructing high-frequency indicators of the growth of public interest in environmental problems in the Russian Federation can support economic entities in deciding when to activate eco-marketing.
A key possible direction for further study is applying the developed methodology at the regional level to construct high-frequency indicators of the growth of citizens' environmental consciousness and their readiness to consume eco-friendly goods. A surge of interest in the environment is provoked by regional environmental projects and the activities of local eco-activists, and the regions of the Russian Federation have a number of specific features that require the development of separate approaches to environmental marketing.
One of the key and most problematic stages of applying the methodology to the regions of the Russian Federation is identifying the information sources relevant for a particular region and assessing the representativeness of the sample for analysis. When choosing sources of news articles, a researcher needs to analyze the popularity of media sources in each region based on media ratings and take into account the profile of popular sources using surveys of the local expert community. At the same time, the sources for studying the comments of Internet users include comments on articles on the official media websites and comments in social networks. It should be emphasized that the social network VKontakte, popular in the Russian Federation, contains both official and spontaneous news groups, and is also the main information field for eco-activists and local initiatives.