Research on Task Pricing Strategy of "Taking Photos to Make Money" APP

The "taking photos to make money" APP is a self-service model built on the mobile Internet, so studying its task pricing is of practical significance for improving the service. This paper first determines the main indexes that may affect task pricing and how to calculate them, then uses one-way analysis of variance to test whether each index variable has a significant effect on task pricing, and then uses multiple regression analysis to consider the joint influence of the factors on task pricing. The regression shows that member credibility has the weakest relationship with task pricing, so this index variable is removed. The relationship between the other three index variables and task pricing is then analyzed and the pricing equation is obtained. Finally, the unfinished tasks in the original data are selected, and the influence of the index variables on their pricing is examined by multiple regression analysis. The analysis indicates that tasks remain unfinished mainly because the shortest distance between members and tasks is not considered.


Introduction
With the continuous development of Internet technology, "taking photos to make money" has become a self-service model under the mobile Internet. This self-service crowdsourcing platform based on the mobile Internet greatly reduces survey costs, effectively guarantees the authenticity of survey data and shortens the survey cycle [1]. In this service mode, task pricing is the core of the platform. If the pricing is not reasonable, some tasks will be neglected, resulting in the failure of commodity inspection.

Determination of main indicator variables and their calculation methods
To study the task pricing rule of the platform, this paper takes task pricing as the dependent variable Y and, based on the original data, selects four independent variables that have a statistical relationship with Y: task density X1, member density X2, member credibility X3, and the shortest distance between member and task X4. X1 and X3 are obtained in a similar way. A circle of radius r is built with the task point as its center, and MATLAB is used to calculate the number of tasks $m_i$ in the region and the sum $h_i$ of the credibility of the members in the region. The number of tasks is taken as the task density $MD_i$ and the credibility sum as the member credibility $h_i$, that is: $MD_i = m_i$. For the member density, this paper again takes a task point as the center, constructs a circle of radius r, calculates the number of members in the region with MATLAB, and divides it by the corresponding area, namely:
$\text{member density of } A_i = n_i / s$

Table1. Index variable scale
where $n_i$ is the total number of members around task $A_i$ and $s$ is the area of the circular region. Because $s$ is the same for every task once r is fixed, $n_i$ can be used directly to represent the member density of task $A_i$. This paper found the radius r = 5 km to be the most reasonable. The shortest distance between member and task is calculated with the Euclidean distance algorithm, where $x_{1m}$ is the latitude of task $A_m$, $y_{1m}$ is the longitude of task $A_m$, $x_{2n}$ is the latitude of member $B_n$, $y_{2n}$ is the longitude of member $B_n$, and $d_{mn}$ is the distance between task $A_m$ and member $B_n$.
Therefore, the distance of task $A_m$ is: $d_m = \min_n \{ d_{mn} \}$.
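The indicator calculations described above can be illustrated with a short sketch. The paper performs them in MATLAB; the Python version below is only an illustration under stated assumptions: positions are assumed to be already converted to planar coordinates in kilometres, the task at the center is counted in its own task density, and the function name `indicators` and the sample data are hypothetical.

```python
import math

def indicators(task, tasks, members, r=5.0):
    """Return (X1, X2, X3, X4) for one task point.

    task    : (x, y) position of the task in km (assumed planar coordinates)
    tasks   : list of (x, y) positions of all tasks, including `task`
    members : list of (x, y, credibility) tuples
    r       : radius of the circular region in km
    """
    tx, ty = task
    area = math.pi * r ** 2  # area s of the circular region

    # X1: number of tasks m_i inside the circle (assumption: the task
    # at the center is counted as well).
    m_i = sum(1 for (x, y) in tasks if math.hypot(x - tx, y - ty) <= r)

    # Members that fall inside the circle.
    inside = [c for (x, y, c) in members if math.hypot(x - tx, y - ty) <= r]

    n_i = len(inside)   # number of members around the task
    h_i = sum(inside)   # X3: sum of member credibility in the region

    # X4: shortest Euclidean distance from the task to any member.
    d_min = min(math.hypot(x - tx, y - ty) for (x, y, _) in members)

    return m_i, n_i / area, h_i, d_min

# Hypothetical data, for illustration only.
sample_tasks = [(0.0, 0.0), (3.0, 3.0), (10.0, 0.0)]
sample_members = [(1.0, 0.0, 10.0), (0.0, 2.0, 5.0), (6.0, 8.0, 20.0)]
x1, x2, x3, x4 = indicators(sample_tasks[0], sample_tasks, sample_members)
```

With the sample data, two tasks and two members lie within 5 km of the first task, so X1 = 2, X3 = 15 and X4 = 1 km. Since the area is constant for a fixed r, the member count could replace X2 without changing the ranking of tasks.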

The principle and analysis of one-factor analysis of variance
In essence, one-way ANOVA uses statistical inference to calculate the F statistic and conduct the F test [3]. The total sum of squared deviations is denoted as SST and is decomposed into two parts: one part is the between-group deviation caused by the different levels of an index variable, denoted as SSA; the other part is the within-group deviation at a single level of the index variable, denoted as SSE. That is:
$SST = SSA + SSE$ (2)
with
$SSA = \sum_{j=1}^{k} N_j (\bar{y}_j - \bar{y})^2$ (3)
where $k$ is the number of levels of the index variable, $N_j$ is the number of observations at level $j$, $\bar{y}_j$ is the mean price at level $j$, and $\bar{y}$ is the overall mean price. Thus the between-group sum of squares is the sum of squared deviations between the mean of each level and the overall mean, which reflects the influence of the different levels of the index variable on price [4] [5]. Further:
$SSE = \sum_{j=1}^{k} \sum_{l=1}^{N_j} (y_{jl} - \bar{y}_j)^2$ (4)
where $y_{jl}$ is the $l$-th price observed at level $j$. The within-group sum of squares is therefore the sum of squared deviations between each observation and the mean of its own group, reflecting the variation that remains at a single level of the index variable.
The F statistic is the ratio of the mean square between groups to the mean square within groups, and the calculation formula is:
$F = \frac{SSA / (k - 1)}{SSE / (N - k)}$ (5)
where $N$ is the total number of observations. From this formula it can be seen that if the different levels of an index variable have a significant impact on task pricing, the between-group sum of squares must be large and the F value will be relatively large [6]. Conversely, if the different levels of the index variable do not significantly affect task pricing, the within-group sum of squares dominates and the F value will be relatively small [2].
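The decomposition SST = SSA + SSE and the F ratio can be checked numerically with a small sketch. The paper obtains its ANOVA tables from SPSS; the Python function below is only an illustration of the standard one-way ANOVA arithmetic, and the function name `one_way_anova_f` and the price groups are hypothetical.

```python
def one_way_anova_f(groups):
    """Compute (SSA, SSE, F) for one-way ANOVA.

    groups: list of lists, one list of task prices per level of the
            index variable (for example, binned task density).
    """
    k = len(groups)                                  # number of levels
    n_total = sum(len(g) for g in groups)            # total observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    ssa = 0.0  # between-group sum of squares
    sse = 0.0  # within-group sum of squares
    for g in groups:
        group_mean = sum(g) / len(g)
        ssa += len(g) * (group_mean - grand_mean) ** 2
        sse += sum((y - group_mean) ** 2 for y in g)

    # Ratio of the between-group mean square to the within-group mean square.
    f_value = (ssa / (k - 1)) / (sse / (n_total - k))
    return ssa, sse, f_value

# Hypothetical prices at three levels of an index variable.
sample_groups = [[65, 66, 67], [70, 71, 72], [75, 76, 77]]
ssa, sse, f_value = one_way_anova_f(sample_groups)
```

For these groups the level means are 66, 71 and 76 around an overall mean of 71, giving SSA = 150, SSE = 6 and F = 75. The large F reflects group means that differ far more than the prices vary within each group.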
The following are the results output by SPSS:
Table2. One-way anova table of task density and task price
Table3. Single factor anova table of credit accumulation and task price
Table4. Single factor anova table of credit accumulation and task price
The SPSS calculation shows that the relationship between each of the four variables and the task price is significant, so all four variables are influencing factors of the task price.

Construction and processing results of multiple regression analysis model
Since the task price Y may be affected by task density X1, member density X2, member credibility X3 and the shortest distance between member and task X4, this paper needs to further discuss the parameter estimation and goodness-of-fit evaluation of the model.
Assuming that the random error term satisfies the Gauss-Markov assumptions, the multiple linear regression model of Y on the Xi can be constructed: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon$
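To make the parameter estimation step concrete, the following sketch fits a multiple linear regression by ordinary least squares using the normal equations. The paper fits its models in RStudio; this pure-Python version is only an illustration, and the function name `fit_linear_regression` and the synthetic data are hypothetical.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (A^T A) b = A^T y.

    X: list of rows of independent variables, y: list of responses.
    Returns [b0, b1, ..., bp], with b0 the intercept.
    Raises ZeroDivisionError if the design matrix is rank deficient.
    """
    A = [[1.0] + [float(v) for v in row] for row in X]  # add intercept column
    p = len(A[0])

    # Augmented matrix [A^T A | A^T y].
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * yi for a, yi in zip(A, y))]
         for i in range(p)]

    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, p):
            factor = M[row][col] / M[col][col]
            for c in range(col, p + 1):
                M[row][c] -= factor * M[col][c]

    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        s = sum(M[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (M[i][p] - s) / M[i][i]
    return beta

# Synthetic, noise-free data generated from y = 70 - 0.2*x1 - 0.05*x2.
sample_X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 2)]
sample_y = [70 - 0.2 * a - 0.05 * b for a, b in sample_X]
beta = fit_linear_regression(sample_X, sample_y)
```

Because the synthetic data contain no noise, the recovered coefficients match the generating values up to rounding. On real data a statistics package is preferable, since it also reports the t and F tests used in the significance analysis.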
The multiple linear regression model was built in RStudio, and the linear regression equation of Y on the four independent variables for the overall data was obtained as:
$y = 73.87 - 0.1531x_1 - 0.04361x_2 - 3.078 \times 10^{-5} x_3 - 0.001842x_4$ (6)
RStudio [7] is used for the significance test analysis, and the results are shown in figure 1:

Fig1. Results of the significance test of global multiple linear regression
The results show that the P values corresponding to the variables $x_1$, $x_2$, $x_3$ and $x_4$ are $2.13 \times 10^{-15}$, $3.58 \times 10^{-10}$, 0.00191 and $1.54 \times 10^{-5}$, respectively. Member credibility therefore has the weakest relationship with the task price, while task density, member density and the shortest distance between member and task are strongly related to the task price. The value of the F statistic is 99.22 with P < $2.2 \times 10^{-16}$ < 0.05, so the established regression equation is significant.
Since member credibility X3 is less significant than the other factors, a model is established with member credibility removed: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_4 X_4 + \varepsilon$. RStudio is used for the significance test analysis, and the results are shown in figure 2:

Fig2. Results of multivariate linear regression significance test after dimension reduction treatment
Calculated with RStudio, the linear regression equation of Y on the three independent variables for the overall data is:
$y = 73.903953 - 0.159596x_1 - 0.049827x_2 - 0.018505x_4$ (7)
which satisfies the significance test.
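As a worked example of equation (7), the sketch below evaluates the fitted price for one set of indicator values. The coefficients are those reported in (7); the function name `predicted_price` and the input values are hypothetical and chosen only for illustration.

```python
def predicted_price(x1, x2, x4):
    """Task price from equation (7).

    x1: task density, x2: member density,
    x4: shortest distance between member and task.
    """
    return 73.903953 - 0.159596 * x1 - 0.049827 * x2 - 0.018505 * x4

# Hypothetical indicator values, for illustration only.
price = predicted_price(x1=10, x2=20, x4=100)
print(round(price, 2))
```

For these inputs the predicted price is about 69.46. All three coefficients are negative, so under this fitted equation the price falls as task density, member density or the shortest distance increases.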
To find the reasons why tasks remain unfinished, the data were divided into completed tasks and uncompleted tasks.
① For the completed tasks, the multiple linear regression model of Y and Xi is constructed in the same way [8].
The significance test analysis was carried out with R, and the results are shown in figure 3:

Fig3. Results of multivariate linear regression significance test at the completion of the task
Calculated with RStudio, the linear regression equation of Y on the four independent variables for the completed tasks is:
$y = 74.16 - 0.1786x_1 - 0.04124x_2 - 0.01258x_4$ (8)
which satisfies the significance test. The output shows that the P values corresponding to the variables $x_1$, $x_2$, $x_3$ and $x_4$ are $9.82 \times 10^{-11}$, 0.000387, 0.006267 and 0.049360, respectively. Member credibility and the shortest distance between member and task are therefore less significant for the task price, while task density and member density are more significant. The value of the F statistic is 59.63 with P < $2.2 \times 10^{-16}$ < 0.05, so the established regression equation is significant.
Since member credibility is less significant than the other factors, it is removed, and the linear regression equation of Y on the three remaining independent variables is obtained as:
$y = 74.283745 - 0.184611x_1 - 0.050587x_2 - 0.014579x_4$ (9)
which satisfies the significance test. Similarly, the shortest distance between member and task, now the least significant factor, is removed, and the linear regression equation of Y on the two remaining independent variables is obtained as:
$y = 73.42451 - 0.17140x_1 - 0.04943x_2$ (10)
which satisfies the significance test.
② The uncompleted tasks were selected and the multiple linear regression model was built with R. The linear regression equation of Y on the four independent variables was obtained as:
$y = 72.36 - 0.1089x_1 - 0.0367x_2 - 0.00002441x_3 - 0.01638x_4$ (11)
The significance test analysis was carried out with R, and the results are shown in figure 4:

Fig4. Results of multivariate linear regression significance test analysis before completion of the task
The output shows that the P values corresponding to the variables $x_1$, $x_2$, $x_3$ and $x_4$ are $3.31 \times 10^{-5}$, $9.97 \times 10^{-6}$, 0.04202 and 0.00354, respectively. The value of the F statistic is 31.36 with P < $2.2 \times 10^{-16}$ < 0.05, so the established regression equation can be considered significant.
Then, removing in turn the least significant factors, first member credibility and then the shortest distance between member and task, gives the linear regression equations:
$y = 72.329440 - 0.113786x_1 - 0.041635x_2 - 0.015379x_4$ (12)
$y = 71.039164 - 0.069950x_1 - 0.047553x_2$ (13)
Both satisfy the significance test.
Based on the above analysis, for the overall data there is a significant relationship between task pricing and task density, member density, and the shortest distance between member and task, and the functional equation
$y = 73.903953 - 0.159596x_1 - 0.049827x_2 - 0.018505x_4$ (14)
is used to describe the law of task pricing. Observing the unfinished tasks shows that they are concentrated among the low-priced tasks, which is closely related to the credibility of members and the shortest distance between members and tasks. After the tasks are divided by completion status, there is a significant relationship between price and both task density and member density, while the relationship between price and member credibility and the shortest distance between member and task is weaker.
For this reason, the multiple regression model is used to find the functional relationship for the price separately for completed and uncompleted tasks:
Finished task: $y = 73.42451 - 0.17140x_1 - 0.04943x_2$ (15)
Unfinished task: $y = 71.039164 - 0.069950x_1 - 0.047553x_2$ (16)
Comparing the two functions shows that the two curves are very close and that the task price has a significant relationship with both task density and member density. When all four factors are considered, however, the unfinished tasks in the overall data are related to the shortest distance between member and task, and the effect is large: members take the location of a task into account, which leaves some tasks unfinished.
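The comparison of equations (15) and (16) can be made concrete by evaluating both at the same indicator values. The sketch below uses the reported coefficients; the function names and the sample inputs are hypothetical and serve only as an illustration.

```python
def price_finished(x1, x2):
    """Fitted price for completed tasks, equation (15)."""
    return 73.42451 - 0.17140 * x1 - 0.04943 * x2

def price_unfinished(x1, x2):
    """Fitted price for uncompleted tasks, equation (16)."""
    return 71.039164 - 0.069950 * x1 - 0.047553 * x2

def price_gap(x1, x2):
    """Difference between the two fitted prices at the same densities."""
    return price_finished(x1, x2) - price_unfinished(x1, x2)

# Hypothetical task and member densities, for illustration only.
for x1, x2 in [(0, 0), (10, 10)]:
    print(x1, x2, round(price_gap(x1, x2), 3))
```

Subtracting (16) from (15) gives a gap of 2.385346 - 0.10145 x1 - 0.001877 x2, so the two fitted prices differ by about 2.39 when both densities are zero and by about 1.35 at x1 = x2 = 10. The member density coefficients are nearly identical in the two equations, and most of the difference comes from the intercept and the task density coefficient.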

Conclusions
To study the pricing law of tasks in the original data, this paper first determined the main indexes that may affect task pricing and the calculation method of each index variable. MATLAB was then used to calculate each index variable, so that the values are in one-to-one correspondence with the task prices. One-way analysis of variance was used to determine whether each index variable causes a significant difference in the task price. The model was then improved, and the joint effect of multiple factors on task pricing was considered by multiple regression analysis. From this, it is concluded that the index variable member credibility has little correlation with task pricing. After removing member credibility, the relationship between the other three index variables and task pricing was analyzed and the pricing equation was obtained. Finally, the unfinished tasks in the original data were selected, and the influence of the index variables on their pricing was examined by multiple regression analysis. The analysis indicates that tasks remain unfinished mainly because the shortest distance between members and tasks is not considered.