The application of traffic data in developing Chinese light duty driving test cycle

Based on research into traffic flow models, this paper generates the Vehicle Hours Travelled (VHT) and the weighting factor matrix of different cities in different speed phases, using GIS whole road network data and low-frequency dynamic traffic data. The weighting factors are applied to generate a unified speed-acceleration distribution across cities for each speed phase. This distribution is used with a chi-square test to select candidate short trips and, together with the weighting factors, to determine the duration of each phase of the Chinese Light Duty Driving Test Cycle.


Introduction
The driving test cycle is a comprehensive reflection of the road environment, vehicle characteristics, traffic characteristics, climate characteristics, and other factors in a region, and it is the most important common basic technology in the automobile industry. China has used the European driving test cycle for many years, but it is seriously inconsistent with the actual operation of vehicles in China, so the Chinese Light Duty Driving Test Cycle (CLTC) needs to be developed. There are two main kinds of data acquisition methods for developing a driving test cycle. The first comprises the average traffic flow statistics method and the vehicle tracking method; these require the route and time of the vehicles to be planned in advance, so the data cannot be regarded as typical driving behavior. The second is the autonomous driving method, which places no restrictions on time or route, leaves the choice to the driver, and can therefore reflect actual driving behavior [1].
The CLTC is developed with the autonomous driving method. To ensure that the CLTC accurately represents typical driving characteristics under real road operation in China, a weighting factor matrix is used. Through a study of traffic flow models, this paper derives the weighting factor matrix from the speed-specific VHT (vehicle hours travelled) distribution of each typical city in China, using a large amount of floating vehicle data covering the whole road network. The matrix is used to select representative segments and to determine the phase durations when developing the CLTC.

Research on the traffic flow model
Conventional traffic flow survey methods, such as manual counting, automatic counting, the moving vehicle method, and video recording, place high demands on measurement and are costly, so it is difficult to carry out a comprehensive traffic flow survey on a large scale. Speed collection technology for the road network, however, is mature. With a traffic flow model, the traffic flow of the road network can therefore be obtained accurately from speed data.

Selection of the traffic flow model
The basic theory of traffic flow holds that the traffic flow q, the traffic speed u, and the traffic flow density k satisfy the following basic relationship:

$$q = u \cdot k \tag{1}$$

where q denotes the traffic flow, u the speed, and k the traffic flow density. Given the relationship between speed and traffic flow density, the traffic flow can therefore be calculated from speed data.
The formula above is also known as the Fundamental Diagram model. Speed, traffic flow, and traffic flow density are the three basic parameters that characterize steady-state traffic flow. Classical traffic flow models include the Greenshields model, the Greenberg model, the Underwood model, the Edie model, and the Van Aerde model.
The Greenshields model is simple and widely used, but it is not suitable for highways or high-density traffic flow [2]. The Greenberg model fits congested traffic flow well, but it is not suitable for free flow [3]. The Underwood model [4] overcomes the defect of the Greenberg model [5] that the speed tends to infinity in free flow, but it is not suitable for low speed and high density. The Van Aerde model, by comparison, is simple in structure, flexible, and easy to calibrate. Since it was proposed, many scholars have compared and verified its effectiveness, and it has been widely used for different cities and roads in China and abroad. Based on the above, this paper uses the Van Aerde model to calculate the traffic flow, as given by formula (2), where c1, c2, and c3 are intermediate variables, uf is the free flow speed (km/h), um is the critical speed (km/h), qmax is the traffic capacity (pcu/h), kj is the blocking density (pcu/km), u is the traffic flow speed (km/h), and k is the traffic flow density (pcu/km).
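For reference, the Van Aerde model is commonly written in the literature in the form below, using the symbols defined above. This is the standard form, offered as an assumption about what formulas (2) and (3) contain rather than as the authors' exact notation.

```latex
k = \frac{1}{c_1 + \dfrac{c_2}{u_f - u} + c_3\,u},
\qquad
q = u \cdot k = \frac{u}{c_1 + \dfrac{c_2}{u_f - u} + c_3\,u}

c_1 = \frac{u_f\,(2u_m - u_f)}{k_j\,u_m^{2}},
\qquad
c_2 = \frac{u_f\,(u_f - u_m)^{2}}{k_j\,u_m^{2}},
\qquad
c_3 = \frac{1}{q_{max}} - \frac{u_f}{k_j\,u_m^{2}}
```

In this form the density reduces to kj at u = 0 and the flow equals qmax at u = um, which is consistent with the meaning of the four parameters that are calibrated.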

Calibration of the Van Aerde model
From formula (2), the flow-speed relation of the Van Aerde model is obtained as formula (3), where q is the traffic flow (pcu/h). Formulas (2) and (3) show that the parameters to be calibrated in the Van Aerde model are the free flow speed, the critical speed, the blocking density, and the capacity.
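The step from speed to flow can be sketched in a few lines of Python. The sketch assumes the standard closed-form expressions for c1, c2, and c3 from the Van Aerde literature; the function name and the parameter values in the usage note are hypothetical and are not the calibrated values of this paper.

```python
def van_aerde_flow(u, uf, um, qmax, kj):
    """Traffic flow q (pcu/h) at speed u (km/h) under the Van Aerde model.

    uf: free flow speed (km/h), um: critical speed (km/h),
    qmax: capacity (pcu/h), kj: blocking density (pcu/km).
    The expressions for c1, c2, c3 are the standard literature form
    (an assumption; the paper's own formula (2) may be written differently).
    """
    if not 0 <= u < uf:
        raise ValueError("speed must satisfy 0 <= u < uf")
    c1 = uf * (2 * um - uf) / (kj * um ** 2)
    c2 = uf * (uf - um) ** 2 / (kj * um ** 2)
    c3 = 1 / qmax - uf / (kj * um ** 2)
    density = 1 / (c1 + c2 / (uf - u) + c3 * u)  # k in pcu/km
    return u * density                            # q = u * k


# Hypothetical parameters for illustration only.
flow_at_critical = van_aerde_flow(50, uf=80, um=50, qmax=1800, kj=120)
flow_when_blocked = van_aerde_flow(0, uf=80, um=50, qmax=1800, kj=120)
```

With these illustrative parameters the flow at the critical speed equals the capacity, and the flow at zero speed is zero, which is a quick sanity check on any calibrated parameter set.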

City classification
According to the Notice on Adjusting the Criteria for Urban Division [6], cities are classified by population as shown in Table 1. This paper studies the flow-speed relationship model of traffic flow in different cities according to the size of the urban population.

Calibration of the free flow speed
The free flow speed is the speed at which a driver travels freely, under the conditions set by the road alignment and environment, without being disturbed by other vehicles and without considering traffic control. Traditionally, the free flow speed is obtained from this definition: measured data are used to establish a linear model (e.g. the Greenshields model) between speed and traffic flow density, and the free flow speed is the speed at which the density in the model is zero.
However, the number of measured data points near the free flow speed and the blocking density is often small, and a speed-density regression cannot guarantee that the chosen type of regression curve matches the distribution of the data points, so the result contains some error. Therefore, based on the many studies of free flow speed by scholars in China and abroad, this paper determines the free flow speed from a percentile of the speed data: for roads of all classes, the 95th-percentile speed is used as the free flow speed.
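A minimal sketch of the percentile-based calibration is shown below. The paper does not state which percentile definition is used, so the nearest-rank method here is an assumption, and the function name and sample speeds are hypothetical.

```python
import math


def percentile_speed(speeds, pct=95.0):
    """Nearest-rank percentile of a list of speed samples (km/h).

    The nearest-rank definition is an assumption; other definitions
    interpolate between neighbouring samples and give slightly
    different values on small data sets.
    """
    if not speeds:
        raise ValueError("at least one speed sample is required")
    ordered = sorted(speeds)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical speed samples for one road class.
samples = list(range(1, 101))          # 1, 2, ..., 100 km/h
free_flow_speed = percentile_speed(samples)
```

Using a percentile rather than a regression intercept avoids extrapolating into the sparsely sampled region near zero density.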

Calibration of the capacity, critical speed and blocking density
The capacity is the maximum number of traffic entities that can pass through a given section of road per unit time at a given service level. In the fundamental diagram of traffic flow, the critical speed is the traffic flow speed at which the traffic flow reaches its maximum, and the blocking density is the density at which the traffic is so dense that vehicles can hardly move. With reference to "CJJ37-2012 Urban Road Engineering Design Code" and related literature, the capacity, critical speed, and blocking density used in the Van Aerde model are shown in Table 2, Table 3, and Table 4, respectively.

Validation of the Van Aerde model
To verify the feasibility of the model, its calculated results must be compared with actual statistics. Beijing is taken as an example. The verification data are actual survey data collected by the video method. Cameras are installed at safe positions along roads and on flyovers to film the lanes, with the whole survey area inside the field of view, so that the recordings capture not only the movement of each vehicle but also the vehicle type and the actual traffic conditions. The videos are then processed with matching software to extract traffic flow parameters such as vehicle speed and traffic flow.
To verify the accuracy of the Van Aerde model, this paper selects one typical expressway, one trunk road, and one secondary branch road, and compares the full-lane flow calculated by the model with the flow from the actual investigation.
The average absolute error and the average relative error of the hourly traffic flow are calculated by formulas (4) and (5).
As Figure 1 shows, the trend of the traffic flow calculated by the model is consistent with the investigated flow on the expressway, and the difference is small. Calculated by formulas (4) and (5), the average absolute error between the calculated flow and the investigated flow on the expressway is 886 pcu/h, and the average relative error is 0.296.
As Figure 2 shows, the trend of the traffic flow calculated by the model is consistent with the investigated flow on the trunk road, and the difference is small. Calculated by formulas (4) and (5), the average absolute error between the calculated flow and the investigated flow on the trunk road is 15 pcu/h, and the average relative error is 0.107.
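The two error measures can be sketched as follows. The definitions used here are the usual ones (mean of the absolute differences, and mean of the absolute differences divided by the investigated flow) and are an assumption about formulas (4) and (5); the function names and hourly flows are hypothetical.

```python
def mean_absolute_error(calculated, investigated):
    """Average of |calculated - investigated| over all hours (pcu/h)."""
    if len(calculated) != len(investigated) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [abs(c - o) for c, o in zip(calculated, investigated)]
    return sum(diffs) / len(diffs)


def mean_relative_error(calculated, investigated):
    """Average of |calculated - investigated| / investigated over all hours."""
    if len(calculated) != len(investigated) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(c - o) / o for c, o in zip(calculated, investigated)]
    return sum(ratios) / len(ratios)


# Hypothetical hourly flows (pcu/h), not the survey data of this paper.
model_flow = [110.0, 90.0]
survey_flow = [100.0, 100.0]
mae = mean_absolute_error(model_flow, survey_flow)
mre = mean_relative_error(model_flow, survey_flow)
```

The absolute error scales with the traffic volume of the road class, so the relative error is the more useful figure when comparing an expressway with a branch road.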

Verification of traffic flow on secondary branch road
The full-lane traffic flow calculated by the model is compared with the investigated traffic flow on the secondary branch road in Figure 3. The figure shows that the trend of the calculated flow is consistent with the investigated flow, and the difference is small. The average absolute error between the calculated flow and the investigated flow on the secondary branch road is 23 pcu/h, and the average relative error is 0.119.
The verification results for each road class show that the Van Aerde model works well across road classes, and that the traffic flow can be obtained from speed data.

Acquisition of original data
As the results above show, calculating the traffic flow requires the whole road network data in the GIS (Geographic Information System) and the low-frequency dynamic traffic data. The low-frequency dynamic traffic data are collected by a large number of terminals installed on private cars, taxis, logistics vehicles, and chauffeured vehicles in 41 typical cities in China. The terminals upload speed, geographic location, and time information every 5 minutes, every day, over a period of a year and a half. The routes of these vehicles essentially cover the whole road network of each city.
The whole road network data include the name, start and end points, length, number of lanes, road class, and other attributes of each road in the city. Taking Beijing as an example, its road network data and low-frequency dynamic traffic data are shown in Figures 4 and 5.

Selection of the weighting factors
There are two ways to express the weighting factors in the development of a driving test cycle: VKT (Vehicle Kilometers Travelled) and VHT (Vehicle Hours Travelled). VHT is the product of the average traffic flow and the average travel time of vehicles on a road segment, and reflects the total travel time of all vehicles on the segment. VKT is the product of the average traffic flow and the length of the road segment [27].
VHT and VKT are calculated over the statistical period from these definitions. In both formulas, qi is the average traffic flow on road section i in the statistical period; the VHT formula multiplies it by the average travel time on the section, and the VKT formula multiplies it by the length of road section i.
VKT and VHT reflect the traffic demand of the road network in a given period from the perspective of distance and time, respectively, but when there is congestion [28], the distance-based weighting method has some defects. For example, in the morning and evening peak hours, when the traffic flow decreases, the total vehicle mileage also decreases. If VKT is used as the weighting factor for different speed intervals, the VKT of the low-speed range during the peaks will be small, which reduces the share of the low-speed range in road operation. The VHT share of the low-speed range at these times, by contrast, is often relatively large. This shows that VKT is not very sensitive to the actual level of vehicle movement during peak hours, and using VKT as the weighting factor may fail to reflect the share of low-speed congested driving in the whole driving cycle.
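The difference between the two weightings can be illustrated with a small sketch. It follows the definitions above (VKT as flow times length, VHT as flow times travel time) and assumes that the average travel time on a section is its length divided by its average speed. The segment values and function names are hypothetical.

```python
def segment_vkt(flow, length_km):
    """Vehicle kilometers travelled on one segment: flow x length."""
    return flow * length_km


def segment_vht(flow, length_km, speed_kmh):
    """Vehicle hours travelled on one segment: flow x travel time.

    Travel time is taken as length / speed, which is an assumption.
    """
    return flow * length_km / speed_kmh


def low_speed_share(segments, threshold_kmh, use_vht):
    """Share of the total weight carried by segments below the threshold.

    segments: list of (flow in pcu/h, length in km, speed in km/h).
    """
    def weight(seg):
        flow, length_km, speed_kmh = seg
        if use_vht:
            return segment_vht(flow, length_km, speed_kmh)
        return segment_vkt(flow, length_km)

    total = sum(weight(s) for s in segments)
    low = sum(weight(s) for s in segments if s[2] < threshold_kmh)
    return low / total


# Hypothetical network: one congested segment and one free-flowing segment.
network = [(600.0, 1.0, 10.0), (1200.0, 1.0, 60.0)]
share_by_vkt = low_speed_share(network, threshold_kmh=30.0, use_vht=False)
share_by_vht = low_speed_share(network, threshold_kmh=30.0, use_vht=True)
```

In this illustrative case the congested segment carries one third of the VKT but three quarters of the VHT, which is the effect described in the text: time-based weighting gives low-speed operation a larger share.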
At the same time, VHT is suitable for data processing such as the speed-acceleration distribution, and it gives greater flexibility in developing the driving test cycle. Its disadvantage is that developing the VHT weighting factor matrix requires full-time dynamic traffic big data for the whole road network. The VKT weighting factor matrix is simple to develop, but it cannot reflect the proportion of different speed intervals, so it is difficult to modify the test cycle.

Comparison of VHT and VKT as weighting factors:
VHT, advantage: Suitable for data processing such as the speed-acceleration distribution, and can reflect the characteristics of different speed phases.
VKT, advantage: Simpler to develop the weighting matrix.
VHT, disadvantage: Requires more resources to generate the weighting factor matrix.
VKT, disadvantage: Inconsistent process to analyze idling periods and short trips.
In summary, a driving test cycle based on VHT weighting better reflects all the kinematic levels of vehicles running on the road and the share of each level, so this paper uses VHT as the weighting factor when calculating the traffic flow.

Verification of VHT distribution on the expressway
The speed-specific VHT distribution obtained from the actual investigation is compared with that calculated by the Van Aerde model on the expressway; the results are shown in Figure 6. The figure shows that the speed-specific distributions of the investigated VHT and the calculated VHT follow the same trend. When the speed is between 30 and 36 km/h, the investigated VHT is higher than the model result. Analysis of the original data shows that the speed data collected from 9:00 to 13:00 are slightly lower than the investigated speed in the corresponding period, and the speed data collected from 14:00 to 18:00 are slightly higher than the investigated speed, which explains the differences in the results. The correlation between the calculated results and the investigated results is 0.964.

Verification of VHT distribution on the secondary branch road
The speed-specific VHT distribution obtained from the actual investigation is compared with that calculated by the Van Aerde model on the secondary branch road; the results are shown in Figure 7. The figure shows that the speed-specific distributions of the investigated VHT and the calculated VHT follow the same trend. The correlation between the calculated results and the investigated results is 0.995.
The VHT verification results for the different road classes indicate that the calculated traffic flow is accurate.

Generate the weighting factor matrix
The verification results show that the Van Aerde model works well across road classes and that flow data can be calculated from speed data. This paper therefore calculates the traffic flow of the 41 cities and obtains the weighting factor of each city in each phase, shown in Table 6. The proportion of the different phases in each city is shown in Figure 8.
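One way to turn per-city, per-phase VHT totals into weighting factors is sketched below. The paper does not spell out the normalization, so this sketch assumes that a city's weight within a phase is its share of that phase's VHT across all cities, and that a phase's overall share is its VHT divided by the total VHT. The city names, phase names, and values are hypothetical.

```python
def weighting_matrix(vht):
    """Build weighting factors from vht[city][phase] totals.

    Returns (city_weights, phase_shares), where
    city_weights[phase][city] sums to 1 over cities for each phase and
    phase_shares[phase] sums to 1 over phases.
    Both normalizations are assumptions, not the paper's stated method.
    Every phase is assumed to have a positive VHT total.
    """
    phases = sorted({p for row in vht.values() for p in row})
    phase_totals = {
        p: sum(row.get(p, 0.0) for row in vht.values()) for p in phases
    }
    grand_total = sum(phase_totals.values())
    city_weights = {
        p: {c: row.get(p, 0.0) / phase_totals[p] for c, row in vht.items()}
        for p in phases
    }
    phase_shares = {p: phase_totals[p] / grand_total for p in phases}
    return city_weights, phase_shares


# Hypothetical VHT totals for two cities and three speed phases.
example_vht = {
    "CityA": {"low": 30.0, "medium": 20.0, "high": 10.0},
    "CityB": {"low": 10.0, "medium": 20.0, "high": 10.0},
}
city_weights, phase_shares = weighting_matrix(example_vht)
```

Under these assumptions the per-phase city weights would feed the unified speed-acceleration distribution, and the phase shares would feed the phase durations.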

Develop the unified speed-acceleration distribution
The unified speed-acceleration distribution of each phase is obtained from the speed-acceleration distributions of each city in that phase and the weighting factors of each city. The candidate short trips, selected according to duration, are freely combined. For each combination, the speed-acceleration distribution is generated from its short trips and compared with the unified speed-acceleration distribution by a chi-square test. The short-trip combination with the smallest chi-square value is selected as the optimal solution. In the chi-square test formula, the observed term is the speed-acceleration distribution value of the i-th short trip, and the expected term is the value of the unified speed-acceleration distribution.
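The weighting and selection steps can be sketched as follows. The sketch assumes the usual Pearson form of the chi-square statistic, the sum of (observed - expected)^2 / expected over the speed-acceleration bins, and it simplifies the search to combinations of a fixed size instead of a target duration. All names and data are hypothetical.

```python
from itertools import combinations


def unified_distribution(city_dists, city_weights):
    """Weighted average of per-city distributions over the same bins."""
    n_bins = len(next(iter(city_dists.values())))
    total_w = sum(city_weights[c] for c in city_dists)
    return [
        sum(city_weights[c] * city_dists[c][b] for c in city_dists) / total_w
        for b in range(n_bins)
    ]


def chi_square(observed, expected):
    """Pearson chi-square statistic; bins with expected == 0 are skipped
    (a simplification made for this sketch)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def combined_distribution(trips):
    """Normalized distribution of several short trips.

    Each trip is a list of per-bin counts (seconds spent in each bin).
    """
    totals = [sum(col) for col in zip(*trips)]
    n = sum(totals)
    return [t / n for t in totals]


def best_combination(candidates, target, size):
    """Indices of the combination whose distribution is closest to target."""
    return min(
        combinations(range(len(candidates)), size),
        key=lambda idx: chi_square(
            combined_distribution([candidates[i] for i in idx]), target
        ),
    )


# Hypothetical data: two cities, then three candidate short trips.
unified = unified_distribution(
    {"CityA": [0.6, 0.4], "CityB": [0.2, 0.8]},
    {"CityA": 0.75, "CityB": 0.25},
)
target = [0.5, 0.3, 0.2]
trips = [[10, 0, 0], [0, 6, 4], [5, 5, 0]]
chosen = best_combination(trips, target, size=2)
```

An exhaustive search like this grows quickly with the number of candidates, so a practical implementation would likely prune combinations by total duration before scoring them.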

Determine the test cycle duration
As with the WLTC, WHDC, and WMTC, the duration of the CLTC is set to 1800 seconds. A duration of 1800 seconds is long enough to be statistically representative and is also feasible for emission and fuel consumption tests in the laboratory.
The duration of each phase in the CLTC is obtained by multiplying 1800 seconds by the weighting factor of the corresponding speed phase. The results are as follows:
City phase: 1800 × 37.44% = 674 seconds.
Rural phase: 1800 × 38.50% = 693 seconds.
Motorway phase: 1800 × 24.06% = 433 seconds.
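The phase durations can be reproduced with a short calculation. The weighting factors are those given in the text; rounding each product to the nearest second is an assumption about how the published figures were obtained.

```python
CYCLE_SECONDS = 1800

# Weighting factor of each speed phase, as given in the text.
phase_weights = {"City": 0.3744, "Rural": 0.3850, "Motorway": 0.2406}

# Round each phase to the nearest whole second (assumed rounding rule).
phase_durations = {
    phase: round(CYCLE_SECONDS * weight)
    for phase, weight in phase_weights.items()
}
total_duration = sum(phase_durations.values())
```

Rounding each phase independently does not guarantee that the parts add up to the full cycle in general, so checking the total is a worthwhile step; here the three phases sum to exactly 1800 seconds.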

Conclusion
Using the whole road network data in the GIS (Geographic Information System) and the low-frequency dynamic traffic data of 41 cities in China, together with research on the traffic flow model, this paper generates the Vehicle Hours Travelled (VHT) and the weighting factor matrix of different cities in different speed phases. The main results are as follows:
(1) The development method for the driving test cycle presented in this paper, based on road network data and low-frequency dynamic traffic data, can extract representative short trips and ensure that the driving test cycle reflects actual driving characteristics.
(2) The Van Aerde model calibrated in this paper fits the traffic flow of different cities and different road classes well and can calculate the traffic flow accurately, which provides a good basis for constructing the driving test cycle.