Test Method for Decision Planning of Autonomous Vehicles Based on DQN Algorithm

In February 2020, Beijing, China and California, USA each released their 2019 road test reports for autonomous vehicles. Beijing and California represent the highest level of testing and application of autonomous vehicles in their respective countries. This article compares the test items, evaluation criteria and technical defects of the autonomous vehicle companies covered in the two road test reports, analyzes the existing problems, and proposes an idea for the construction of a comprehensive test site for autonomous vehicles. It aims to address the problems most prominently exposed in the decision-making and planning of autonomous vehicles by using a vehicle fleet based on the DQN algorithm, and looks ahead to the future development trend of autonomous driving testing.


INTRODUCTION
Autonomous driving technology is a product of the in-depth integration of the automotive industry with new-generation information technologies such as artificial intelligence, the Internet of Things, and high-performance computing. It is not only the main direction for the development of intelligence and connectivity in the global automotive and transportation fields, but has also become a strategic high ground for which countries compete [1]. In response to different road test requirements, researchers at home and abroad have proposed different test methods. Testing of autonomous vehicles mainly includes software simulation testing (the simulation test platform can replace part of the road test and is complementary to it [2]), hardware-in-the-loop testing, scenario testing and real-vehicle road testing. However, due to the lack of transparent and open field test data, the academic community has not yet conducted a quantitative and comprehensive study of the factors and mechanisms of disengagement (failure of the automated driving system followed by manual takeover [3]) in autonomous vehicles.
This article analyzes the main technical problems faced by the autonomous driving industry by studying the road test reports of California, USA and Beijing, China, and develops a dedicated test site construction solution for these problems. In line with local laws, regulations and technical routes, China and the United States have each built closed test sites and carried out open road tests. Each test site has its own characteristics and advantages, but none yet meets the requirements of universality, modernization and high efficiency for an automated driving test method system, and a complete technical standard and technical system has not yet been formed [4].

The main problems reflected in the California drive test report
Judging from the problems in the two reports, many companies' disengagements occur mainly on urban streets, as shown in Figure 1. California's open test areas cover highways, motorways, parking lots, country roads and street roads, so many traffic scenes are available, but manufacturers still concentrate on three types of test: highways, motorways and street roads. This reflects the emphasis that companies currently place on urban street scenes. Autonomous driving systems also exhibit problems that have been found in other autonomous systems, such as situational awareness errors, abnormal decision-making, and reduced human trust in them [5]. The causes of each company's disengagements in the 2019 California road test report are shown in Figure 2. Environmental perception and path planning are the most problematic categories [6]. Taking Zoox in Silicon Valley as an example, path planning accounts for 83.8% of its reported disengagement causes.
From the perspective of technical bottlenecks, the difficulties that many companies currently face center on deviations in perception, prediction and decision-making. The main reason is that the perception system is the window through which the car obtains environmental information [7], yet the reliability of autonomous vehicles' perception of the surrounding environment has not reached the desired level [8]. Frequent automatic disengagement also erodes drivers' trust in the vehicle and increases the likelihood that they will take manual control [9]. The ranking of technical problems differs from company to company. For example, the disengagement causes announced by Waymo include 54 perception-related cases, 48 planning-related cases, and 8 hardware failures, so the main technical problems that Waymo currently encounters rank as perception > path planning > hardware. Baidu announced 6 disengagements, of which 3 were caused by hardware failures, 2 by perception factors, and 1 by path planning, so Baidu's current technical problems rank as hardware > perception > path planning.
Figure 3 shows the types and proportions of problems found in the comprehensive ability test of each manufacturer at the Beijing Test Center. As in the California road test, the problems at the closed test site center on perception, decision-making and control [10,11]. The main reason is that the vehicle cannot perceive and track targets in the simulated complex traffic scene, and even after a correct decision is made, it cannot accurately execute the decision instruction. In addition, positioning error, sensor calibration error, an imperfect algorithm redundancy system and ambient temperature also affect the vehicle under test to some extent [12].
Because of the nature of a closed test site, its test process cannot run around the clock like open road testing in California. Therefore, the total test mileage at the Beijing test site is lower than California's open road test mileage [13], and Beijing's number of open roads and open-road mileage are also lower than California's. Figure 4 shows the public road test mileage of each company in 2018-2019.
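The Waymo and Baidu rankings quoted in this section follow directly from sorting the reported counts. The short Python sketch below reproduces that step. The dictionary keys and the helper names `rank_causes` and `cause_shares` are illustrative and do not come from either report, and the shares are computed only over the three categories quoted here.

```python
# Disengagement counts by cause, as quoted in the text.
waymo = {"perception": 54, "path planning": 48, "hardware": 8}
baidu = {"hardware": 3, "perception": 2, "path planning": 1}


def rank_causes(counts):
    """Return the causes ordered from most to least frequent."""
    return sorted(counts, key=counts.get, reverse=True)


def cause_shares(counts):
    """Return each cause's share of the quoted total (three categories only)."""
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


print(" > ".join(rank_causes(waymo)))  # perception > path planning > hardware
print(" > ".join(rank_causes(baidu)))  # hardware > perception > path planning
```

With only 6 recorded cases for Baidu, the resulting order is much less robust than Waymo's, which is worth keeping in mind when comparing the two rankings.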

Solution for the construction of a test site for autonomous driving decision-making and planning capabilities
The comparative analysis above shows that the main technical problems faced by the autonomous driving industries in China and the United States are concentrated in the decision-making and planning capabilities of autonomous vehicles. California's road test has no test dedicated to decision-making and planning [14]. The Beijing test site tests perception ability only in isolation and lacks test items that combine decision-making and planning. In response, the author proposes a construction idea for a comprehensive test field that emphasizes testing the decision-making and planning of autonomous vehicles. Its main feature is a multi-vehicle test system based on the DQN algorithm that verifies the actual intelligence of the vehicle.
With the widespread application of collaborative decision-making control in multi-agent systems, multi-agent confrontation has become a new approach to dedicated testing of decision-making and planning. The comprehensive intelligence test field integrates a variety of common road types and traffic facilities (including straight sections, intersections, roundabouts, highway entrances and exits, T-junctions, and ramps), as shown in Figure 5. Each scene can be used as a separate test item. During the test, multiple background vehicles form a multi-agent cluster fleet that conducts a confrontation test against the tested vehicle, in order to assess how reasonable the tested vehicle's decisions are when the fleet's behavior conflicts with its own driving purpose.
Take as an example test condition a three-lane straight road with six vehicles around the tested vehicle, as shown in Figure 6. During the test, the goal of the tested vehicle is to travel from the start of the straight test area to its end, and six background vehicles are arranged around it as dynamic obstacles. The decision model of each background vehicle is a DQN. The goal of DQN is to train a neural network whose input is s (the current state of the vehicle) and whose output is n Q values, where n is the number of selectable actions, so the output is an n-dimensional vector. Before each choice, the agent feeds the current state into the network and selects the action with the largest Q value in the output. The agent then enters the next state, feeds that state into the network, selects the next action in the same way, and so on, until a stable decision-making network is obtained.
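The action-selection loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the two-layer network, the four-element state vector, the five-action set and the discount factor `gamma` are all assumptions, and `td_target` shows the standard one-step DQN training target, which the text does not spell out.

```python
import random

# Hypothetical action set for a background vehicle (not specified in the text).
ACTIONS = ["keep_lane", "accelerate", "decelerate", "change_left", "change_right"]


class QNetwork:
    """Tiny two-layer network mapping a state vector to one Q value per action."""

    def __init__(self, state_dim, n_actions, hidden=16, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)]
                   for _ in range(state_dim)]
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(n_actions)]
                   for _ in range(hidden)]

    def q_values(self, state):
        # Hidden layer with ReLU activation.
        hidden = [max(0.0, sum(s * row[j] for s, row in zip(state, self.w1)))
                  for j in range(len(self.w2))]
        # Linear output layer: an n-dimensional vector of Q values.
        return [sum(h * row[k] for h, row in zip(hidden, self.w2))
                for k in range(len(self.w2[0]))]


def select_action(net, state):
    """Return the index of the action with the largest Q value."""
    q = net.q_values(state)
    return max(range(len(q)), key=q.__getitem__)


def td_target(net, reward, next_state, done, gamma=0.99):
    """One-step DQN target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(net.q_values(next_state))


# Assumed state: [own speed, gap to tested vehicle, relative speed, lane offset].
net = QNetwork(state_dim=4, n_actions=len(ACTIONS))
state = [0.6, 0.3, -0.1, 0.0]
print(ACTIONS[select_action(net, state)])
```

In a full training loop the weights would be updated so that the Q value of the chosen action moves toward `td_target`. That step, along with experience replay and a target network, is omitted here to keep the sketch short.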
The value function of this model encodes the following behavior. When the sensing system of a background vehicle detects that the tested vehicle is within a certain range directly in front of or behind it, the background vehicles ahead of and behind the tested vehicle simultaneously begin to close in on it. The background vehicle accelerates, pinching the tested vehicle between them. If the FCWS (forward collision warning system) of the tested vehicle or of the following vehicle raises an alarm and the tested vehicle has not changed lanes, the tested vehicle fails the test. After a lane change into an adjacent lane, the two vehicles in that lane likewise adopt a pinch strategy against the tested vehicle. The six vehicles form a multi-agent cluster with one another. When the vehicle in front of or behind the tested vehicle detects that the tested vehicle is starting to change lanes, it first signals the adjacent lane so that the front and rear vehicles there pull apart and leave a gap for the tested vehicle to drive into. Throughout the straight-road test, the multi-vehicle formation keeps interacting with the tested vehicle in this way. If the tested vehicle reaches the end of the test area without a collision, it enters the next test item.
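One way to read the straight-road rules above is as a small rule table for a background vehicle plus a pass/fail check for the tested vehicle. The Python sketch below is an interpretation under stated assumptions, not the authors' value function: the `Snapshot` fields, the behavior labels and the 30 m trigger range are all hypothetical, since the text only speaks of "a certain range".

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """What a background vehicle observes about the tested vehicle (assumed fields)."""
    gap_m: float               # longitudinal gap to the tested vehicle, in metres
    same_lane: bool            # tested vehicle is directly ahead of or behind us
    lane_change_started: bool  # tested vehicle has begun a lane change
    fcws_alarm: bool           # FCWS of the tested or the following vehicle is active


def background_behavior(snap, trigger_range_m=30.0):
    """Pick the behavior of a background vehicle in the tested vehicle's lane."""
    if snap.same_lane and snap.lane_change_started:
        # Ask the adjacent-lane vehicles to pull apart and leave a gap.
        return "signal_adjacent_lane_to_open_gap"
    if snap.same_lane and snap.gap_m <= trigger_range_m:
        # Tested vehicle is within range: close in on it.
        return "pinch"
    return "cruise"


def straight_road_result(snap, collided, reached_end):
    """Return 'fail', 'pass' or 'running' for the straight-road item."""
    if collided:
        return "fail"
    if snap.fcws_alarm and not snap.lane_change_started:
        # Alarm raised while the tested vehicle stayed in its lane.
        return "fail"
    if reached_end:
        return "pass"
    return "running"
```

In the scheme the text describes, these behaviors would be learned by each background vehicle's DQN instead of hard-coded. A rule table like this is mainly useful as a reference against which the learned policy can be checked.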
The tests at intersections, roundabouts, highway entrances and exits, T-junctions, and ramps are similar to the straight-road test: the intelligence of the tested vehicle is examined by having multi-agent formations act against it.
The items in the test field are single-shot continuous test items: once the test starts, the vehicle must complete all the items in one run. If a safety officer takes over or another problem occurs during the test, the test is terminated automatically and the result is recorded as not qualified. At the same time, because the complex interaction among multiple agents is affected by many uncertain factors, achieving macro-level optimization of the multi-agent vehicle system is the core difficulty of the comprehensive intelligence test field method [16,17].

Autonomous driving decision-making and planning capability test site architecture
The test site consists of a test field, a control center, and a multi-agent cluster fleet, as shown in Figure 7. Infrastructure such as traffic lights and cameras in the test field should be connected to the Internet of Things (IoT) so that it can upload environmental information and its own status data. The roadside equipment should meet the data transmission rate and capacity requirements for communication between devices in the field. The main communication facilities are a complementary combination of DSRC and LTE-V equipment: DSRC is mainly used for short-range vehicle-to-vehicle communication, while LTE-V is required for long-range vehicle-to-vehicle communication or for interaction between vehicles and peripheral equipment.
The internal facilities of the test field and the multi-agent cluster fleet are controlled by the control center. The main function of the control center is to serve as the data center of the entire test field and to release and modify the high-precision maps of the field. The control center also hosts a virtual test center, modeled on the actual test field, for virtual simulation testing or hardware-in-the-loop testing. Each test vehicle in the multi-agent cluster fleet is built on a chassis with a variable frame, which allows it to act as different traffic elements, such as trucks, cars, or bicycles. During the test, a test vehicle imitates a given traffic element by changing its traffic role and adjusting its driving speed accordingly. By changing the color of the traffic elements or simulating reflective and luminous objects, the effectiveness and safety of the tested vehicle's recognition can be examined. Anti-collision air cushions are installed around the chassis to protect the test vehicle in a collision.
Part of the fleet's route planning is undertaken by the control center, which is also responsible for monitoring the operation of the facilities across the test site. Each of the intelligent test vehicles in the multi-agent cluster fleet has basic L2 autonomous driving capabilities (functions include ACC+LKA+LCDA+AEB). The perception, positioning, decision-making and control systems formed by the sensors and computing equipment in each vehicle should themselves be tested in the test field to verify their reliability and intelligence. The test site should also include a variety of supporting equipment for environmental simulation, such as fog generators and sprinklers. This equipment is controlled by the control center to reproduce different environmental factors and to provide the tested vehicle with road conditions that are as complete and varied as possible.

Evaluation method of autonomous vehicle capability
With regard to the safety performance of autonomous vehicles, although the road tests in Beijing and California are performed in simulated or real traffic scenes [18], they cannot fully verify the capabilities of autonomous vehicles. The main open problems in evaluating autonomous vehicles are twofold: how to ensure that intelligence testing and evaluation are scientifically sound and practical to carry out, and how to gain acceptance for market-oriented self-driving test standards [19,20]. The evaluation of the overall performance of the vehicle, as shown in Figure 8, focuses on four aspects: efficiency, comfort, intelligent safety and economy. In the evaluation system, the evaluation target and index system are determined first; the indices are then converted to a uniform type and made dimensionless; the index weights are determined; and finally the evaluation method is selected. In addition, the external conditions of the set of traffic scenes, such as extreme weather or road accidents, should be bounded.
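The evaluation pipeline above (uniform index type, dimensionless processing, weighting, aggregation) can be illustrated with a small Python sketch. It assumes min-max scaling for the dimensionless step and a weighted sum as the evaluation method, which is only one of several possible choices. The index ranges, measurements and weights are hypothetical values chosen for illustration and are not taken from the paper.

```python
def normalize(value, lo, hi, benefit=True):
    """Min-max scale an index to [0, 1].

    Cost-type indices (smaller is better) are flipped so that every
    normalized index is benefit-type, i.e. larger is always better.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    x = (value - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))  # clamp measurements outside the expected range
    return x if benefit else 1.0 - x


def overall_score(indices, weights):
    """Weighted sum of normalized indices; weights are rescaled to sum to 1."""
    total_weight = sum(weights.values())
    return sum(weights[name] * indices[name] for name in weights) / total_weight


# Hypothetical measurements for the four aspects named in the text.
indices = {
    "efficiency": normalize(420.0, 300.0, 600.0, benefit=False),      # completion time, s
    "comfort": normalize(1.8, 0.0, 4.0, benefit=False),               # peak jerk, m/s^3
    "intelligent_safety": normalize(2.5, 0.0, 5.0),                   # min time-to-collision, s
    "economy": normalize(14.0, 10.0, 25.0, benefit=False),            # energy use, kWh/100 km
}
weights = {"efficiency": 0.2, "comfort": 0.2, "intelligent_safety": 0.4, "economy": 0.2}
print(round(overall_score(indices, weights), 3))
```

Clamping in `normalize` is also a simple way to bound the external conditions of a scene: a measurement taken outside the agreed range cannot push an index below 0 or above 1.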

CONCLUSIONS AND PROSPECTS
The analysis of the Chinese and US autonomous vehicle road test reports shows that, although autonomous driving test research started earlier in the United States, it is still difficult to propose a universal and effective test evaluation system [21]. In the closed test field in Beijing, the scene design is rich and the test methods are diverse, but the evaluation criteria for the test results are still incomplete, and simulating the complex environment of adjacent static and moving entities takes a lot of manpower and time [22]. Under the current technical system and industrial structure, the closed test field offers a high degree of safety and quantifiability, so it will remain an indispensable part of autonomous driving testing for some time to come, but its specific setup needs to be defined and further improved. Like the autonomous driving technology system itself, the pass criteria and test regulations for autonomous driving tests are proliferating in many forms, yet authoritative test standards for vehicle intelligence are still lacking, and existing test field solutions cannot fully meet current market demand. To address these shortcomings, we will define the basic structure of the test field and the details of the test items around improving the capabilities of autonomous vehicles with the DQN network, absorb the advanced experience of test fields at home and abroad, and combine it with the author's ideas to build an efficient, scientific and rigorous test evaluation system.