Abstract: To address the slow model training, heavy computation, and delayed response of quadrotor UAV visual obstacle avoidance systems based on deep reinforcement learning, a lightweight system with fast model training is designed. The system takes the depth image and the UAV's own state information as input, uses a GRU-based A3C algorithm (GRU-A3C) to output actions in a continuous action space, and applies curriculum learning to accelerate training. A3C is used as the baseline in ablation experiments. The experimental results are as follows: after 1,000 rounds of training, the GRU-A3C algorithm trained with curriculum learning reaches a success rate of 0.28, compared with 0.2 for the A3C algorithm; after 5,000 rounds of training, the success rates are 0.72 and 0.62, respectively. These results indicate that the system accelerates model convergence, shortens training time, and improves the training outcome.
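The curriculum learning step mentioned above can be illustrated with a minimal sketch. The abstract does not specify how the curriculum is scheduled, so everything here is an assumption for illustration only: the class name `CurriculumScheduler`, the number of stages, the rolling window, and the promotion threshold are all hypothetical. The idea shown is simply that training starts in an easy environment and moves to a harder one once the recent success rate is high enough.

```python
from collections import deque


class CurriculumScheduler:
    """Hypothetical scheduler: advance to a harder stage when the agent
    succeeds often enough over a rolling window of recent rounds."""

    def __init__(self, num_stages=3, window=20, promote_at=0.8):
        self.num_stages = num_stages    # assumed number of difficulty stages
        self.window = window            # rounds used to estimate success rate
        self.promote_at = promote_at    # assumed promotion threshold
        self.stage = 0                  # 0 is the easiest stage
        self.recent = deque(maxlen=window)

    def record(self, success):
        """Record one round's outcome and return the stage for the next round."""
        self.recent.append(bool(success))
        window_full = len(self.recent) == self.window
        rate = sum(self.recent) / self.window
        is_last_stage = self.stage == self.num_stages - 1
        if window_full and rate >= self.promote_at and not is_last_stage:
            self.stage += 1
            # Restart the estimate so the new stage is judged on its own rounds.
            self.recent.clear()
        return self.stage
```

In a training loop, `record` would be called once per round with the obstacle avoidance outcome, and the returned stage would select the environment (for example, obstacle density) for the next round. Clearing the window after each promotion keeps results from an easier stage from triggering a further promotion.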