
Citation:
Shanqiang Li, Chaoxi Li. Path Planning for Unmanned Surface Vehicles in Dynamic Environments Based on Artificial Potential Field and Global Guided Reinforcement Learning[J]. Journal of Marine Science and Application, 2026, (2): 575-586. [doi:10.1007/s11804-025-00697-2]

Path Planning for Unmanned Surface Vehicles in Dynamic Environments Based on Artificial Potential Field and Global Guided Reinforcement Learning

Info

Title:
Path Planning for Unmanned Surface Vehicles in Dynamic Environments Based on Artificial Potential Field and Global Guided Reinforcement Learning
Author(s):
Shanqiang Li, Chaoxi Li
Affiliations:
School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou, 350118, China
Keywords:
Deep reinforcement learning|Path planning|Unmanned surface vehicles|Fast guided deep Q-Network algorithm
Classification number:
-
DOI:
10.1007/s11804-025-00697-2
Abstract:
For unmanned surface vehicles (USVs), finding an effective, feasible path that substantially improves mission success rates and time efficiency in dynamic marine environments is a critical issue. To address USV path planning in dynamic ocean environments with deep reinforcement learning (DRL), an improved algorithm based on the Deep Q-Network (DQN), called the Fast Guided Deep Q-Network (FG-DQN) algorithm, is proposed. The algorithm combines DQN with the artificial potential field (APF) method and uses the A* algorithm to initialize a guiding path in the global static environment, which provides prior knowledge for the USV. In addition, a reward function built from the APF and the guiding path reduces the frequency of random movements during the early exploration phase of DQN, which accelerates convergence, improves the computational efficiency of path planning, and increases path safety. Finally, the performance of the proposed algorithm is validated through experiments in a 2D environment. Compared with traditional reinforcement learning methods such as Q-learning and Sarsa, as well as the original DQN algorithm, FG-DQN is more effective for USV path planning.
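
As a rough illustration of the idea described in the abstract, the Python sketch below computes an A* guiding path on a static grid and then shapes a per-step reward from an APF-style potential plus a bonus for staying on that path. The abstract does not state the reward formula, so the function names (`astar`, `potential`, `shaped_reward`), the grid representation, and every gain and reward value are assumptions chosen for illustration, not the FG-DQN definition.

```python
import heapq
import math


def astar(grid, start, goal):
    """4-connected A* on a 2D occupancy grid (1 = obstacle).

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {}
    g_cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > g_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, math.inf):
                g_cost[nxt] = new_g
                came_from[nxt] = cell
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None


def potential(pos, goal, obstacles, k_att=1.0, k_rep=5.0, d0=2.0):
    """Khatib-style potential: quadratic attraction plus short-range repulsion.

    The gains k_att, k_rep and the influence radius d0 are assumed values.
    """
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # avoid division by zero
        if d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u


def shaped_reward(pos, nxt, goal, obstacles, guide_path):
    """Hypothetical step reward combining APF descent and a guide-path bonus."""
    if nxt in obstacles:
        return -10.0  # collision penalty (assumed value)
    if nxt == goal:
        return 10.0  # goal reward (assumed value)
    reward = potential(pos, goal, obstacles) - potential(nxt, goal, obstacles)
    if nxt in set(guide_path):
        reward += 0.5  # bonus for landing on the A* guiding path (assumed)
    return reward - 0.1  # small step cost to discourage wandering (assumed)


if __name__ == "__main__":
    grid = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    obstacles = [(r, c) for r, row in enumerate(grid)
                 for c, v in enumerate(row) if v == 1]
    guide = astar(grid, (0, 0), (4, 4))
    print("guide path:", guide)
    print("step reward:",
          shaped_reward(guide[0], guide[1], (4, 4), obstacles, guide))
```

In a DQN training loop, a reward of this kind would replace a sparse goal-only signal, so that early exploratory actions already receive feedback about progress toward the goal and proximity to obstacles. Moving obstacles could be handled by passing their current positions in `obstacles` at each step, while the guiding path stays fixed to the static map.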

Memo

Memo:
Received date: 2024-12-12; Accepted date: 2025-04-01.
Foundation item: Supported by the Science Research Foundation for Introduced Talents, Fujian Province of China under Grant Nos. GY-Z21215, GY-Z21216.
Corresponding author: Shanqiang Li, E-mail: lishanqiang@gmail.com
Last Update: 2026-06-08