
Prashant Bhopale, Faruk Kazi, Navdeep Singh. Reinforcement Learning Based Obstacle Avoidance for Autonomous Underwater Vehicle[J]. Journal of Marine Science and Application, 2019, (2): 228-238. [doi:10.1007/s11804-019-00089-3]

Reinforcement Learning Based Obstacle Avoidance for Autonomous Underwater Vehicle


Prashant Bhopale, Faruk Kazi, Navdeep Singh
Electrical Engineering Department, Veermata Jijabai Technological Institute, Mumbai 400019, India
Obstacle avoidance is a challenging task for an autonomous underwater vehicle (AUV) exploring an unknown underwater environment. Successful control in such cases may be achieved with model-based classical control techniques such as PID and MPC, but these require an accurate mathematical model of the AUV and may fail because of parametric uncertainties, disturbances, or plant-model mismatch. In contrast, a model-free reinforcement learning (RL) algorithm can be designed from the actual behavior of the AUV plant in an unknown environment, and the learned control may not be affected by model uncertainties in the way a classical control approach is. Unlike a model-based controller, a model-free RL-based controller does not need to be manually retuned as the environment changes. A standard one-step Q-learning controller can be used for obstacle avoidance, but it tends to explore all possible actions at a given state, which may increase the number of collisions. Hence, a modified Q-learning-based control approach is proposed to deal with these problems in an unknown environment. Furthermore, function approximation with a neural network (NN) is used to overcome the continuous-state and large state-space problems that arise in RL-based controller design. The proposed modified Q-learning algorithm is validated in MATLAB simulations by comparing it with the standard Q-learning algorithm for single-obstacle avoidance. The same algorithm is also applied to multiple-obstacle avoidance problems.
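For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the standard tabular one-step Q-learning update together with epsilon-greedy action selection. It is not the modified algorithm proposed in the paper and it omits the neural-network function approximation; the function names, the dictionary-based Q-table, and the parameter values (`alpha`, `gamma`, `epsilon`) are illustrative assumptions only.

```python
import random

def q_learning_step(q, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9):
    """Apply one standard one-step Q-learning update to the table q.

    q maps (state, action) pairs to value estimates; missing entries
    are treated as 0.0. All names here are hypothetical.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Move the old estimate toward the bootstrapped target.
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

# Toy usage: a collision when steering "left" from state 0 is penalized.
actions = ["left", "right"]
q_table = {}
updated = q_learning_step(q_table, 0, "left", -1.0, 1, actions,
                          alpha=0.5, gamma=0.9)
choice = epsilon_greedy(q_table, 0, actions, epsilon=0.0,
                        rng=random.Random(0))
```

With `epsilon` above zero, the selection rule occasionally tries actions that are currently rated poorly. That random exploration is the behavior the abstract associates with additional collisions in the standard approach.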




Received date: 2017-9-24; Accepted date: 2018-3-19.
Corresponding author:Prashant Bhopale
Last Update: 2019-07-06