Reinforcement Learning-Based Trajectory Control for Mecanum Robot with Mass Eccentricity Considerations

Minh Dong Nguyen, Manh Tien Ngo, Hiep Do Quang, Nam Dao Phuong

Abstract


This article presents a robust optimal tracking control approach for a Four Mecanum Wheeled Robot (FMWR) that uses an online actor-critic reinforcement learning (RL) algorithm to address the problem of precise trajectory tracking in the presence of mass eccentricity and friction uncertainty. To handle these challenges, a detailed dynamic model is first derived using Lagrange’s equation, and the Hamilton–Jacobi–Bellman (HJB) equation is then solved by an iterative algorithm that alternates policy evaluation and policy improvement. Update laws for the optimal control law and the value function are obtained by minimizing a modified Hamiltonian function. Moreover, to handle the time-varying nature of the tracking-error model, a transformation that adds a time-derivative term is introduced. Simulation studies demonstrate the approach’s effectiveness, significantly improving trajectory-tracking accuracy and robustness against disturbances. This research contributes to mobile robotics by enhancing control precision and reliability in dynamic environments.
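The full derivation is in the article’s PDF; the abstract only names the ingredients. As a rough, hypothetical illustration of the online actor-critic mechanism it describes, the sketch below applies the standard critic update of Vamvoudakis and Lewis (2010) to a toy second-order linear system. The dynamics f and g, the quadratic value-function basis phi, and all gains are placeholder assumptions, not the authors’ FMWR model or tracking-error transformation; no probing noise or persistence-of-excitation term is included, so it illustrates the update rule rather than guaranteed convergence.

```python
# Minimal online actor-critic sketch (Vamvoudakis-Lewis style),
# NOT the authors' implementation; all models and gains are assumed.
import numpy as np

Q = np.eye(2)          # state-cost weight (assumed)
R = np.eye(1)          # control-cost weight (assumed)
a_c = 1.0              # critic learning rate (assumed)
dt = 0.001             # integration step

def phi(x):
    """Quadratic basis so that V(x) = W^T phi(x)."""
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def grad_phi(x):
    """Jacobian d(phi)/dx, shape (3, 2)."""
    return np.array([[2*x[0], 0.0],
                     [x[1],   x[0]],
                     [0.0,    2*x[1]]])

def f(x):
    """Placeholder drift dynamics (stable linear toy system)."""
    return np.array([x[1], -x[0] - 0.5*x[1]])

def g(x):
    """Placeholder input matrix."""
    return np.array([[0.0], [1.0]])

W = 0.1 * np.ones(3)   # critic weights, to be trained online
x = np.array([1.0, 0.0])

for _ in range(20000):
    gp = grad_phi(x)
    # Actor: control minimizing the Hamiltonian for the current critic,
    # u = -1/2 R^{-1} g^T (dV/dx)^T with dV/dx = W^T grad_phi.
    u = -0.5 * np.linalg.solve(R, g(x).T @ (gp.T @ W))
    xdot = f(x) + g(x) @ u
    # Hamiltonian (Bellman) residual for the current weights.
    sigma = gp @ xdot
    e = x @ Q @ x + u @ R @ u + W @ sigma
    # Normalized gradient descent on the squared residual (critic update).
    W -= dt * a_c * e * sigma / (1.0 + sigma @ sigma)**2
    x = x + dt * xdot
```

Note the division of labor: only the critic weights W are trained, because the actor is recovered in closed form from the critic through the Hamiltonian-minimizing control law, which mirrors the “training laws obtained by minimizing the modified Hamiltonian” described in the abstract.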

Keywords


Optimal Control; Reinforcement Learning; Trajectory Tracking; Friction Uncertainty.



DOI: https://doi.org/10.18196/jrc.v5i5.22148



Copyright (c) 2024 Minh Dong Nguyen, Manh Tien Ngo, Hiep Do Quang, Nam Dao Phuong

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

