Performance Analysis of Deep Q Networks and Advantage Actor Critic Algorithms in Designing Reinforcement Learning-based Self-tuning PID Controllers
Published in IEEE IBSSC, 2019
The use of Reinforcement Learning (RL) in designing adaptive self-tuning PID controllers is a relatively new horizon of research, with Q-learning and its variants being the predominant algorithms found in the literature. However, the possibility of using an interesting alternative algorithm, namely Advantage Actor Critic (A2C), in this context is relatively unexplored. In the present study, Deep Q Network (DQN) and A2C approaches have been employed to design self-tuning PID controllers. A comparative performance analysis of both controllers was undertaken in a simulation environment on a servo position control system, with various static and dynamic control objectives, keeping a conventional PID controller as a baseline. The A2C-based Adaptive PID Controller (A2CAPID) is more promising in trajectory-tracking problems, whereas the DQN-based Adaptive PID Controller (DQNAPID) is better suited to systems with relatively large plant parameter variations.
Recommended citation: Mukhopadhyay, R., Bandyopadhyay, S., Sutradhar, A. and Chattopadhyay, P., 2019, July. Performance Analysis of Deep Q Networks and Advantage Actor Critic Algorithms in Designing Reinforcement Learning-based Self-tuning PID Controllers. In 2019 IEEE Bombay Section Signature Conference (IBSSC) (pp. 1-6). IEEE. https://ieeexplore.ieee.org/abstract/document/8973068
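The core idea described in the abstract, an RL agent retuning PID gains online while a conventional PID law computes the control signal, can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the first-order plant model, the gain values, and the stand-in policy are all assumptions, and a trained DQN or A2C agent would replace the fixed-gain lambda.

```python
class PID:
    """Discrete PID controller whose gains can be retuned at every step."""

    def __init__(self, kp, ki, kd, dt=0.01):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def set_gains(self, kp, ki, kd):
        # An RL agent (DQN or A2C) would call this each step with its action.
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(agent_policy, setpoint=1.0, steps=1000, dt=0.01):
    """Run a toy servo-like first-order plant (dy/dt = -y + u) in closed loop,
    letting `agent_policy` retune the PID gains at every control step."""
    pid = PID(1.0, 0.0, 0.0, dt)
    y = 0.0
    for _ in range(steps):
        error = setpoint - y
        # Hypothetical agent interface: maps the current error to a gain triple.
        pid.set_gains(*agent_policy(error))
        u = pid.step(error)
        y += dt * (-y + u)  # Euler step of the assumed first-order plant
    return y


# Fixed-gain "policy" standing in for a trained DQN/A2C agent; with integral
# action, the output converges toward the unit setpoint.
final = simulate(lambda e: (2.0, 1.0, 0.05))
```

In the paper's setting, the agent's observation would include tracking-error statistics and its reward would penalize deviation from the reference, so the learned policy adapts the gains as the plant parameters or the reference trajectory change.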