Research on an Inverted Pendulum Control Algorithm Based on Improved SAC

Affiliation:

College of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710600, China

CLC number:

TP18

Funding:

Supported by the Youth Program of the National Natural Science Foundation of China (61901358), the General Program of the China Postdoctoral Science Foundation (2019QDJ207), and the General Special Project of the Shaanxi Provincial Department of Education (20JK0757)

    Abstract:

    Inverted pendulum systems are inherently unstable and easily disturbed during control, while the deep reinforcement learning algorithm SAC makes poor use of sampled data and its stochastic off-policy policy network converges slowly. To address these problems, an improved algorithm, PRER_SAC, is proposed that combines recency-based experience sampling with an optimized policy network structure. Neural networks are built as function approximators; the policy network uses the better-performing Mish function as its activation function, and a self-adjusting temperature coefficient is set to strengthen the agent's exploration. Two experience pools, far and near, are designed together with a training strategy that changes how often data are stored, which improves the utilization of data samples. In simulation comparisons, the proposed method achieves higher returns and faster convergence than the DDPG and SAC algorithms for the same number of training iterations, and better control performance than the traditional PID and LQR control methods. Finally, an angle disturbance applied to the trained agent is suppressed within 2 s, which shows that the proposed algorithm has strong applicability.
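
    As a rough illustration of two ingredients named in the abstract, the sketch below implements the Mish activation, mish(x) = x * tanh(softplus(x)), and a two-pool replay buffer with a large "far" pool and a small "near" pool. The abstract does not give the buffer details, so the class name TwoPoolReplay, the pool capacities, the fixed near_fraction mixing ratio, and the rule that every transition enters both pools are all assumptions made for illustration; the paper's strategy of changing the data storage frequency is not reproduced here.

```python
import math
import random
from collections import deque


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^-|x|)
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


class TwoPoolReplay:
    """Hypothetical far/near replay buffer (illustrative, not the paper's scheme)."""

    def __init__(self, far_capacity=10000, near_capacity=1000,
                 near_fraction=0.5, seed=0):
        self.far = deque(maxlen=far_capacity)    # long-term history
        self.near = deque(maxlen=near_capacity)  # most recent transitions only
        self.near_fraction = near_fraction       # assumed share of recent samples
        self.rng = random.Random(seed)

    def store(self, transition):
        # Assumption: every transition enters both pools; the small near
        # pool discards old data quickly because of its short maxlen.
        self.far.append(transition)
        self.near.append(transition)

    def sample(self, batch_size):
        # Draw the recent part first, then fill the rest from the far pool.
        n_near = min(int(batch_size * self.near_fraction), len(self.near))
        n_far = min(batch_size - n_near, len(self.far))
        batch = self.rng.sample(list(self.near), n_near)
        batch += self.rng.sample(list(self.far), n_far)
        return batch
```

    In this sketch, raising near_fraction biases each batch toward recent experience, which is one simple way to express recency-based sampling; a real SAC implementation would store (state, action, reward, next_state, done) tuples and apply Mish inside the policy network layers.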

Cite this article

张晓莉, 郭仕林, 刘鼎, 宋婉莹. Research on an inverted pendulum control algorithm based on improved SAC [J]. 电子测量技术, 2024, 47(1): 93-100

History
  • Published online: 2024-04-24