Cite this article: SU Na, TANG Hao, DAI Fei, WANG Bin, ZHOU Lei. Simulation research of the applicability of Q-learning algorithm in CSPS systems with non-Poisson part flow[J]. Control Theory & Applications, 2020, 37(12): 2591-2600.
Simulation research of the applicability of Q-learning algorithm in CSPS systems with non-Poisson part flow
Received: 2018-10-11  Revised: 2020-07-15
DOI  10.7641/CTA.2020.80782
  2020,37(12):2591-2600
Keywords  conveyor-serviced production station  Markov-modulated Poisson process  semi-Markov-modulated Poisson process  Q-learning
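Funding  Supported by the National Natural Science Foundation of China (61573126), the National Key Research and Development Program of China (2017YFGH002010), and the Fundamental Research Funds for the Central Universities (JZ2016YYPY0052).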
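Author  Affiliation  E-mail
SU Na  School of Electrical Engineering and Automation, Hefei University of Technology  suna_nasu@mail.hfut.edu.cn
TANG Hao*  School of Electrical Engineering and Automation, Hefei University of Technology  htang@hfut.edu.cn
DAI Fei  School of Electrical Engineering and Automation, Hefei University of Technology
WANG Bin  School of Electrical Engineering and Automation, Hefei University of Technology
ZHOU Lei  School of Computer and Information, Hefei University of Technology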
Abstract
      This paper studies the applicability of the Q-learning algorithm to a conveyor-serviced production station (CSPS) system when parts arrive according to a non-Poisson process, so that the system cannot be modeled as a semi-Markov decision process (SMDP). First, a Markov-modulated Poisson process (MMPP) and a semi-Markov-modulated Poisson process (SMMPP) are used to model the non-Poisson part flow; for each, the performance of Q-learning is evaluated by simulation and compared with its performance under a Poisson part flow with the same average arrival rate. Second, for the non-Poisson part flow, the paper examines the theoretical optimization result obtained by treating the real-time statistical average arrival rate as a standard Poisson arrival rate. Finally, the performance of Q-learning for the CSPS system is discussed for a non-Poisson part flow formed by superposing MMPP and SMMPP streams. The experiments show that Q-learning can still learn a good control policy when part arrivals are non-Poisson, which demonstrates its applicability to CSPS systems.
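The comparison "under the same average arrival rate" can be illustrated with a minimal sketch of an MMPP part flow. It is not the authors' simulation code. It assumes a two-state MMPP, and the names `lam1`, `lam2` (arrival rates in each phase) and `r12`, `r21` (phase-switching rates) are hypothetical parameters chosen for illustration; the abstract does not give the number of phases or any numerical values. The stationary mean rate is the value a Poisson part flow would need to match.

```python
import random


def mmpp_mean_rate(lam1, lam2, r12, r21):
    """Stationary mean arrival rate of a two-state MMPP (assumed model)."""
    # Stationary phase probabilities of the two-state modulating chain.
    pi1 = r21 / (r12 + r21)
    pi2 = r12 / (r12 + r21)
    return pi1 * lam1 + pi2 * lam2


def simulate_mmpp(lam1, lam2, r12, r21, horizon, seed=0):
    """Return the arrival times of a two-state MMPP on [0, horizon)."""
    rng = random.Random(seed)
    rates = (lam1, lam2)    # arrival rate in phase 0 and phase 1
    switch = (r12, r21)     # rate of leaving phase 0 and phase 1
    t, state, arrivals = 0.0, 0, []
    while True:
        # Competing exponentials: the next event is an arrival or a switch.
        total = rates[state] + switch[state]
        t += rng.expovariate(total)
        if t >= horizon:
            return arrivals
        if rng.random() < rates[state] / total:
            arrivals.append(t)      # a part arrives
        else:
            state = 1 - state       # the modulating chain changes phase


if __name__ == "__main__":
    target = mmpp_mean_rate(1.0, 3.0, 0.5, 0.5)
    times = simulate_mmpp(1.0, 3.0, 0.5, 0.5, horizon=20000.0, seed=1)
    print("theoretical mean rate:", target)
    print("empirical mean rate:  ", len(times) / 20000.0)
```

Over a long horizon the empirical rate should settle near the theoretical mean, which is the rate a reference Poisson flow would use in such a comparison. An SMMPP would differ in that the phase sojourn times need not be exponential.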