Optimization control of demand-driven conveyor-serviced production station with changeable service rate

TANG Hao; XU Ling-ling; ZHOU Lei; TAN Qi

Cite this article:	TANG Hao, XU Ling-ling, ZHOU Lei, TAN Qi. Optimization control of demand-driven conveyor-serviced production station with changeable service rate[J]. Control Theory & Applications (控制理论与应用), 2015, 32(6): 810-816.

Received: 2014-03-18; Revised: 2015-03-19

DOI: 10.7641/CTA.2015.40217

2015,32(6):810-816

Keywords: conveyor-serviced production station; changeable service rate; semi-Markov decision process; Q-learning

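Funding: Supported by the National Natural Science Foundation of China (61174186, 61374158, 71231004), the National International Science and Technology Cooperation Program (2011FA10440), the Program for New Century Excellent Talents in University of the Ministry of Education (NCET-11-0626), and the Specialized Research Fund for the Doctoral Program of Higher Education (20130111110007).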

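Author	Affiliation	E-mail
TANG Hao*	School of Electrical Engineering and Automation, Hefei University of Technology; School of Computer and Information, Hefei University of Technology	htang@hfut.edu.cn
XU Ling-ling	School of Computer and Information, Hefei University of Technology
ZHOU Lei	School of Computer and Information, Hefei University of Technology
TAN Qi	School of Electrical Engineering and Automation, Hefei University of Technology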

Abstract

This paper studies the optimal control of a demand-driven conveyor-serviced production station (CSPS) with a changeable service rate. The main goal is to model the system's stochastic optimal control problem and to provide solution methods. Taking the remaining capacities (vacancies) of the buffer and of the finished-product bank as the joint system state, and the station's look-ahead range and the service rate as the control variables, we formulate the optimal control problem as a semi-Markov decision process (SMDP). This model provides the theoretical basis for computing the optimal control policy under either the average-cost or the discounted-cost criterion by methods such as policy iteration. In particular, to overcome the curse of dimensionality and the modeling difficulties that arise in exact numerical solution, a Q-learning algorithm combined with simulated annealing can be introduced to find approximate solutions. Simulation results demonstrate the effectiveness of the established model and the proposed optimization methods.
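
The combination of Q-learning and simulated annealing named in the abstract can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' algorithm or model: it uses the standard discounted-cost SMDP Q-learning update, together with a Metropolis-style acceptance rule for choosing between a random and a greedy action. The simulator `toy_step`, its four-state buffer, its cost rates, and every parameter value (`alpha`, `beta`, `t0`, `decay`) are hypothetical placeholders chosen only so that the example runs.

```python
import math
import random


def sa_select(q_row, temperature, rng):
    """Metropolis-style choice: propose a random action and accept it over the
    greedy (minimum-cost) one with probability exp(-(Q_rand - Q_greedy) / T)."""
    greedy = min(range(len(q_row)), key=q_row.__getitem__)
    proposal = rng.randrange(len(q_row))
    gap = q_row[proposal] - q_row[greedy]  # >= 0 because costs are minimised
    if rng.random() < math.exp(-gap / temperature):
        return proposal
    return greedy


def smdp_q_update(q, s, a, cost_rate, tau, s_next, alpha, beta):
    """One discounted-cost SMDP Q-learning step for a sojourn of length tau."""
    discount = math.exp(-beta * tau)
    accrued = cost_rate * (1.0 - discount) / beta  # discounted cost over [0, tau]
    target = accrued + discount * min(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def toy_step(state, action, rng):
    """Hypothetical stand-in simulator, NOT the paper's CSPS model.
    state: vacancies in a buffer of capacity 3; action: 0 = slow, 1 = fast."""
    service_rate = (1.0, 2.0)[action]
    tau = rng.expovariate(service_rate)            # random sojourn time
    cost_rate = 1.0 * state + (0.5, 1.5)[action]   # vacancy penalty + effort
    next_state = max(0, min(3, state + rng.choice((-1, 0, 1))))
    return cost_rate, tau, next_state


def train(steps=5000, alpha=0.1, beta=0.1, t0=1.0, decay=0.999, seed=0):
    """Run Q-learning on the toy simulator with a geometrically cooled temperature."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(4)]
    state, temperature = 3, t0
    for _ in range(steps):
        action = sa_select(q[state], temperature, rng)
        cost_rate, tau, nxt = toy_step(state, action, rng)
        smdp_q_update(q, state, action, cost_rate, tau, nxt, alpha, beta)
        state = nxt
        temperature = max(1e-3, temperature * decay)
    return q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

At high temperature almost every random proposal is accepted, so the learner explores; as the temperature cools the rule approaches pure greedy selection. In a two-dimensional action space such as (look-ahead range, service rate), the same rule would apply to an enumerated list of action pairs.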