Citation: Masahiko SAKAGUCHI, Yoshio OHTSUBO. Markov decision processes associated with two threshold probability criteria[J]. Control Theory and Technology, 2013, 11(4): 548-557.
Received: August 21, 2012; Revised: May 28, 2013
Markov decision processes associated with two threshold probability criteria
Masahiko SAKAGUCHI, Yoshio OHTSUBO
(Department of Mathematics, Faculty of Science, Kochi University)
Abstract:
This paper deals with Markov decision processes with a target set for nonpositive rewards. Two types of threshold probability criteria are discussed. The first criterion is the probability that the total reward is not greater than a given initial threshold value, and the second is the probability that the total reward is less than it. Our first (resp. second) optimization problem is to minimize the first (resp. second) threshold probability. In these problems, the threshold value can be read as a permissible level of the total reward incurred in reaching a goal (the target set); that is, we want to reach this set while keeping the total reward above the level, if possible. For both problems, we 1) show that the optimal threshold probability is the unique solution to an optimality equation, 2) show that there exists an optimal deterministic stationary policy, and 3) give a value iteration and a policy space iteration. In addition, we prove that the first (resp. second) optimal threshold probability is a monotone increasing and right (resp. left) continuous function of the initial threshold value, and we propose a method to obtain an optimal policy and the optimal threshold probability in the first problem by using those of the second problem.
Key words: Markov decision process; Minimizing risk model; Threshold probability; Policy space iteration
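
To make the two criteria in the abstract concrete, the following is a minimal sketch on a hypothetical toy model; it is not the value iteration or policy space iteration of the paper. It assumes integer rewards that are strictly negative outside the target set, so that a plain recursion over the pair (state, remaining threshold) terminates. The model `TRANSITIONS`, the target set `TARGET`, and the function `min_threshold_prob` are illustrative names that do not come from the paper.

```python
from functools import lru_cache

# Hypothetical toy model (not from the paper):
# TRANSITIONS[state][action] = [(probability, next_state, reward), ...]
# Assumption: rewards are negative integers outside the target set.
TRANSITIONS = {
    "a": {"go": [(1.0, "s", -1)]},
    "s": {
        "safe": [(1.0, "g", -2)],
        "risky": [(0.5, "g", -1), (0.5, "g", -3)],
    },
}
TARGET = {"g"}  # the process stops here and earns no further reward


@lru_cache(maxsize=None)
def min_threshold_prob(state, r, strict=False):
    """Return (minimal probability, minimizing action).

    strict=False: minimize P(total reward <= r)  (first criterion)
    strict=True:  minimize P(total reward <  r)  (second criterion)
    """
    # The total reward is never positive, so the event is certain
    # as soon as the threshold r is met by a total reward of 0.
    if (0 < r) if strict else (0 <= r):
        return 1.0, None
    # In the target set the remaining total reward is exactly 0,
    # which does not meet the threshold at this point.
    if state in TARGET:
        return 0.0, None
    best = None
    for action, outcomes in TRANSITIONS[state].items():
        # After earning `reward`, the remaining total must meet r - reward.
        value = sum(
            p * min_threshold_prob(nxt, r - reward, strict)[0]
            for p, nxt, reward in outcomes
        )
        if best is None or value < best[0]:
            best = (value, action)
    return best


if __name__ == "__main__":
    for r in (-1, -2, -3, -4):
        print(r, min_threshold_prob("s", r), min_threshold_prob("s", r, strict=True))
```

In this toy model the minimizing action at state `s` changes with the threshold (`risky` for threshold -2, `safe` for threshold -3 under the first criterion), which illustrates why the threshold is carried along with the state. With integer rewards, the second criterion at threshold r coincides with the first at threshold r - 1; the paper's general relation between the two problems is not reproduced here.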