Cite this article: SUN Zi-jian, TANG Jian, QIAO Jun-fei. Double window concept drift detection method for modeling of difficult-to-measure parameter in industrial processes[J]. Control Theory & Applications, 2021, 38(12): 1979-1992.
Double window concept drift detection method for modeling of difficult-to-measure parameter in industrial processes
Received: 2020-09-11  Revised: 2021-03-08
DOI  10.7641/CTA.2021.00616
Issue  2021, 38(12): 1979-1992
Keywords  concept drift  data window  statistical test  sample distribution  soft sensor model
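Funding  Supported by the National Natural Science Foundation of China (62073006, 62021003, 61890930–5), the Beijing Natural Science Foundation (4212032, 4192009), the National Key Research and Development Program of the Ministry of Science and Technology (2018YFC1900800–5), and the State (Beijing) Key Laboratory of Process Automation in Mining & Metallurgy (BGRIMM–KZSKL–2020–02).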
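Author  Affiliation  E-mail
SUN Zi-jian  Beijing University of Technology  sunzj@emails.bjut.edu.cn
TANG Jian  Beijing University of Technology
QIAO Jun-fei*  Beijing University of Technology  junfeiq@bjut.edu.cn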
Abstract
      Industrial process data exhibit inherent concept drift, which degrades the performance of soft sensor models, so drift samples must be identified in order to update the model effectively. To address these problems, a double-window concept drift detection method for modeling difficult-to-measure parameters in industrial processes is proposed. First, support vector regression is used in the outlier detection window to obtain the outlier samples contained in the real-time process data. Next, the Euclidean distance between the outlier samples and the historical sample set is calculated in the distribution detection window. Finally, several distribution test methods are combined to define a new test drift degree index that characterizes the distribution change carried by the outlier samples, which allows drift samples to be identified effectively. Experiments on synthetic and real industrial process data sets verify the effectiveness of the proposed method and show that it outperforms existing methods.
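The three steps summarized in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not specify: an ordinary least-squares line stands in for the support vector regression, nearest-neighbour Euclidean distance is used as the distance to the historical set, a single two-sample Kolmogorov-Smirnov statistic stands in for the combined test drift degree index, and the names and thresholds `k_sigma`, `min_outliers`, and `index_threshold` are hypothetical.

```python
import bisect
import math
import random
import statistics


def fit_line(xs, ys):
    """Least-squares line y = a*x + b (assumed stand-in for the paper's SVR)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    return a, my - a * mx


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, v) / len(a) - bisect.bisect_right(b, v) / len(b))
        for v in a + b
    )


def nearest_distance(point, others):
    """Euclidean distance from `point` to its nearest neighbour in `others`."""
    return min(math.dist(point, o) for o in others)


def detect_drift(history, batch, k_sigma=3.0, min_outliers=5, index_threshold=0.5):
    """Return (outliers, drift_index, drift_flag) for a batch of (x, y) samples.

    All three thresholds are illustrative assumptions, not values from the paper.
    """
    xs, ys = zip(*history)
    a, b = fit_line(xs, ys)
    resid_std = statistics.pstdev([y - (a * x + b) for x, y in history])

    # Window 1 (outlier detection): samples the regression model predicts poorly.
    outliers = [(x, y) for x, y in batch if abs(y - (a * x + b)) > k_sigma * resid_std]
    if len(outliers) < min_outliers:
        return outliers, 0.0, False  # too few outliers to test a distribution

    # Window 2 (distribution detection): compare outlier-to-history distances
    # with the leave-one-out spacing of the historical samples themselves.
    baseline = [
        nearest_distance(p, history[:i] + history[i + 1:])
        for i, p in enumerate(history)
    ]
    observed = [nearest_distance(p, history) for p in outliers]
    drift_index = ks_statistic(observed, baseline)
    return outliers, drift_index, drift_index > index_threshold


def make_samples(n, intercept, rng):
    """Synthetic process y = 2x + intercept + noise (illustrative data only)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + intercept + rng.gauss(0, 0.3)) for x in xs]


rng = random.Random(0)
history = make_samples(200, 1.0, rng)
stable = make_samples(50, 1.0, rng)   # same concept as the history
shifted = make_samples(50, 7.0, rng)  # intercept moved: a changed concept

_, _, stable_flag = detect_drift(history, stable)
outliers, index, shifted_flag = detect_drift(history, shifted)
print(stable_flag, len(outliers), round(index, 2), shifted_flag)
```

Requiring a minimum number of outliers before the distribution test runs is a design choice of this sketch: a KS statistic computed from one or two isolated outliers is always large, so isolated noisy samples would otherwise be reported as drift.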