Cite this article: LI Chun-sheng, WANG Yao-nan. New initialization method for cluster center [J]. 控制理论与应用 (Control Theory & Applications), 2010, 27(10): 1435-1440.
New initialization method for cluster center
Received: 2008-09-02  Revised: 2010-01-03
DOI  10.7641/j.issn.1000-8152.2010.10.CCTA080927
  2010,27(10):1435-1440
Keywords  cluster center initialization; minimum spanning tree; k-means algorithm
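Funding  Key Project of the National High Technology Research and Development Program of China ("863" Program) (2007AA04Z224); Key Program of the National Natural Science Foundation of China (60835004).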
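Author  Affiliation  E-mail
LI Chun-sheng*  School of Mathematics and Computational Science, Guangdong University of Business Studies; College of Electrical and Information Engineering, Hunan University  lcs0200731@yahoo.com.cn
WANG Yao-nan  College of Electrical and Information Engineering, Hunan University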
Abstract
      The k-means clustering algorithm is sensitive to its initial cluster centers and is easily trapped in a local optimum. No existing initialization method for cluster centers has been widely accepted. We assume that each cluster contains at least one dense region of data, and that dense regions in different clusters are farther apart than dense regions in the same cluster. Under this assumption, a minimum spanning tree is built on the data set, and rooted-tree principles are applied to it to search for dense regions and estimate their densities. Dense regions that are both of high density and sufficiently separated from one another are then selected, and points inside them serve as the initial cluster centers. In simulation experiments comparing the proposed method with existing methods, the proposed method performs better.
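
The abstract gives only the outline of the procedure: build a minimum spanning tree, estimate density on it, and keep dense, well-separated regions. The Python sketch below illustrates that outline under stated assumptions and is not the authors' algorithm. The function names `mst_edges` and `init_centers`, the density proxy (inverse of the mean length of a point's MST edges), and the greedy selection rule (density times distance to the nearest already-chosen center) are all hypothetical choices made for illustration. The rooted-tree search described in the paper is not reproduced here.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, length) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge length linking each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the point outside the tree that is cheapest to connect.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def init_centers(points, k):
    """Pick k <= len(points) initial centers that are dense and mutually separated."""
    n = len(points)
    incident = [[] for _ in range(n)]
    for i, j, d in mst_edges(points):
        incident[i].append(d)
        incident[j].append(d)
    # Assumed density proxy: inverse of the mean length of a point's MST edges.
    density = [len(ds) / (sum(ds) + 1e-12) for ds in incident]
    chosen = [max(range(n), key=lambda i: density[i])]
    while len(chosen) < k:
        # Assumed selection rule: trade density against distance to chosen centers.
        def score(i):
            return density[i] * min(math.dist(points[i], points[c]) for c in chosen)
        chosen.append(max((i for i in range(n) if i not in chosen), key=score))
    return [points[i] for i in chosen]

# Two well-separated toy clusters of four points each.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers = init_centers(pts, 2)
```

On this toy data the two returned centers fall in different clusters, which is the property the abstract asks of a good initialization. The returned points could then be passed to any k-means implementation as its starting centers. The sketch builds the tree in O(n²) time, which is adequate for small data sets only.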