An Efficient Dynamic Conceptual Clustering Algorithm for Data Mining

Cited by: 16
Abstract: Conceptual clustering is suited to data mining tasks in which domain knowledge is incomplete or absent. Original conceptual clustering methods, however, have difficulty handling data objects described by numerical attribute values. This paper defines a distance criterion function based on semantics and uses domain knowledge to conceptualize (discretize) continuous attribute values. For data objects described by a mix of categorical and numerical attributes, it proposes a dynamic conceptual clustering algorithm, DDCA (domain-based dynamic clustering algorithm). The algorithm determines the number of clusters automatically, revises each cluster center according to the frequency of attribute values within the cluster, and, through concept induction, explains the clustering output with conjunctive concept expressions. Experiments show that the semantic distance criterion function and the domain-knowledge-based dynamic conceptual clustering algorithm DDCA are effective and efficient.
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2001, No. 4, pp. 582-591 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (69835010)
Keywords: data mining; domain knowledge; dynamic conceptual clustering algorithm; data object; data set; database; semantic distance
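The abstract names four steps: conceptualizing numerical values with domain knowledge, measuring distance between objects, creating clusters dynamically, and revising centers by attribute-value frequency. The sketch below illustrates those steps only in broad outline and is not the DDCA algorithm from the paper. The abstract does not give the semantic distance function, so a simple attribute-mismatch ratio stands in for it. The age ranges, the threshold, the sample data, and every function name are hypothetical.

```python
from collections import Counter

# Hypothetical domain knowledge: half-open numeric ranges mapped to concept labels.
AGE_CONCEPTS = [(0, 30, "young"), (30, 60, "middle"), (60, 200, "senior")]
NAMES = ("age", "occupation", "area")

def conceptualize(value, concepts):
    """Return the label of the range [lo, hi) that contains value."""
    for lo, hi, label in concepts:
        if lo <= value < hi:
            return label
    raise ValueError(f"no concept covers {value}")

def distance(obj, center):
    """Stand-in distance: fraction of attributes on which the two disagree."""
    return sum(a != b for a, b in zip(obj, center)) / len(obj)

def mode_center(members):
    """Most frequent value of each attribute across the cluster members."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))

def cluster(objects, threshold):
    """Assign each object to its nearest cluster, or open a new cluster
    when no center lies within the threshold. The number of clusters is
    therefore not fixed in advance."""
    clusters = []
    for obj in objects:
        best = min(clusters, key=lambda c: distance(obj, c["center"]), default=None)
        if best is not None and distance(obj, best["center"]) <= threshold:
            best["members"].append(obj)
            best["center"] = mode_center(best["members"])
        else:
            clusters.append({"center": obj, "members": [obj]})
    return clusters

def describe(center, names):
    """Render a cluster center as a conjunction of attribute = value terms."""
    return " AND ".join(f"{n} = {v}" for n, v in zip(names, center))

# Made-up sample data: (age in years, occupation, area).
raw = [
    (25, "student", "urban"),
    (28, "student", "urban"),
    (65, "retired", "rural"),
    (70, "retired", "rural"),
    (27, "student", "rural"),
]
objects = [(conceptualize(age, AGE_CONCEPTS), job, area) for age, job, area in raw]
clusters = cluster(objects, threshold=0.4)
for c in clusters:
    print(len(c["members"]), describe(c["center"], NAMES))
```

In this sketch the threshold controls how many clusters appear, and the result depends on the order in which objects are processed. Whether DDCA shares either property cannot be determined from the abstract.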