Abstract
Internet public-opinion monitoring must handle large volumes of time-sensitive information and typically concentrates on particular areas, such as social hot spots and focal issues, or reactionary and pornographic content. To address these characteristics, this paper introduces the idea of density-based clustering into the traditional K-Means algorithm and proposes a new clustering algorithm, DK. A Chinese text clustering model is built on the DK algorithm and applied to proactive hotspot discovery in information published by Internet media. Experiments evaluate the model's performance and confirm its effectiveness and practicality.
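The abstract does not spell out how DK combines density with K-Means. Purely as an illustration, one common way to bring a density idea into K-Means is to choose the initial centers from high-density points and then run the usual K-Means iterations. The sketch below makes that assumption; the function names `density_seeds` and `kmeans`, the radius parameter `eps`, and the toy 2-D points are all hypothetical and are not taken from the paper, so this should not be read as the authors' DK algorithm.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def density_seeds(points, k, eps):
    """Pick up to k initial centers from dense regions (assumed scheme, not the paper's)."""
    # Density of a point = number of points within radius eps (itself included).
    density = [sum(1 for q in points if dist(p, q) <= eps) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    seeds = []
    for i in order:
        # Skip candidates that fall inside the eps-neighbourhood of a chosen seed.
        if all(dist(points[i], s) > eps for s in seeds):
            seeds.append(points[i])
        if len(seeds) == k:
            break
    return seeds  # may hold fewer than k seeds if the data has fewer dense regions

def kmeans(points, seeds, iters=20):
    """Standard K-Means (Lloyd) iterations starting from the given seeds."""
    centers = [tuple(s) for s in seeds]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda c: dist(p, centers[c])) for p in points]
    return centers, labels

# Toy data: two well-separated groups of 2-D points (hypothetical).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = density_seeds(points, k=2, eps=1.5)
centers, labels = kmeans(points, seeds)
print(seeds, labels)
```

For text clustering, the points would be document vectors (for example, term-weight vectors of segmented Chinese text) rather than 2-D coordinates, and the distance function would likely be replaced by a similarity measure suited to sparse vectors; those choices are likewise assumptions here.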
Source
Computer Technology and Development (《计算机技术与发展》)
2008, Issue 9, pp. 1-4 (4 pages)
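Funding
Key Project in Information Technology, "Dengshan (Mountain-Climbing) Action Plan" of the Science and Technology Commission of Shanghai Municipality (065115020)
National Natural Science Foundation of China (60502032)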
Keywords
K-Means
DK
Chinese text clustering
Internet public-opinion control (information control and supervision)