Abstract
Emergency data described by rough-set attributes contain redundant features that lower mining efficiency, so an information-entropy-based deduplication mining algorithm for rough-set attribute emergency data is proposed. Rough set theory is combined with information entropy: the emergency data are first discretized, and attributes that have no effect on the conditional information entropy of the decision table are then reduced. The decision attribute set and the condition attribute set are defined, and the reduction is realized by selecting the entropy value with the smallest number of attribute combinations for the reduced attribute set B, which removes redundant features and completes the deduplication mining of the emergency data. Deduplication mining was carried out on emergency data from large ships. The results show that the algorithm effectively mines deduplicated emergency data related to ship turning ability, that the data growth-ratio feature can be used to analyze the influence of each factor on turning ability, and that mining efficiency is high: with 1400 records, the run time is only 0.33 s.
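The reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy backward elimination, the function names (`conditional_entropy`, `reduce_attributes`, `deduplicate`), and the toy ship table with its attribute names are all assumptions made for illustration, and the data are assumed to be already discretized. An attribute is dropped when removing it leaves the conditional entropy H(D | C) of the decision table unchanged; records are then deduplicated on the remaining attributes.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, cond_attrs, decision_attr):
    """Conditional entropy H(D | B) of a decision table given as a list of dicts."""
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)  # equivalence class under B
        groups[key][row[decision_attr]] += 1
    n = len(rows)
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size
            h -= (size / n) * p * log2(p)
    return h


def reduce_attributes(rows, cond_attrs, decision_attr, tol=1e-12):
    """Greedy backward elimination (an assumed strategy): drop every attribute
    whose removal leaves H(D | C) of the full table unchanged."""
    full = conditional_entropy(rows, cond_attrs, decision_attr)
    kept = list(cond_attrs)
    for a in cond_attrs:
        trial = [x for x in kept if x != a]
        if abs(conditional_entropy(rows, trial, decision_attr) - full) <= tol:
            kept = trial
    return kept


def deduplicate(rows, attrs):
    """Keep the first record for each distinct combination of values on attrs."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            unique.append({a: row[a] for a in attrs})
    return unique


# Hypothetical, already-discretized toy table (not data from the paper).
table = [
    {"speed": "low", "rudder": "small", "draft": "deep", "turning": "wide"},
    {"speed": "low", "rudder": "large", "draft": "deep", "turning": "tight"},
    {"speed": "high", "rudder": "small", "draft": "shallow", "turning": "wide"},
    {"speed": "high", "rudder": "large", "draft": "shallow", "turning": "tight"},
    {"speed": "high", "rudder": "large", "draft": "deep", "turning": "tight"},
]

reduct = reduce_attributes(table, ["speed", "rudder", "draft"], "turning")
records = deduplicate(table, reduct + ["turning"])
print(reduct, len(records))
```

Backward elimination depends on the order in which attributes are tried and does not guarantee a minimum-size reduct, so the selection criterion stated in the abstract may yield a different attribute set on real data.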
Authors
ZENG Wei-jia (曾维佳); QIN Fang (秦放); LI Lin (李琳); XU Peng (徐鹏)
School of Digital Technology, Dalian University of Science and Technology, Dalian, Liaoning 300384, China
Source
Computing Technology and Automation (《计算技术与自动化》)
2021, No. 4, pp. 64-68 (5 pages)
Keywords
information entropy
rough set attribute
emergency data
deduplication mining
discretization
reduction