Abstract
【Purpose/significance】Starting from semantic mining of multiple policy texts on the theme of open government data, this study identifies the semantic relationships among the contents of those texts and explores a method that reduces manual intervention and automates the analysis of synergy across multiple policy texts.【Method/process】An association rule algorithm from data mining is applied to preprocessed open government data policy texts for semantic mining, and the synergy among the policy texts is analyzed on the basis of the valid strong association rules obtained.【Result/conclusion】With multiple policy texts on open government data as the research object, the number of valid strong association rules is relatively stable when the confidence is set to 0.7 and the lift is greater than 3. Association rule analysis of the policy texts at different levels yields conclusions that largely agree with manual analysis, which verifies that the method can be applied to quantitative research on the semantic synergy of multiple policy texts.【Innovation/limitation】The association rule algorithm from data mining is used to carry out knowledge reasoning on the synergy of multiple data policy texts, effectively achieving automated semantic computation. In the experiment, the completeness of the policy vocabulary, the data preprocessing procedure, and the parameter settings all affect the accuracy of the results, so the influence of manual intervention needs to be reduced further.
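The thresholds reported in the abstract (confidence 0.7, lift greater than 3) can be illustrated with a minimal sketch. It is not the authors' implementation: the term sets are invented toy data, the function name `mine_pair_rules` and the `min_support` value are assumptions, the confidence value is treated as a minimum threshold, and only single-term rules A -> B are considered rather than a full Apriori-style search over larger itemsets.

```python
from itertools import permutations


def mine_pair_rules(transactions, min_support=0.1, min_confidence=0.7, min_lift=3.0):
    """Return rules (a, b, support, confidence, lift) for single-term rules a -> b.

    Hypothetical helper for illustration only; thresholds mirror the abstract
    (confidence 0.7, lift > 3), while min_support is an assumed value.
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for transaction in transactions:
        items = set(transaction)
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        # Ordered pairs, so a -> b and b -> a are counted as separate rules.
        for a, b in permutations(sorted(items), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                   # P(a and b)
        confidence = count / item_count[a]    # P(b | a)
        # lift = P(b | a) / P(b), written in one step to limit rounding error
        lift = count * n / (item_count[a] * item_count[b])
        if support >= min_support and confidence >= min_confidence and lift > min_lift:
            rules.append((a, b, support, confidence, lift))
    # Strongest lift first, then alphabetical for a stable order.
    return sorted(rules, key=lambda r: (-r[4], r[0], r[1]))


# Toy "transactions": each set stands for the policy terms found in one
# text segment (invented data, not taken from the paper).
segments = [
    {"open data", "privacy", "security"},
    {"open data", "privacy", "security"},
    {"open data", "platform"},
    {"open data", "platform"},
    {"open data", "sharing"},
    {"open data", "sharing"},
    {"open data", "standards"},
    {"open data", "standards"},
    {"open data", "licensing"},
    {"open data", "licensing"},
]

for a, b, support, confidence, lift in mine_pair_rules(segments):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

In this toy data, a term that appears in every segment (such as "open data") can never be the consequent of a rule with lift above 1, which is why a lift threshold filters out trivially frequent terms.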
Authors
马海群
刘兴丽
韩娜
MA Hai-qun; LIU Xing-li; HAN Na (Research Center of Information Resources Management, Harbin 150080, China; School of Computer and Information Engineering, Heilongjiang University of Science and Technology, Harbin 150020, China)
Source
《情报科学》
CSSCI
PKU Core Journal (北大核心)
2022, Issue 4, pp. 3-8, 17 (7 pages in total)
Information Science
Funding
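Major Project of the National Social Science Fund of China, "Research on the System and Capacity Building for the Open Use of Public Data for Digital Development" (21&ZD336).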
Keywords
association rules
multi-policy synergy
open government data
text semantics
quantitative research