Abstract
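[Purpose/Significance] Patent information is the crystallization of human scientific and technological progress, and as society develops it will play an increasingly important role in promoting scientific and technological innovation. Clustering techniques can automatically classify massive volumes of patent information; besides bringing order to how the information is merged and managed, this helps users efficiently and comprehensively obtain consolidated patent information for a given technical field, which is of considerable practical significance. Traditional clustering methods, however, fall short in efficiency and accuracy. [Method/Process] This paper cleans and analyzes access log data from a patent information service website (中国科学院知识产权网, the intellectual property network of the Chinese Academy of Sciences) to generate patent click sequences and, building on a deep learning word embedding model, designs the Patent Freq2Vec model to compute patent association information. [Result/Conclusion] Analyzing access log data with the Patent Freq2Vec model yields related-patent information and enables patent clustering, with clustering accuracy higher than that of traditional methods.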
[Purpose/Significance] Patent information is the fruit of scientific and technological progress. As society develops, it will play an increasingly important role in promoting scientific and technological innovation. Patent clustering analysis can aggregate isolated pieces of information at different levels of aggregation, transforming ordinary information into valuable patent competitive intelligence. Traditional clustering methods, however, fall short in efficiency and accuracy. [Method/Process] Access log data from a patent information service website (Intellectual Property Network of the Chinese Academy of Sciences) were cleaned and analyzed to generate patent click sequences. These sequences were fed into PatentFreq2Vec, a model based on word embedding, to obtain related-patent information through its learning algorithm. [Result/Conclusion] The approach can cluster patents and improve the accuracy of patent clustering.
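The pipeline described in the abstract (log records, then click sequences, then patent relatedness) can be illustrated with a small sketch. This is not the PatentFreq2Vec model, whose details the abstract does not give. It swaps in plain co-occurrence counts and cosine similarity as a stand-in for a learned word embedding, and the log format, the `LOG` sample data, and all function names are hypothetical.

```python
from collections import defaultdict
import math

# Hypothetical cleaned log records: (session_id, timestamp, patent_id).
LOG = [
    ("s1", 1, "P001"), ("s1", 2, "P002"), ("s1", 3, "P003"),
    ("s2", 1, "P001"), ("s2", 2, "P002"),
    ("s3", 1, "P004"), ("s3", 2, "P005"),
    ("s4", 1, "P004"), ("s4", 2, "P005"), ("s4", 3, "P004"),
]

def click_sequences(log):
    """Group records by session and order each session's clicks by timestamp."""
    sessions = defaultdict(list)
    for session_id, ts, patent_id in log:
        sessions[session_id].append((ts, patent_id))
    return {sid: [pid for _, pid in sorted(clicks)]
            for sid, clicks in sessions.items()}

def cooccurrence_vectors(sequences, window=2):
    """For each patent, count the other patents clicked within `window` steps.

    Stand-in for an embedding: each patent is represented by its context counts.
    """
    vectors = defaultdict(lambda: defaultdict(int))
    for seq in sequences.values():
        for i, center in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i and seq[j] != center:
                    vectors[center][seq[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(count * v.get(key, 0) for key, count in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

sequences = click_sequences(LOG)
vectors = cooccurrence_vectors(sequences)
# Patents that share click contexts score higher than patents that never do.
related = cosine(vectors["P001"], vectors["P002"])
unrelated = cosine(vectors["P001"], vectors["P004"])
```

In this toy data, `P001` and `P002` share a context patent and so score above zero, while `P001` and `P004` never appear in overlapping sessions. A clustering step would then group patents by these similarities; a trained embedding model would replace the raw counts.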
Authors
文奕
陈文杰
张鑫
杨宁
赵爽
Wen Yi; Chen Wenjie; Zhang Xin; Yang Ning; Zhao Shuang (Chengdu Library of the Academy of Sciences, Chengdu 610041, China)
Source
《现代情报》
CSSCI
2018, Issue 4, pp. 112-117 (6 pages)
Journal of Modern Information
Funding
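Chinese Academy of Sciences special project for literature and information capacity building, "Construction of an Intelligence Computing and Analysis Framework System" (Project No. P21760)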
Keywords
patent
clustering
deep learning
word embedding
access log