Abstract
The growth of information technology has ushered in the era of big data, and digital teaching and paperless office work in universities now face massive volumes of unstructured digital documents. To address retrieval over large collections of local documents, a full-text retrieval desktop client is designed and developed using Swing, Lucene, Tika, and the MMSeg algorithm. The client offers a friendly user experience, parses documents in different formats, and performs dictionary-based word segmentation. Using Swing's companion components, it displays query results as web pages inside the client and highlights the matched results. Subsequent experimental data verify the client's usability, indicating that it has some practical application value.
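The dictionary-based segmentation and result highlighting mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's code: it implements plain forward maximum matching, which corresponds only to the simple matching mode of MMSeg and omits the ambiguity-resolution rules of the full algorithm. The class name `DictSegmenter`, the `<b>` highlight tag, and the sample dictionary are assumptions made for illustration, and the highlighter does no HTML escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: forward maximum matching against an in-memory dictionary.
final class DictSegmenter {
    private final Set<String> dictionary;
    private final int maxWordLength;

    DictSegmenter(Set<String> dictionary) {
        this.dictionary = dictionary;
        int max = 1;
        for (String word : dictionary) {
            max = Math.max(max, word.length());
        }
        this.maxWordLength = max;
    }

    // At each position, take the longest dictionary word that starts there;
    // fall back to a single character when nothing matches.
    List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxWordLength);
            while (end > start + 1 && !dictionary.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end));
            start = end;
        }
        return tokens;
    }

    // Wrap every literal occurrence of the term in <b> tags so that an
    // HTML-capable component can render it highlighted (assumed markup).
    static String highlight(String text, String term) {
        if (term.isEmpty()) {
            return text;
        }
        return text.replace(term, "<b>" + term + "</b>");
    }
}
```

In a Swing client, a string produced by `highlight` could be embedded in an HTML fragment and shown in a component such as `JEditorPane` with content type `text/html`. A Lucene-based implementation would more likely rely on Lucene's own analyzer and highlighter classes than on hand-written helpers like these.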
Author
ZHANG Jun-fei (张俊飞), Guangzhou Medical University, Guangzhou 511436
Source
Modern Computer (《现代计算机》)
2018, No. 22, pp. 85-90 (6 pages)
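Funding
National Natural Science Foundation of China, Young Scientists Fund (No. 61603106)
2018 Guangzhou Higher Education Innovation and Entrepreneurship Education Project (No. 201709k56)
2017 Guangzhou Municipal Education Bureau Education and Teaching Reform Project for Municipal Universities (No. 2017A05)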