Laboratory Information Management Systems (LIMS) have gradually become widespread in environmental monitoring, and LIMS databases now hold large volumes of environmental monitoring information and data. Compared with traditional statistical methods, running SELECT operations against a LIMS database with SQL statements can greatly improve the efficiency and accuracy of statistics.
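As an illustration of the kind of SELECT-based statistics described above, the following is a minimal sketch using Python's built-in `sqlite3` module. The table `monitoring_results` and its columns (`site`, `analyte`, `value`) are hypothetical and are not taken from any real LIMS schema; an actual LIMS database would use its own table layout and, most likely, a different database engine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE monitoring_results (site TEXT, analyte TEXT, value REAL)"
    )
    # Made-up sample rows, for demonstration only.
    conn.executemany(
        "INSERT INTO monitoring_results VALUES (?, ?, ?)",
        [
            ("S1", "COD", 20.0),
            ("S1", "COD", 30.0),
            ("S2", "COD", 15.0),
            ("S1", "NH3-N", 0.5),
        ],
    )
    conn.commit()
    return conn


def summarize(conn):
    """Return (site, analyte, sample count, mean value) using one aggregate SELECT."""
    cur = conn.execute(
        """
        SELECT site, analyte, COUNT(*) AS n, AVG(value) AS mean_value
        FROM monitoring_results
        GROUP BY site, analyte
        ORDER BY site, analyte
        """
    )
    return cur.fetchall()


if __name__ == "__main__":
    for row in summarize(build_demo_db()):
        print(row)
```

A single grouped query replaces exporting the records and tallying them by hand, which is where the efficiency and accuracy gains mentioned in the abstract would come from.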
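Faced with massive volumes of wargame data, traditional interface-based querying can no longer meet commanders' requirements for fast, comprehensive, and precise data queries. Through an in-depth analysis of the characteristics of wargame data and the shortcomings of mainstream NL2SQL (natural language to structured query language) models, a solution suited to intelligent statistical querying of wargame data is proposed. To address the lack of domain datasets, a wargame dataset construction scheme based on human-machine collaboration and dynamic iteration is proposed; to address the time sensitivity of wargame query questions, a "rules + deep learning" method for recognizing and normalizing time expressions is proposed; to address the difficulty of extracting query values from large volumes of wargame data, the value extraction and SQL generation architecture of the Bridge model is modified and improved. Applied together, these schemes raise the exact-match accuracy of wargame data queries to above 75%.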
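The rule-based half of a "rules + deep learning" time-expression normalizer might look roughly like the sketch below. It is only an assumption about what such rules could be: the abstract does not give the patterns, so the function `normalize_time`, the English phrases it handles, and the returned interval format are all hypothetical, and the deep-learning component is not shown.

```python
import re
from datetime import datetime, timedelta

# Maps the unit word in a phrase to the matching timedelta keyword argument.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days"}


def normalize_time(expr, now):
    """Map a relative time phrase to a (start, end) datetime pair.

    Returns None when no rule matches, which is where a learned model
    could take over in a hybrid system.
    """
    text = expr.strip().lower()

    # Rule 1: "last 3 hours", "past 2 days", "last 1 minute"
    m = re.fullmatch(r"(?:last|past) (\d+) (minute|hour|day)s?", text)
    if m:
        delta = timedelta(**{_UNITS[m.group(2)]: int(m.group(1))})
        return (now - delta, now)

    # Rule 2: "yesterday" covers the whole previous calendar day
    if text == "yesterday":
        start = (now - timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
        return (start, start + timedelta(days=1))

    return None


if __name__ == "__main__":
    reference = datetime(2024, 1, 2, 12, 0)
    print(normalize_time("last 3 hours", reference))
```

The resulting interval could then be substituted into the WHERE clause of the generated SQL as explicit timestamps.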
Information retrieval (IR) systems are designed to help information seekers retrieve relevant information from a vast collection of documents, and it was this need that gave rise to IR systems. Although many IR systems exist, they cannot meet all users' expectations. Users differ in their level of knowledge, so they express queries in different ways. As a result, a system may miss the core meaning of a user's query and retrieve unsatisfactory results. This happens mainly because of the ambiguity of words in natural language and the mismatch between the expressions used by users and by authors. The ambiguities in the Amharic language degrade the performance of Amharic IR systems; they include spelling variants of the same word, polysemous terms, and synonymous terms. Users who are not fully knowledgeable about the information domain mostly formulate weak queries and end up frustrated with the results an IR system returns. This research aims to improve on the recall of previous work. A statistical co-occurrence technique is used to expand query terms. The main reason for performing query expansion is to return documents relevant to the user's query that satisfy their information need. The statistical co-occurrence method considers terms that frequently appear together with the query term, regardless of their position. The proposed technique was tested on a prototype system, and its results were compared with those of the previous study. It improved recall by 6% and F-measure by 2%. Hence, the statistical co-occurrence method outperformed the bi-gram-based IR system.
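Position-independent co-occurrence query expansion can be sketched as follows. This is a simplified illustration and not the authors' implementation: it counts document-level co-occurrence on whitespace tokens, the function names and the parameter `k` are assumptions, and a real Amharic system would also need normalization of spelling variants and stemming.

```python
from collections import Counter


def cooccurrence_counts(docs, term):
    """Count, for each other term, how many documents it shares with `term`."""
    counts = Counter()
    for doc in docs:
        tokens = set(doc.lower().split())
        if term in tokens:
            # Position inside the document is ignored on purpose.
            counts.update(tokens - {term})
    return counts


def expand_query(query, docs, k=2):
    """Append up to k of the most frequently co-occurring terms per query term."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        counts = cooccurrence_counts(docs, term)
        # Sort by descending count, then alphabetically for a stable order.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for candidate, _ in ranked[:k]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    corpus = [
        "coffee export ethiopia",
        "coffee price export",
        "football match",
    ]
    print(expand_query("coffee", corpus, k=1))
```

Adding terms that often appear alongside the query term lets the expanded query match relevant documents that never use the user's original wording, which is why this kind of expansion mainly affects recall.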