根据专业搜索引擎的特点,提出了一种新颖的基于词语共现与HITS算法的查询推荐算法QR-CH(Query Recom-mendation algorithm based on word Co-occurrence and HITS algorithm)。该算法一方面利用HITS算法对基于词语共现筛选出的关联词按...根据专业搜索引擎的特点,提出了一种新颖的基于词语共现与HITS算法的查询推荐算法QR-CH(Query Recom-mendation algorithm based on word Co-occurrence and HITS algorithm)。该算法一方面利用HITS算法对基于词语共现筛选出的关联词按语义关联性进行排序,选取排序靠前的关联词作为推荐词,提高了推荐词与原查询词的相关性;另一方面使用HITS算法排序关联文档,从查询结果文档集的角度来判断推荐是否冗余,降低了推荐词的冗余性。该算法将推荐相关的信息存储到知识树中,利用知识树实现查询推荐。实验结果表明QR-CH算法在推荐词的相关性和冗余词的判断方面均优于文献中已有的类似算法。展开更多
In the data retrieval process of the Data recommendation system,the matching prediction and similarity identification take place a major role in the ontology.In that,there are several methods to improve the retrieving...In the data retrieval process of the Data recommendation system,the matching prediction and similarity identification take place a major role in the ontology.In that,there are several methods to improve the retrieving process with improved accuracy and to reduce the searching time.Since,in the data recommendation system,this type of data searching becomes complex to search for the best matching for given query data and fails in the accuracy of the query recommendation process.To improve the performance of data validation,this paper proposed a novel model of data similarity estimation and clustering method to retrieve the relevant data with the best matching in the big data processing.In this paper advanced model of the Logarithmic Directionality Texture Pattern(LDTP)method with a Metaheuristic Pattern Searching(MPS)system was used to estimate the similarity between the query data in the entire database.The overall work was implemented for the application of the data recommendation process.These are all indexed and grouped as a cluster to form a paged format of database structure which can reduce the computation time while at the searching period.Also,with the help of a neural network,the relevancies of feature attributes in the database are predicted,and the matching index was sorted to provide the recommended data for given query data.This was achieved by using the Distributional Recurrent Neural Network(DRNN).This is an enhanced model of Neural Network technology to find the relevancy based on the correlation factor of the feature set.The training process of the DRNN classifier was carried out by estimating the correlation factor of the attributes of the dataset.These are formed as clusters and paged with proper indexing based on the MPS parameter of similarity metric.The overall performance of the proposed work can be evaluated by varying the size of the training database by 60%,70%,and 80%.The parameters that are considered for performance analysis are Precision,Recall,F1-score and the accuracy of data retrie展开更多
大规模数据集已经超过TB和PB级,现有的技术可以收集和存储大量的信息。虽然数据库管理系统一直在不断提高提供复杂的多种数据管理的能力,但是管理查询工具并不能满足大数据的需求,如何精准理解和探索这些大规模数据集仍然是一个巨大的...大规模数据集已经超过TB和PB级,现有的技术可以收集和存储大量的信息。虽然数据库管理系统一直在不断提高提供复杂的多种数据管理的能力,但是管理查询工具并不能满足大数据的需求,如何精准理解和探索这些大规模数据集仍然是一个巨大的挑战。交互式数据探索(interactive data exploration,IDE)的关注点是强调交互、探索和发现,能让用户从海量的数据中用最小的代价更精确地找到他们需要的信息。首先对交互式数据探索及其应用背景进行了介绍,总结了通用的探索模型和IDE的特点,分析了交互式数据探索中的查询推荐技术和查询结果优化技术的现状;随后分别对IDE原型系统进行了分析和比较;最后给出了关于交互式数据探索技术的总结和展望。展开更多
文摘根据专业搜索引擎的特点,提出了一种新颖的基于词语共现与HITS算法的查询推荐算法QR-CH(Query Recom-mendation algorithm based on word Co-occurrence and HITS algorithm)。该算法一方面利用HITS算法对基于词语共现筛选出的关联词按语义关联性进行排序,选取排序靠前的关联词作为推荐词,提高了推荐词与原查询词的相关性;另一方面使用HITS算法排序关联文档,从查询结果文档集的角度来判断推荐是否冗余,降低了推荐词的冗余性。该算法将推荐相关的信息存储到知识树中,利用知识树实现查询推荐。实验结果表明QR-CH算法在推荐词的相关性和冗余词的判断方面均优于文献中已有的类似算法。
文摘In the data retrieval process of the Data recommendation system,the matching prediction and similarity identification take place a major role in the ontology.In that,there are several methods to improve the retrieving process with improved accuracy and to reduce the searching time.Since,in the data recommendation system,this type of data searching becomes complex to search for the best matching for given query data and fails in the accuracy of the query recommendation process.To improve the performance of data validation,this paper proposed a novel model of data similarity estimation and clustering method to retrieve the relevant data with the best matching in the big data processing.In this paper advanced model of the Logarithmic Directionality Texture Pattern(LDTP)method with a Metaheuristic Pattern Searching(MPS)system was used to estimate the similarity between the query data in the entire database.The overall work was implemented for the application of the data recommendation process.These are all indexed and grouped as a cluster to form a paged format of database structure which can reduce the computation time while at the searching period.Also,with the help of a neural network,the relevancies of feature attributes in the database are predicted,and the matching index was sorted to provide the recommended data for given query data.This was achieved by using the Distributional Recurrent Neural Network(DRNN).This is an enhanced model of Neural Network technology to find the relevancy based on the correlation factor of the feature set.The training process of the DRNN classifier was carried out by estimating the correlation factor of the attributes of the dataset.These are formed as clusters and paged with proper indexing based on the MPS parameter of similarity metric.The overall performance of the proposed work can be evaluated by varying the size of the training database by 60%,70%,and 80%.The parameters that are considered for performance analysis are Precision,Recall,F1-score and the accuracy of data retrie
文摘大规模数据集已经超过TB和PB级,现有的技术可以收集和存储大量的信息。虽然数据库管理系统一直在不断提高提供复杂的多种数据管理的能力,但是管理查询工具并不能满足大数据的需求,如何精准理解和探索这些大规模数据集仍然是一个巨大的挑战。交互式数据探索(interactive data exploration,IDE)的关注点是强调交互、探索和发现,能让用户从海量的数据中用最小的代价更精确地找到他们需要的信息。首先对交互式数据探索及其应用背景进行了介绍,总结了通用的探索模型和IDE的特点,分析了交互式数据探索中的查询推荐技术和查询结果优化技术的现状;随后分别对IDE原型系统进行了分析和比较;最后给出了关于交互式数据探索技术的总结和展望。