FLPI: An optimized algorithm for document indexing based on LPI
Abstract: Locality Preserving Indexing (LPI) is optimal with respect to local manifold structure, but it is inefficient in both time and memory, which makes it difficult to apply to very large data sets. This paper proposes FLPI, an optimized algorithm based on LPI. FLPI decomposes the LPI problem into a graph embedding problem and a regularized least squares problem. This avoids the eigendecomposition of dense matrices and significantly reduces both the time and the memory cost of the computation. Moreover, in the supervised setting, a specially designed graph means that FLPI only needs to solve the regularized least squares problem, which saves further time and memory. Experimental results on real data sets show that FLPI obtains results similar to or better than those of LPI and runs significantly faster.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 6, pp. 1566-1569, 1574 (5 pages)
Funding: National Natural Science Foundation of China (NSFC60273094); Natural Science Foundation of Ningbo (2006A610012)
Keywords: Locality Preserving Indexing (LPI); Latent Semantic Indexing (LSI); document indexing; dimensionality reduction
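
The two-step decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The abstract gives no construction details, so the function name `flpi_sketch`, the cosine k-nearest-neighbour graph, and the parameters `n_neighbors` and `alpha` are all assumptions made for illustration. For brevity the sketch also calls a dense eigensolver on the graph matrix. On a large corpus the graph would be sparse and a sparse eigensolver would take its place, which is presumably where the savings over dense eigendecomposition come from.

```python
import numpy as np

def flpi_sketch(X, n_components=2, n_neighbors=5, alpha=0.1):
    """Illustrative two-step embedding (hypothetical helper, not the paper's code).

    X: (n_docs, n_terms) document-term matrix.
    Returns (A, Y): A is the (n_terms, n_components) projection matrix,
    Y is the (n_docs, n_components) graph embedding used as regression targets.
    """
    n_docs, n_terms = X.shape

    # Assumed graph: cosine similarity, k strongest neighbours per document.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)
    idx = np.argsort(-S, axis=1)[:, :n_neighbors]
    rows = np.arange(n_docs)[:, None]
    W = np.zeros_like(S)
    W[rows, idx] = S[rows, idx]
    W = np.maximum(W, W.T)  # symmetrise the neighbour graph

    # Step 1: graph embedding. Solve W y = lambda * D y through the
    # symmetric matrix D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # Skip the top eigenvector (constant on a connected graph) and keep
    # the next n_components, largest eigenvalue first.
    top = eigvecs[:, -(n_components + 1):-1][:, ::-1]
    Y = d_inv_sqrt[:, None] * top

    # Step 2: regularised least squares (ridge regression) per component:
    # minimise ||X a - y||^2 + alpha * ||a||^2.
    A = np.linalg.solve(X.T @ X + alpha * np.eye(n_terms), X.T @ Y)
    return A, Y
```

Under these assumptions, a new document vector `x` of length `n_terms` would be indexed as `x @ A`. Only step 1 involves an eigenproblem, and it is posed on the neighbour graph rather than on dense products of the term matrix. Step 2 is an ordinary linear solve.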
