摘要
LPI对于局部流形结构是优化的,但在时空上运行效率较低,使其很难应用于大型数据集。基于LPI算法,提出了一种优化的LPI算法FLPI,它将LPI问题分解为一个图嵌入问题和一个正则最小二乘问题,避免了稠密矩阵的特征值分解,显著减少了计算复杂度。此外,在监督环境下,利用一个特别设计的图,使FLPI只需要解决正则最小二乘问题,进一步减少了时空开销。实时数据集实验结果显示,FLPI获得了相似或优于LPI的结果,且运行速度明显提升。
LPI is optimal in the sense of local manifold structure. However, LPI is not efficient in time and memory, which makes it difficult to be applied to very large data set. Therefore, an optimal algorithm called FLPI was proposed. FLPI decomposed the LPI problem into a graph embedding problem plus a regularized least squares problem. Such modification avoids eigen decomposition of dense matrices and can significantly reduce both time and memory cost in computation. Moreover, with a specifically designed graph in supervised situation, LPI only needs to solve the regularized least squares problem which is a further saving of time and memory. Experimental results on real data show that FLPI obtains similar or better results compared to LPI and it is significantly faster.
出处
《计算机应用》
CSCD
北大核心
2008年第6期1566-1569,1574,共5页
journal of Computer Applications
基金
国家自然科学基金资助项目(NSFC60273094)
宁波市自然科学基金资助项目(2006A610012)
关键词
局部保留索引
潜在语意索引
文档索引
维度归约
Locality Preserving Indexing (LPI)
Latent Semantic Indexing (LSI)
document indexing
dimensionality reduction