Abstract
High-dimensional OLAP data sets contain an overwhelming amount of information of uneven quality, which makes it hard for users to choose a suitable set of dimensions for their queries and thereby reduces the efficiency and accuracy of decision making. To address this problem, variable selection is applied to OLAP query recommendation. The goal is to identify, in a simulated OLAP data set holding massive high-dimensional information, the noise attributes that are unrelated to the measure attributes and the dimension attributes that are correlated with one another, so that the query scope shrinks while the partitioning of the measure-attribute space remains accurate. To this end, a nonparametric variable selection algorithm, FFTB, is designed to support OLAP query recommendation, and a simulation model of variable-selection-based OLAP query recommendation is built, in which a heuristic method finds the dimensions closely related to the query objective. The data environment and the query recommendation procedure are simulated in detail to verify the usability and effectiveness of the approach. The simulation experiments show that variable selection effectively reduces the OLAP query space without sacrificing accuracy, which helps decision makers pick out the key dimensions from large volumes of data and thus improves decision efficiency.
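The abstract does not give the details of FFTB, so the following is only a minimal illustrative sketch of the general idea it describes, not the authors' algorithm. It assumes a greedy forward search that scores a candidate dimension by how much grouping on it reduces the within-cell variation of the measure. The function names, the `min_gain` threshold, and the toy data (`region`, `region_copy`, `product`, `noise`, `sales`) are all hypothetical.

```python
from collections import defaultdict
from itertools import product as cartesian


def within_cell_sse(rows, dims, measure):
    """Sum of squared deviations of the measure from its cell mean,
    where cells come from grouping the rows on `dims` (a plain GROUP BY)."""
    cells = defaultdict(list)
    for row in rows:
        cells[tuple(row[d] for d in dims)].append(row[measure])
    sse = 0.0
    for values in cells.values():
        mean = sum(values) / len(values)
        sse += sum((v - mean) ** 2 for v in values)
    return sse


def select_dimensions(rows, candidates, measure, min_gain=0.01):
    """Greedy forward selection (an assumed stand-in, not FFTB).

    Repeatedly adds the dimension that most reduces the within-cell SSE and
    stops when the best reduction falls below `min_gain` of the total SSE.
    """
    total = within_cell_sse(rows, [], measure)
    selected = []
    if total == 0:
        return selected
    current = total
    while True:
        best_dim, best_sse = None, current
        for dim in candidates:
            if dim in selected:
                continue
            sse = within_cell_sse(rows, selected + [dim], measure)
            if sse < best_sse:  # strict: ties keep the earlier candidate
                best_dim, best_sse = dim, sse
        if best_dim is None or (current - best_sse) / total < min_gain:
            break
        selected.append(best_dim)
        current = best_sse
    return selected


# Hypothetical toy cube: `noise` has no effect on the measure and
# `region_copy` duplicates `region` (a perfectly correlated dimension).
rows = [
    {"region": r, "region_copy": r, "product": p, "noise": n,
     "sales": 10 * r + 3 * p}
    for r, p, n in cartesian(range(3), range(3), range(2))
]
CANDIDATES = ["region", "region_copy", "product", "noise"]
selected = select_dimensions(rows, CANDIDATES, "sales")
print(selected)
```

On this toy data the sketch keeps `region` and `product` and drops both the noise dimension and the redundant copy. Which of two perfectly correlated dimensions survives depends only on the candidate order, and real data would need a more careful stopping rule than a fixed threshold.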
Source
Journal of System Simulation (《系统仿真学报》)
CAS
CSCD
PKU Core (北大核心)
2013, No. 11, pp. 2534-2539 (6 pages)
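Funding
National High Technology Research and Development Program of China (863 Program) (2011AA040501)
National Natural Science Foundation of China (71271071)
Doctoral Special Research Fund of Hefei University of Technology (2011HGBZ1310)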
Keywords
OLAP query
high-dimensional data
variable selection
query recommendation
simulation experiment