Abstract
Frequent and high-utility itemset mining is an important task in data mining, in which the mined itemsets are measured by two metrics: support and utility. Among the methods proposed for this problem, evolutionary multi-objective methods can provide a set of high-quality solutions that meet the needs of different users, and they avoid the difficulty of choosing support and utility thresholds that traditional algorithms face. However, most existing multi-objective algorithms use 0-1 encoding, so the dimensionality of the decision space is proportional to the number of items in the dataset, and the curse of dimensionality arises on high-dimensional datasets. To address this, the paper designs an itemset reduction strategy that shrinks the search space by repeatedly removing unimportant items during evolution. Based on this strategy, it proposes a multi-objective evolutionary algorithm for high-dimensional frequent and high-utility itemset mining based on itemset reduction (IR-MOEA), together with a learning-based population repair strategy that adjusts the evolutionary direction of individuals that may have been over-reduced or under-reduced. In addition, an initialization strategy based on itemset fitness is proposed so that, early in evolution, the algorithm generates sparse solutions that benefit later evolution. Experimental results on multiple datasets show that the proposed algorithm outperforms existing state-of-the-art multi-objective optimization algorithms for mining frequent and high-utility itemsets, especially on high-dimensional datasets.
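The 0-1 encoding and the two objectives mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of support (fraction of transactions containing the itemset) and utility (quantity times unit profit, summed over those transactions). The toy database, the profit table, and the names `TRANSACTIONS`, `UNIT_PROFIT`, `decode`, and `evaluate` are all assumptions made for illustration; this is not the IR-MOEA implementation and does not include the reduction or repair strategies.

```python
from typing import Dict, List, Tuple

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]

# Hypothetical unit profit of each item.
UNIT_PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1, "d": 3}

# Fixed item order: bit k of an individual refers to ITEMS[k], so the
# decision vector has one dimension per item in the dataset.
ITEMS: List[str] = sorted(UNIT_PROFIT)


def decode(bits: List[int]) -> List[str]:
    """Turn a 0-1 vector into the itemset it encodes."""
    return [item for item, bit in zip(ITEMS, bits) if bit == 1]


def evaluate(bits: List[int]) -> Tuple[float, int]:
    """Return (support, utility) of the encoded itemset; both are maximized."""
    itemset = decode(bits)
    if not itemset:
        # The empty itemset is treated as a worthless solution here.
        return 0.0, 0
    # Transactions that contain every item of the itemset.
    containing = [t for t in TRANSACTIONS if all(i in t for i in itemset)]
    support = len(containing) / len(TRANSACTIONS)
    utility = sum(t[i] * UNIT_PROFIT[i] for t in containing for i in itemset)
    return support, utility


if __name__ == "__main__":
    print(decode([1, 0, 1, 0]), evaluate([1, 0, 1, 0]))
```

Because the vector length equals the number of items, a dataset with tens of thousands of items yields an equally long decision vector, which is the dimensionality problem the abstract describes.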
Authors
ZHANG Lei (张磊)
LI Liu (李柳)
YANG Hai-peng (杨海鹏)
SUN Xiang (孙翔)
CHENG Fan (程凡)
SUN Xiao-yan (孙晓燕)
SU Yu (苏喻)
Affiliations: School of Computer Science and Technology, Anhui University, Hefei 230039, China; School of Artificial Intelligence, Anhui University, Hefei 230039, China; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China; School of Computer, Hefei Normal University, Hefei 230001, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230071, China
Source
Control and Decision (《控制与决策》)
EI
CSCD
PKU Core
2023, Issue 10, pp. 2832-2840 (9 pages)
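Funding
National Natural Science Foundation of China (61976001, 62076001, 61876184)
Key Project of the Outstanding Talents Support Program for Universities, Anhui Provincial Department of Education (gxyqZD2021089)
Natural Science Foundation of Anhui Province (2008085QF309)
University Synergy Innovation Program of Anhui Province (GXXT-2020-050)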
Keywords
multi-objective optimization
evolutionary algorithm
reduction strategy
repair strategy