Abstract
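First, the learnability of conjunctive rules in a nominal-attribute instance space is defined, and a general learning model for propositional rules is established by partitioning the set of positive instances into several subsets, each of which is conjunctive-rule learnable against the entire set of negative instances. Then, three methods for automatically clustering and partitioning the positive instance set are proposed, based on a similarity measure, a difference measure, and rule length, and a fast conjunctive rule learning method is designed. In addition, a post-processing method based on the minimum coverage rate and the minimum error rate is given to overcome the overfitting problem. Finally, experiments are carried out on a set of typical learning problems, and the results are compared with existing algorithms.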
In this paper, we define the learnability of conjunctive rules on a nominal-attribute instance space and set up a propositional concept learning model by clustering the positive instances into multiple divisions, each of which is conjunctive-rule learnable against the full set of negative instances. Three measures are introduced to guide the clustering process, and a procedure is given to generate CNF (conjunctive normal form) rules for the clusters. A post-pruning procedure is designed to deal with the overfitting problem, based on two criteria: the minimum covering rate and the minimum error rate. Experiments are carried out on several data sets, and the performance of the proposed method is analyzed and compared with existing algorithms.
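The partitioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes that a subset of positives counts as conjunctive-rule learnable when the most specific conjunction shared by its members covers no negative instance, and it replaces the three clustering measures with a plain first-fit greedy pass. All names (`most_specific_rule`, `is_learnable`, `greedy_partition`, `learn_rules`, `min_cover`) and the toy data are hypothetical, and only the minimum coverage criterion is sketched, not the minimum error rate.

```python
from typing import Dict, List, Tuple

Instance = Tuple[str, ...]   # one nominal value per attribute
Rule = Dict[int, str]        # attribute index -> required value (a conjunction)


def most_specific_rule(positives: List[Instance]) -> Rule:
    """Conjunction of every attribute test on which all positives agree."""
    first = positives[0]
    return {i: v for i, v in enumerate(first)
            if all(p[i] == v for p in positives)}


def covers(rule: Rule, x: Instance) -> bool:
    """True if x satisfies every test in the rule (an empty rule covers all)."""
    return all(x[i] == v for i, v in rule.items())


def is_learnable(positives: List[Instance], negatives: List[Instance]) -> bool:
    """Assumed criterion: the shared conjunction excludes every negative."""
    rule = most_specific_rule(positives)
    return not any(covers(rule, n) for n in negatives)


def greedy_partition(positives: List[Instance],
                     negatives: List[Instance]) -> List[List[Instance]]:
    """First-fit grouping; a stand-in for the paper's clustering measures.

    Assumes consistent data (no instance is both positive and negative),
    so a singleton cluster is always learnable.
    """
    clusters: List[List[Instance]] = []
    for p in positives:
        for c in clusters:
            if is_learnable(c + [p], negatives):
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def learn_rules(positives: List[Instance], negatives: List[Instance],
                min_cover: float = 0.0) -> List[Rule]:
    """One conjunctive rule per cluster; drop clusters below min_cover."""
    rules = []
    for c in greedy_partition(positives, negatives):
        if len(c) / len(positives) >= min_cover:
            rules.append(most_specific_rule(c))
    return rules


# Toy data with two nominal attributes: (color, shape).
POSITIVES = [("red", "round"), ("red", "square"), ("blue", "round")]
NEGATIVES = [("blue", "square"), ("green", "round")]

if __name__ == "__main__":
    print(learn_rules(POSITIVES, NEGATIVES))
    print(learn_rules(POSITIVES, NEGATIVES, min_cover=0.5))
```

On this toy data the three positives are not learnable as a single set, because they share no attribute value and the empty conjunction covers every negative. The greedy pass therefore splits them into two clusters with one rule each, and raising `min_cover` to 0.5 drops the rule that covers only one of the three positives.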
Source
《系统工程学报》
CSCD
2004, No. 5, pp. 482-488 (7 pages)
Journal of Systems Engineering
Funding
Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20020056047).
Keywords
constrained clustering
concept learning
propositional concept learning
conjunctive rules learnability
overfitting
post-processing
post pruning
machine learning