Abstract
Correlation analysis is increasingly important because it can quickly reveal potential relationships in data, and in practice one often needs to quantify the strength of dependence among several variables at once. This paper therefore proposes a measure of dependence for multi-variable relationships, the multi-variable maximal mutual information coefficient (Mv_MMIC). Mv_MMIC can capture a wide range of associations, including linear and nonlinear functional relationships, and is not limited to specific function types (such as linear, exponential, or periodic) or even to functional relationships in general. First, the maximal information coefficient (MIC) is used to build a maximal mutual information coefficient matrix. Second, based on the eigendecomposition of that matrix, its eigenvalues are used to construct a measure of dependence among multiple variables, which generalizes MIC from a two-dimensional criterion for two random variables to a multi-dimensional criterion for several variables. Finally, experiments show that Mv_MMIC retains the generality and equitability of MIC, which gives it both theoretical and practical value.
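The two construction steps named in the abstract can be written out as a short sketch. The symbols M, m_{ij}, Q, and \lambda_k are notation assumed here for illustration and are not taken from the paper; the unit diagonal is likewise an assumption. The abstract does not say how the eigenvalues are combined into Mv_MMIC, so that final step is left out.

```latex
% Sketch only: M, m_{ij}, Q, \Lambda, \lambda_k are assumed notation, not the paper's.
% Step 1: pairwise MIC matrix over the n variables X_1, ..., X_n.
\[
M = \bigl(m_{ij}\bigr)_{n \times n}, \qquad
m_{ij} = \mathrm{MIC}(X_i, X_j), \qquad
m_{ij} = m_{ji}, \qquad
m_{ii} = 1 \ \text{(assumed)}
\]
% Step 2: eigendecomposition of the symmetric matrix M (real eigenvalues).
\[
M = Q \Lambda Q^{\top}, \qquad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \qquad
\sum_{k=1}^{n} \lambda_k = \operatorname{tr}(M) = n
\]
```

Under these assumptions the spectrum reflects how strongly the variables depend on one another: if all off-diagonal entries are 0, M is the identity and every eigenvalue equals 1, whereas if all entries equal 1 the eigenvalues are n, 0, ..., 0.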
Authors
张朝霞
吴杰
ZHANG Zhaoxia; WU Jie (Department of Computer Science, Taiyuan Normal University, Jinzhong 030619, China; Graduate Department, Taiyuan Normal University, Jinzhong 030619, China)
Source
《太原师范学院学报(自然科学版)》
2022, No. 1, pp. 34-40 (7 pages)
Journal of Taiyuan Normal University:Natural Science Edition
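Funding
National Social Science Fund of China project (20BJL080)
Shanxi Province Key Research and Development Program project (201803D121088)
Taiyuan Normal University university-level teaching reform project (JGLX1826).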
Keywords
multi-variable correlation
nonlinear correlation
maximal information coefficient
multi-variable maximal mutual information coefficient