Patient Similarity Based on Supervised Metric Learning of Multi-Margin (基于多段间隔监督度量学习的病人相似度算法)

Abstract: As medical and health services become increasingly digitized, patient similarity has become an important problem in the secondary use of electronic health records (EHR). Given physicians' assessments of patient health data, patient similarity can be cast as a supervised distance metric learning problem, where the supervision usually comes from labels assigned to each patient's health data. Existing work on patient similarity makes only limited use of this supervision: two patients are typically judged similar or dissimilar by checking whether their labels are exactly equal. In practice, a patient's labels often span multiple dimensions, and this exact-match comparison ignores the similarity between the labels themselves. This paper uses patients' diagnosis data as the supervision and, during metric learning, separates the neighbors of a target patient according to how similar their labels are, forming multiple margins and making fuller use of the supervision. In a multi-label KNN classification evaluation, the similarity metric learned by the algorithm substantially improves performance under two measures, Hamming Loss and a-Accuracy.
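The idea of grading neighbors by label similarity, rather than by exact label equality, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each patient's diagnoses are encoded as a binary label vector, uses Jaccard similarity as a stand-in for the label-similarity measure (the abstract does not say which measure the paper uses), and buckets neighbors into a hypothetical fixed number of margin levels (`n_levels`). The `hamming_loss` function follows the standard multi-label definition, the fraction of label positions predicted incorrectly.

```python
import numpy as np


def jaccard_similarity(a, b):
    """Jaccard similarity of two binary label vectors (assumed stand-in measure)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Two empty label sets are treated as identical.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def margin_level(similarity, n_levels=3):
    """Map a similarity in [0, 1] to a margin level (hypothetical bucketing).

    Level 0 holds the most similar neighbors; level n_levels - 1 the least similar.
    """
    level = int((1.0 - similarity) * n_levels)
    return min(level, n_levels - 1)


def hamming_loss(y_true, y_pred):
    """Fraction of label positions where prediction and ground truth differ."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    return float(np.mean(y_true != y_pred))


if __name__ == "__main__":
    target = [1, 1, 0, 0]
    neighbors = {"p1": [1, 1, 0, 0], "p2": [1, 0, 0, 0], "p3": [0, 0, 1, 1]}
    for name, labels in neighbors.items():
        sim = jaccard_similarity(target, labels)
        print(name, sim, margin_level(sim))
```

Under exact-match supervision, `p2` and `p3` would both simply count as "dissimilar" to the target; with graded levels, `p2` (which shares one diagnosis) lands in a closer margin than `p3` (which shares none).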

Authors: LI Shi-Qiang, NI Jia-Zhi, LIU Jie, YE Dan (李世强, 倪嘉志, 刘杰, 叶丹). Software Engineering Center, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China

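Affiliations: Software Engineering Technology R&D Center, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences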

Source: Computer Systems & Applications (《计算机系统应用》), 2016, Issue 11, pp. 164-171 (8 pages)

Funding: National Natural Science Foundation of China (U1435220); Military Logistics Science and Technology Project (AWS4R013)

Keywords: electronic health record; patient similarity; supervised distance metric learning; multi-label classification

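Classification codes: TP181 [Automation and Computer Technology - Control Theory and Control Engineering]; R197.323 [Automation and Computer Technology - Control Science and Engineering]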
