Funding: Sponsored by NSFC grants 62176245 and 62137002, the Fundamental Research Funds for the Central Universities, and Anhui Province grants 202104a05020011 and 202103a07020002.
Abstract: A knowledge graph (KG), a special form of semantic network, integrates fragmentary data into a graph to support knowledge processing and reasoning. KG quality control is important to the utility of KGs. It is essential to investigate KG quality and the parameters influencing it to better understand quality control. Although many works have evaluated the dimensions of KG quality, quality control of the construction process, and enhancement methods for quality, a comprehensive literature review has not been presented on this topic. This paper intends to fill this research gap by presenting a comprehensive survey on the quality control of KGs. First, this paper defines six main evaluation dimensions of KG quality and investigates their correlations and differences. Second, quality control treatments during KG construction are introduced from the perspective of these dimensions. Third, the quality enhancement of a constructed KG is described from various dimensions. This paper ultimately aims to promote the research and applications of KGs.
Funding: Supported by the National Major Scientific and Technological Special Project for "Significant New Drugs Development" (No. 2018ZX09201008) and the Special Fund Project for Information Development from Shanghai Municipal Commission of Economy and Information (No. 201701013).
Abstract: Regional healthcare platforms collect clinical data from hospitals in specific areas for the purpose of healthcare management. It is a common requirement to reuse the data for clinical research. However, challenges arise from the inconsistency of terminology in electronic health records (EHR) and the complexities of data quality and data formats on regional healthcare platforms. In this paper, we propose a methodology and process for constructing large-scale cohorts, which form the basis for studying causality and comparative effectiveness in epidemiology. First, we constructed a Chinese terminology knowledge graph to deal with the diversity of vocabularies on the regional platform. Second, we built disease-specific case repositories, such as a heart failure repository, that use the graph to search for related patients and to normalize the data. Based on the requirements of a clinical study exploring the effect of statin use on 180-day readmission in patients with heart failure, we built a large-scale retrospective cohort of 29647 heart failure patients from the heart failure repository. After propensity score matching, a study group (n=6346) and a control group (n=6346) with comparable clinical characteristics were obtained. Logistic regression analysis showed that taking statins was negatively correlated with 180-day readmission in heart failure patients. This paper presents the workflow and an application example of big data mining based on regional EHR data.
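The propensity score matching step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which matching algorithm, caliper, or covariates the authors used, so everything here is an assumption: the function name `match_one_to_one`, the greedy 1:1 nearest-neighbor strategy without replacement, the caliper of 0.05, and the toy scores are all hypothetical. The sketch also assumes the propensity scores have already been estimated (for example, by a logistic regression of statin use on baseline characteristics).

```python
from typing import Dict, List, Tuple


def match_one_to_one(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    Each treated patient is paired with the closest still-unused control
    patient, provided the score difference is within the caliper.
    This is an illustrative strategy, not the one used in the paper.
    """
    available = dict(control)  # controls not yet matched
    pairs: List[Tuple[str, str]] = []
    # Process treated patients in order of ascending propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Toy propensity scores, made up for illustration (not from the study).
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.60, "c3": 0.33, "c4": 0.10}
pairs = match_one_to_one(treated, control)
print(pairs)
```

Because matching is 1:1 without replacement, the matched study and control groups always have the same size, which is consistent with the equal group sizes (n=6346 each) reported in the abstract. Treated patients with no control inside the caliper, like `t3` here, are left unmatched.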