Abstract
The need to mitigate the COVID-19 epidemic prompts policymakers to make public health decisions guided by science. The sheer volume of unstructured COVID-19 publications makes it challenging for policymakers to obtain relevant evidence. Knowledge graphs (KGs) can formalize unstructured knowledge into a structured form and have recently been used to support decision-making. Here, we introduce a novel framework that can extract the COVID-19 public health evidence knowledge graph (CPHE-KG) from papers relating to modelling studies. We obtain a corpus of 3,096 COVID-19 modelling study papers by performing a literature assessment process. We define a novel annotation schema to construct the COVID-19 modelling study-related information extraction (IE) dataset (CPHIE). We also propose a novel multi-task document-level information extraction model, SS-DYGIE++, based on the dataset. Applying the model to the new corpus, we construct CPHE-KG, which contains 60,967 entities and 51,140 relations. Finally, we apply our KG to support evidence querying and evidence mapping visualization. Our SS-DYGIE++ (SpanBERT) model achieves F1 scores of 0.77 and 0.55 in the document-level entity recognition and coreference resolution tasks, respectively. It also shows high performance in the relation identification task. With evidence querying, our KG can present the dynamic transmission of the COVID-19 pandemic in different countries and regions. The evidence mapping of our KG can show the impacts of various non-pharmacological interventions on the COVID-19 pandemic. Our analysis demonstrates the quality of our KG and shows that it has the potential to support COVID-19 policy making in public health.
Funding
This work was supported in part by the National Natural Science Foundation of China (Grants No. 72025404 and No. 71621002), the Beijing Natural Science Foundation (L192012), and the Beijing Nova Program (Z201100006820085).