Abstract
Objective: To develop a breast cancer risk prediction model from questionnaire data using an artificial neural network (ANN), providing an effective tool for initial breast cancer screening.
Methods: The study drew on an organized breast cancer screening project conducted in Minhang District, Shanghai, China, from 2008 to 2012. Participants were 15,148 registered resident women aged 35-74 years with no history of breast cancer. Information on demographic characteristics, menstrual and reproductive history, family history of breast cancer, and history of breast diseases was collected by in-person interview using a structured questionnaire. Sixty-six incident breast cancer cases were confirmed by pathological examination. Variables were selected by backward elimination in a logistic regression model, and the ANN was built and tested as a feed-forward network trained with the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm.
Results: The variables retained in the ANN model were age; age at menarche; family history of breast cancer; breast lumps, nodules, or thickening; years from first delivery to screening; and weekly frequency of fatty meat intake. In the training set, the model achieved an accuracy of 66.5% [95% confidence interval (CI): 65.6-67.4], a sensitivity of 63.8% (95% CI: 50.1-77.6), a specificity of 66.5% (95% CI: 65.6-67.4), and an area under the receiver operating characteristic curve (AUC) of 0.706 (95% CI: 0.635-0.777). In the test set, the model achieved an accuracy of 64.9% (95% CI: 63.5-66.3), a sensitivity of 79.0% (95% CI: 60.6-97.3), a specificity of 64.8% (95% CI: 63.4-66.2), and an AUC of 0.762 (95% CI: 0.655-0.869).
Conclusion: The questionnaire-based ANN model has some predictive value for breast cancer risk among women in Shanghai and could be used for risk self-assessment and large-scale initial screening in this population.
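The wide sensitivity intervals reported above reflect the small number of cancer cases. The abstract does not say how the intervals were computed or how the 66 cases were split between training and test sets, so the following is only a sketch under two assumptions: that a normal-approximation (Wald) interval was used, and that the hypothetical counts 30 of 47 (training) and 15 of 19 (test) apply. These counts are inferred because they sum to 66 cases and reproduce the reported interval bounds to one decimal place; the test-set point estimate comes out at 78.9% rather than the reported 79.0%, presumably a rounding difference.

```python
import math

def wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) for a normal-approximation (Wald) interval."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clip to [0, 1] because the Wald interval can overshoot for small samples.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# ASSUMPTION: these counts are not stated in the abstract. They are
# hypothetical values chosen because they are consistent with the
# reported sensitivities and their 95% CIs.
training = wald_ci(30, 47)  # detected cases / all cases, training set
test = wald_ci(15, 19)      # detected cases / all cases, test set

for label, (p, lo, hi) in (("training", training), ("test", test)):
    print(f"{label}: sensitivity {p:.1%} (95% CI: {lo:.1%}-{hi:.1%})")
```

With only 19 cases in the assumed test split, a single additional detected or missed case shifts the sensitivity by about 5 percentage points, which is why the test-set interval spans more than 35 points.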
Authors
LI Xiaoqiang (李小强); MO Miao (莫淼); WU Fei (吴菲); LIU Guangyu (柳光宇); XU Wanghong (徐望红); SHAO Zhimin (邵志敏)
1. Department of Epidemiology, School of Public Health, Fudan University; Key Laboratory of Public Health Safety, Ministry of Education (Fudan University), Shanghai 200032, China
2. Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
Source
Tumor (《肿瘤》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2018, Issue 9, pp. 883-893 (11 pages)
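Funding
1. China Medical Board (grant no. CMB09-991)
2. Key Discipline Construction Project of the Fourth Round of the Shanghai Three-Year Action Plan for Public Health (grant no. 15GWZK0801)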
Keywords
Breast neoplasms
Neural networks (computer)
Surveys and questionnaires
Predictive model