National Institute for Data Science in Health and Medicine, Xiamen University
National Institute for Health and Medical Big Data | Digital Fujian Institute of Health and Medical Big Data
Research by Professor Jianping Zhu's Data Mining Center Team Published in Scientific Reports, a Nature Sub-journal
2022-02-04

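Title: Differentiation of intestinal tuberculosis and Crohn's disease through an explainable machine learning method

Authors: Futian Weng (翁福添), Yu Meng (孟钰), Fanggen Lu (卢放根), Yuying Wang (王玉莹), Weiwei Wang (王玮玮), Long Xu (徐龙), Dongsheng Cheng (程东升), Jianping Zhu (朱建平)*

Journal: Scientific Reports (JCR Q1), 2022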

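Recently, a team from Xiamen University's School of Medicine, School of Management, National Institute for Data Science in Health and Medicine, and Data Mining Research Center, working with the Department of Gastroenterology at Shenzhen University General Hospital, published a paper titled "Differentiation of intestinal tuberculosis and Crohn's disease through an explainable machine learning method" online in Scientific Reports (JCR Q1), a journal of the Nature Publishing Group. The paper addresses the problem of differentiating Crohn's disease from intestinal tuberculosis in gastroenterology and proposes an explainable machine learning framework. The work matters both for distinguishing the two diseases effectively and for understanding how a machine learning model arrives at its predictions.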

Abstract 

Background: Differentiation between Crohn’s disease (CD) and intestinal tuberculosis (ITB) is difficult but crucial for medical decisions. This study aims to develop an effective framework to distinguish these two diseases through an explainable machine learning (ML) model. Methods: After feature selection, a total of nine variables are extracted, including intestinal surgery, abdominal, bloody stool, PPD, knot, ESAT-6, CFP-10, intestinal dilatation and comb sign. In addition, we compared the predictive performance of the ML methods with traditional statistical methods. This work also provides insights into the ML model’s outcome through the SHAP method for the first time. Results: A cohort consisting of 200 patients' data (CD = 160, ITB = 40) is used in training and validating models. The results show that the XGBoost algorithm outperforms other classifiers in terms of area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision and Matthews correlation coefficient (MCC), yielding values of 0.891, 0.813, 0.969, 0.867 and 0.801, respectively. More importantly, the prediction outcomes of XGBoost can be effectively explained through the SHAP method. Conclusions: The proposed framework demonstrates the effectiveness of distinguishing CD from ITB through interpretable machine learning, which can provide not only a global explanation but also an explanation for individual patients.
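
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how sensitivity, specificity, precision and MCC are computed from a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not state which class was treated as positive.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Ratios with a zero denominator are reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the positive class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the negative class
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # positive predictive value

    # Matthews correlation coefficient: a balanced measure that stays
    # informative when class sizes differ (for example 160 vs. 40 patients).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT data from the paper.
example = binary_metrics(tp=8, fp=2, tn=28, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside AUC for imbalanced cohorts such as the one described above.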

Full paper: https://www.nature.com/articles/s41598-022-05571-7