
Machine learning approaches to enhance diagnosis and staging of patients with MASLD using routinely available clinical information

NU author(s): Matt McTeer, Dr Dina Tiniakos, Professor Cliff Brass, Professor Quentin Anstee, Professor Paolo Missier


Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© 2024 McTeer et al.

Aims: Metabolic dysfunction-associated steatotic liver disease (MASLD) outcomes such as MASH (metabolic dysfunction-associated steatohepatitis), fibrosis and cirrhosis are ordinarily determined by resource-intensive and invasive biopsies. We aim to show that routine clinical tests offer sufficient information to predict these endpoints.

Methods: Using the LITMUS Metacohort derived from the European NAFLD Registry, the largest MASLD dataset in Europe, we created three feature sets that vary in how difficult they are to procure, including a 19-variable feature set that can be obtained through a routine clinical appointment or blood test. These data were used to train predictive models using the supervised machine learning (ML) algorithm XGBoost, alongside the missing-data imputation technique MICE and the class-balancing algorithm SMOTE. Shapley Additive exPlanations (SHAP) were added to determine the relative importance of each clinical variable.

Results: Analysing nine biopsy-derived MASLD outcomes, with cohort sizes ranging between 5385 and 6673 subjects, we were able to classify individuals at training-set AUCs ranging from 0.719 to 0.994, including classifying individuals who are At-Risk MASH at an AUC of 0.899. Using two further feature combinations of 26 variables and 35 variables, which included composite scores known to be good indicators for MASLD endpoints and advanced specialist tests, we found that predictive performance did not sufficiently improve. We are also able to present local and global explanations for each ML model, offering clinicians interpretability without sacrificing predictive performance.

Conclusions: This study developed a series of ML models with accuracy ranging from 71.9% to 99.4%, using only easily extractable and readily available information, for predicting MASLD outcomes that are usually determined through highly invasive means.
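The abstract reports performance as AUC (area under the ROC curve) and, in the conclusions, restates the same range as percentages. As a reading aid, the following minimal sketch shows how an AUC can be computed from labels and predicted scores, and how it can differ from thresholded accuracy on imbalanced data. It is not the authors' code: the function names `roc_auc` and `accuracy`, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed score threshold
    (the 0.5 default is an illustrative assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 4 negatives, 2 positives (imbalanced classes).
labels = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.45, 0.40, 0.48]

print(roc_auc(labels, scores))   # 0.875
print(accuracy(labels, scores))  # 4 of 6 correct at threshold 0.5
```

In this toy case the scores rank most positives above most negatives (AUC 0.875), yet every case falls below the 0.5 threshold, so accuracy only reflects the majority class. This is one reason ranking metrics such as AUC are commonly reported alongside class-balancing steps.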


Publication metadata

Author(s): McTeer M, Applegate D, Mesenbrink P, Ratziu V, Schattenberg JM, Bugianesi E, Geier A, Gomez MR, Dufour J-F, Ekstedt M, Francque S, Yki-Jarvinen H, Allison M, Valenti L, Miele L, Pavlides M, Cobbold J, Papatheodoridis G, Holleboom AG, Tiniakos D, Brass C, Anstee QM, Missier P

Publication type: Article

Publication status: Published

Journal: PLoS ONE

Year: 2024

Volume: 19

Issue: 2

Online publication date: 29/02/2024

Acceptance date: 09/02/2024

Date deposited: 18/03/2024

ISSN (electronic): 1932-6203

Publisher: Public Library of Science

URL: https://doi.org/10.1371/journal.pone.0299487

DOI: 10.1371/journal.pone.0299487

Data Access Statement: Data underpinning this study are not publicly available. The European NAFLD Registry protocol has been published in [1: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0299487#pone.0299487.ref001], including details of sample handling and processing, and the network of recruitment sites. Patient-level data will not be made available due to the various constraints imposed by ethics panels across the different countries from which patients were recruited and the need to maintain patient confidentiality. The point of contact for any enquiries regarding the European NAFLD Registry is the oversight group via email: NAFLD.Registry@newcastle.ac.uk.

PubMed id: 38421999




Funding

Funder name (funder reference, where listed):
European Commission (777377)
EFPIA
European Union Horizon 2020
Innovative Medicines Initiative 2 Joint Undertaking
NIHR
Newcastle University
Red Hat UK
