Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
© Aditya U Kale, Henry David Jeffry Hogg, Russell Pearson, Ben Glocker, Su Golder, April Coombe, Justin Waring, Xiaoxuan Liu, David J Moore, Alastair K Denniston.

Background: Artificial intelligence (AI) medical devices have the potential to transform existing clinical workflows and ultimately improve patient outcomes. AI medical devices have shown potential for a range of clinical tasks, such as diagnostics, prognostics, and therapeutic decision-making (eg, drug dosing). There is, however, an urgent need to ensure that these technologies remain safe for all populations. Recent literature demonstrates the need for rigorous performance error analysis to identify issues such as algorithmic encoding of spurious correlations (eg, protected characteristics) or specific failure modes that may lead to patient harm. Guidelines for reporting on studies that evaluate AI medical devices require the mention of performance error analysis; however, there is still a lack of understanding around how performance errors should be analyzed in clinical studies, and what harms authors should aim to detect and report.

Objective: This systematic review will assess the frequency and severity of AI errors and adverse events (AEs) in randomized controlled trials (RCTs) investigating AI medical devices as interventions in clinical settings. The review will also explore how performance errors are analyzed, including whether the analysis includes the investigation of subgroup-level outcomes.

Methods: This systematic review will identify and select RCTs assessing AI medical devices. Search strategies will be deployed in MEDLINE (Ovid), Embase (Ovid), Cochrane CENTRAL, and clinical trial registries to identify relevant papers. RCTs identified in bibliographic databases will be cross-referenced with clinical trial registries. The primary outcomes of interest are the frequency and severity of AI errors, patient harms, and reported AEs. Quality assessment of RCTs will be based on version 2 of the Cochrane risk-of-bias tool (RoB 2). Data analysis will include a comparison of error rates and patient harms between study arms, and a meta-analysis of the rates of patient harm in control versus intervention arms will be conducted if appropriate.

Results: The project was registered on PROSPERO in February 2023. Preliminary searches have been completed, and the search strategy has been designed in consultation with an information specialist and methodologist. Title and abstract screening started in September 2023. Full-text screening is ongoing, and data collection and analysis began in April 2024.

Conclusions: Evaluations of AI medical devices have shown promising results; however, reporting of studies has been variable. Detection, analysis, and reporting of performance errors and patient harms are vital to robustly assess the safety of AI medical devices in RCTs. Scoping searches have illustrated that the reporting of harms is variable, often with no mention of AEs. The findings of this systematic review will identify the frequency and severity of AI performance errors and patient harms and generate insights into how errors should be analyzed to account for both overall and subgroup performance.
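The protocol proposes, where appropriate, a meta-analysis of patient-harm rates in control versus intervention arms. As an illustration only, the sketch below shows one common way such a pooled comparison can be computed: risk ratios per trial combined with a DerSimonian-Laird random-effects model. The trial names and event counts are hypothetical placeholders, and this is not the authors' stated analysis plan or software.

```python
# Minimal sketch (assumed approach, not the authors' analysis plan):
# random-effects meta-analysis of patient-harm risk ratios,
# intervention vs control, using the DerSimonian-Laird estimator.
import math

# Hypothetical (events, total) per arm for each placeholder RCT
trials = [
    {"name": "Trial A", "intervention": (4, 200), "control": (7, 198)},
    {"name": "Trial B", "intervention": (2, 150), "control": (3, 149)},
    {"name": "Trial C", "intervention": (9, 400), "control": (12, 402)},
]

log_rrs, weights = [], []
for t in trials:
    ei, ni = t["intervention"]
    ec, nc = t["control"]
    # 0.5 continuity correction if either arm has zero events
    if ei == 0 or ec == 0:
        ei, ec, ni, nc = ei + 0.5, ec + 0.5, ni + 1, nc + 1
    log_rr = math.log((ei / ni) / (ec / nc))
    var = 1 / ei - 1 / ni + 1 / ec - 1 / nc  # variance of log risk ratio
    log_rrs.append(log_rr)
    weights.append(1 / var)

# Fixed-effect pooled estimate and Cochran's Q for heterogeneity
pooled_fe = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
q = sum(w * (y - pooled_fe) ** 2 for w, y in zip(weights, log_rrs))
df = len(trials) - 1

# DerSimonian-Laird between-trial variance (tau^2)
c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

# Random-effects pooled risk ratio with 95% CI
re_weights = [1 / (1 / w + tau2) for w in weights]
pooled_re = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
se = math.sqrt(1 / sum(re_weights))
lo, hi = pooled_re - 1.96 * se, pooled_re + 1.96 * se

print(f"Pooled RR (random effects): {math.exp(pooled_re):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f}), tau^2 = {tau2:.4f}")
```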
Author(s): Kale AU, Hogg HDJ, Pearson R, Glocker B, Golder S, Coombe A, Waring J, Liu X, Moore DJ, Denniston AK
Publication type: Review
Publication status: Published
Journal: JMIR Research Protocols
Year: 2024
Volume: 13
Online publication date: 28/06/2024
Acceptance date: 18/04/2024
ISSN (electronic): 1929-0748
Publisher: JMIR Publications Inc.
URL: https://doi.org/10.2196/51614
DOI: 10.2196/51614