Journal: PLOS Digital Health

Publisher

PLOS

ISSN

2767-3170

Search Results

Publications 1 - 10 of 13
  • Blasimme, Alessandro; Nittas, Vasileios; Daniore, Paola; et al. (2023)
    PLOS Digital Health
    Machine learning has become a key driver of the digital health revolution. That comes with a fair share of high hopes and hype. We conducted a scoping review on machine learning in medical imaging, providing a comprehensive outlook of the field’s potential, limitations, and future directions. Most reported strengths and promises included: improved (a) analytic power, (b) efficiency, (c) decision making, and (d) equity. Most reported challenges included: (a) structural barriers and imaging heterogeneity, (b) scarcity of well-annotated, representative and interconnected imaging datasets, (c) validity and performance limitations, including bias and equity issues, and (d) the still missing clinical integration. The boundaries between strengths and challenges, with cross-cutting ethical and regulatory implications, remain blurred. The literature emphasizes explainability and trustworthiness, with a largely missing discussion about the specific technical and regulatory challenges surrounding these concepts. Future trends are expected to shift towards multi-source models, combining imaging with an array of other data, in a more open access, and explainable manner.
  • Andreoletti, Mattia; Haller, Luana; Vayena, Effy; et al. (2024)
    PLOS Digital Health
    In the evolving landscape of digital medicine, digital biomarkers have emerged as a transformative source of health data, positioning them as an indispensable element for the future of the discipline. This necessitates a comprehensive exploration of the ethical complexities and challenges intrinsic to this cutting-edge technology. To address this imperative, we conducted a scoping review, seeking to distill the scientific literature exploring the ethical dimensions of the use of digital biomarkers. By closely scrutinizing the literature, this review aims to bring to light the underlying ethical issues associated with the development and integration of digital biomarkers into medical practice.
  • Ferretti, Agata; Vayena, Effy; Blasimme, Alessandro (2023)
    PLOS Digital Health
    As digital technologies such as smartphones and fitness bands become more ubiquitous, individuals can engage in self-monitoring and self-care, gaining greater control over their health trajectories along the life-course. These technologies appeal particularly to young people, who are more familiar with digital devices. How this digital transformation facilitates health promotion is therefore a topic of animated debate. However, most research to date focuses on the promise and peril of digital health promotion (DHP) in high-income settings, while DHP in low- and middle-income countries (LMICs) remains largely unexplored. This narrative review aims to fill this gap by critically examining key ethical challenges of implementing DHP in LMICs, with a focus on young people. In the existing literature, we identified potential impediments as well as enabling conditions. Aspects to consider in unlocking the potential of DHP include (1) addressing the digital divide and structural injustice in data-related practices; (2) engaging the target population and responding to their specific needs given their economic, cultural, and social contexts; (3) monitoring the quality and impact of DHP over time; and (4) improving responsible technology governance and its implementation. Addressing these concerns could result in meaningful health benefits for populations lacking access to more conventional healthcare resources.
  • Amann, Julia; Vetter, Dennis; Blomberg, Stig Nikolaj; et al. (2022)
    PLOS Digital Health
    Explainability for artificial intelligence (AI) in medicine is a hotly debated topic. Our paper presents a review of the key arguments in favor of and against explainability for AI-powered Clinical Decision Support System (CDSS) applied to a concrete use case, namely an AI-powered CDSS currently used in the emergency call setting to identify patients with life-threatening cardiac arrest. More specifically, we performed a normative analysis using socio-technical scenarios to provide a nuanced account of the role of explainability for CDSSs for the concrete use case, allowing for abstractions to a more general level. Our analysis focused on three layers: technical considerations, human factors, and the designated system role in decision-making. Our findings suggest that whether explainability can provide added value to CDSS depends on several key questions: technical feasibility, the level of validation in case of explainable algorithms, the characteristics of the context in which the system is implemented, the designated role in the decision-making process, and the key user group(s). Thus, each CDSS will require an individualized assessment of explainability needs and we provide an example of what such an assessment could look like in practice.
  • Engler, Ines M.; Langer, Nicolas (2025)
    PLOS Digital Health
    Categorical diagnostic systems for psychopathology, such as the DSM and ICD, have long been criticized for their limited validity and reliability. Dimensional models, like the Hierarchical Taxonomy of Psychopathology (HiTOP), offer an alternative by focusing on transdiagnostic dimensions that better capture the complexity of mental health disorders. While HiTOP’s internalizing spectrum has been studied extensively in adults, its applicability and structure in children and adolescents remain less clear. Further, understanding sociopsychological indicators associated with internalizing dimensions in this age group could improve developmental psychopathological interventions. We analyzed data from 4,142 participants aged 5–21 (65.7% male; mean age = 10.46) from the Healthy Brain Network. Using exploratory and confirmatory factor analyses, we tested the internalizing structure proposed by HiTOP, with an additional focus on invariance across sex, age, and diagnostic groups. The hierarchical structure was tested through hierarchical CFA and the extended Bass-Ackward method. Structural equation modeling (SEM) examined latent factor relationships, and sociopsychological variables associated with the factors. A four-factor structure was identified: Distress, Nervousness, Social Fears, and Obsessions and Compulsions (OC). The model demonstrated partial invariance and strong fit indices. Sociopsychological variables, i.e., predictors and a quality of life indicator of the factors, included parental attitudes, discipline, bullying, and daily functioning. DSM categories and CBCL scores mapped well onto the latent factors. These findings suggest the potential clinical utility of a dimensional model for internalizing disorders in youth. 
    Future studies should further examine the role of sociodemographic factors on dimensional constructs, explore predictive developmental trajectories longitudinally, and verify the structure of all HiTOP spectra across age groups to advance dimensional models in pediatric psychopathology research, as well as their implementation in clinical practice.
  • Zheng, Xiaochen; Allam, Ahmed; Schürch, Manuel; et al. (2026)
    PLOS Digital Health
    The identification of phenotypes within complex diseases is a fundamental component of personalized medicine, which aims to adapt healthcare to individual patient characteristics. Postoperative delirium (POD) is a complex neuropsychiatric condition with significant heterogeneity in its clinical manifestations and underlying pathophysiology. We hypothesize that POD comprises several distinct phenotypes, which cannot be directly observed in clinical practice. Identifying these phenotypes could enhance our understanding of POD pathogenesis and facilitate the development of targeted prevention and treatment strategies. In this paper, we propose an approach that combines supervised machine learning for personalized POD risk prediction with an unsupervised clustering technique to uncover potential POD phenotypes. We first demonstrate our approach using synthetic data, where we simulate patient cohorts with predefined phenotypes based on distinct sets of informative features. We aim to mimic any clinical disease with our synthetic data generation method. By training a predictive model and computing SHapley Additive exPlanations (SHAP), we show that clustering patients in the SHAP feature scoring space successfully recovers the true underlying phenotypes, outperforming clustering in the raw feature space. We then present a case study using real-world data from a cohort of elderly POD patients. We train machine learning models on heterogeneous electronic health record data covering the preoperative, intraoperative and postoperative stages to predict personalized POD risk. Subsequent clustering of patients based on their SHAP feature scores reveals distinct subgroups with differing clinical characteristics and risk profiles, potentially representing POD phenotypes. These results showcase the utility of our approach in uncovering clinically relevant subtypes of complex disorders like POD, paving the way for more precise and personalized treatment strategies.
  • Haag, Christina; Steinemann, Nina; Chiavi, Deborah; et al. (2023)
    PLOS Digital Health
    The emergence of new digital technologies has enabled a new way of doing research, including active collaboration with the public (‘citizen science’). Innovation in machine learning (ML) and natural language processing (NLP) has made automatic analysis of large-scale text data accessible to study individual perspectives in a convenient and efficient fashion. Here we blend citizen science with innovation in NLP and ML to examine (1) which categories of life events persons with multiple sclerosis (MS) perceived as central for their MS; and (2) associated emotions. We subsequently relate our results to standardized individual-level measures. Participants (n = 1039) took part in the ‘My Life with MS’ study of the Swiss MS Registry which involved telling their story through self-selected life events using text descriptions and a semi-structured questionnaire. We performed topic modeling (‘latent Dirichlet allocation’) to identify high-level topics underlying the text descriptions. Using a pre-trained language model, we performed a fine-grained emotion analysis of the text descriptions. A topic modeling analysis of a total of 4293 descriptions revealed eight underlying topics. Five topics are common in clinical research: ‘diagnosis’, ‘medication/treatment’, ‘relapse/child’, ‘rehabilitation/wheelchair’, and ‘injection/symptoms’. However, three topics, ‘work’, ‘birth/health’, and ‘partnership/MS’ represent domains that are of great relevance for participants but are generally understudied in MS research. While emotions were predominantly negative (sadness, anxiety), emotions linked to the topics ‘birth/health’ and ‘partnership/MS’ were also positive (joy). Designed in close collaboration with persons with MS, the ‘My Life with MS’ project explores the experience of living with the chronic disease of MS using NLP and ML.
    Our study thus contributes to the body of research demonstrating the potential of integrating citizen science with ML-driven NLP methods to explore the experience of living with a chronic condition.
  • Kijewski, Sara; McBride, Claire; Owens, Eric; et al. (2026)
    PLOS Digital Health
    Decentralized clinical trials (DCTs), particularly in the U.S., gained substantial attention during the COVID-19 pandemic, enabling trial activities to be conducted from participants' homes or local healthcare facilities despite restrictions and lockdowns. Regardless of the growth in interest, many facets of the DCT landscape remain unexplored or nascent in their development. This study aims to explore the key characteristics and development of the U.S.-registered DCT landscape, adoption patterns across various clinical contexts, and the role of digital technologies. We analyzed 1370 decentralized trials from ClinicalTrials.gov, collected using a broad DCT-keyword search. The data were screened and coded manually, and analyzed descriptively for temporal trends, purpose of decentralization, intervention type, geographic representation, and digitalization. Our findings align with previous reports of a growing, heterogeneous landscape of DCTs, with behavioral interventions appearing more suitable for decentralization than other types of interventions. Notably, most DCTs still focus on evaluating decentralized methods rather than merely implementing them in their investigations. Often, studies integrate digital tools either as the interventions themselves or to enable the digital delivery of study activities. Although the trial registry used is U.S.-based, and a U.S. partner is part of more than 50% of the studies identified, many trials are done in multiple countries or countries outside of the U.S. (42%). Among these trials, the data revealed considerable differences, with digitalized DCTs in this sample concentrated in high-income countries. Despite rapid growth in DCTs, our findings suggest the presence of a field in development, very much focused on establishing a methodological foundation. 
    To unlock the potential of DCTs locally and globally, four critical areas demand further attention: digital equity, regulatory frameworks for diverse technologies, establishment of methodological validation processes, and further research on barriers to implementation.
  • Cajas Ordóñez, Sebastián Andrés; Castro, Rowell; Celi, Leo Anthony; et al. (2026)
    PLOS Digital Health
    Contemporary medical AI systems exhibit a critical vulnerability: they deliver confident predictions without mechanisms to express uncertainty or acknowledge limitations, leading to dangerous overreliance in clinical settings. This paper introduces the BODHI (Bridging, Open, Discerning, Humble, Inquiring) framework, a dual-reflective architecture grounded in two essential epistemic virtues: curiosity and humility, as foundational design principles for healthcare AI. Curiosity drives systems to actively explore diagnostic uncertainty, seek additional information when faced with ambiguous presentations, and recognize when training distributions fail to match clinical reality. Humility provides complementary restraint, enabling uncertainty quantification, boundary recognition, and appropriate deference to human expertise. We demonstrate how these virtues function synergistically in a dynamic feedback loop, preventing both reckless exploration and excessive caution while supporting collaborative clinical decision-making. Drawing from psychological theories of curiosity and cross-species evidence of epistemic humility, we argue that these capacities represent fundamental biological design principles essential for systems operating in high-stakes, uncertain environments. The BODHI framework addresses systemic failures in medical AI deployment, from biased training data to institutional workflow pressures, by embedding uncertainty awareness and collaborative restraint into foundational system architecture. Key implementation features include calibrated confidence measures, out-of-distribution detection, curiosity-driven escalation protocols, and transparency mechanisms that adapt to clinical context. Rather than pursuing algorithmic perfection through pure optimization, we advocate for human-AI partnerships that enhance clinical reasoning through mutual accountability and calibrated trust. 
    This approach represents a paradigm shift from overconfident automation toward collaborative systems that embody the wisdom to pause, reflect, and defer when appropriate.
  • Hinrichs, Nils; Roeschl, Tobias; Lanmueller, Pia; et al. (2024)
    PLOS Digital Health
    Patients in an Intensive Care Unit (ICU) are closely and continuously monitored, and many machine learning (ML) solutions have been proposed to predict specific outcomes like death, bleeding, or organ failure. Forecasting of vital parameters is a more general approach to ML-based patient monitoring, but the literature on its feasibility and robust benchmarks of achievable accuracy are scarce. We implemented five univariate statistical models (the naïve model, the Theta method, exponential smoothing, the autoregressive integrated moving average model, and an autoregressive single-layer neural network), two univariate neural networks (N-BEATS and N-HiTS), and two multivariate neural networks designed for sequential data (a recurrent neural network with gated recurrent unit, GRU, and a Transformer network) to produce forecasts for six vital parameters recorded at five-minute intervals during intensive care monitoring. Vital parameters were the diastolic, systolic, and mean arterial blood pressure, central venous pressure, peripheral oxygen saturation (measured by non-invasive pulse oximetry) and heart rate, and forecasts were made for 5 through 120 minutes into the future. Patients used in this study recovered from cardiothoracic surgery in an ICU. The patient cohort used for model development (n = 22,348) and internal testing (n = 2,483) originated from a heart center in Germany, while a patient subset from the eICU collaborative research database, an American multicenter ICU cohort, was used for external testing (n = 7,477). The GRU was the predominant method in this study. Uni- and multivariate neural network models proved to be superior to univariate statistical models across vital parameters and forecast horizons, and their advantage steadily became more pronounced for increasing forecast horizons. With this study, we established an extensive set of benchmarks for forecast performance in the ICU. 
    Our findings suggest that supplying physicians with short-term forecasts of vital parameters in the ICU is feasible, and that multivariate neural networks are most suited for the task due to their ability to learn patterns across thousands of patients.