Alizée Pace
Publications 1 - 10 of 13
- Clinical Trajectory Representations for Clustering (Conference Paper)
  ICLR 2023 Workshop on Time Series Representation Learning for Health. Li, Haobo; Pace, Alizée; Faltys, Martin; et al. (2023)
  Analyzing and grouping typical patient trajectories is crucial to understanding their health state, estimating prognosis, and determining optimal treatment. The increasing availability of electronic health records (EHRs) opens the opportunity to support clinicians in their decisions with machine learning solutions. We propose the Multi-scale Health-state Variational Auto-Encoder (MHealthVAE) to learn medically informative patient representations and allow meaningful subgroup detection from sparse EHRs. We derive a novel training objective to better capture health information and temporal trends in patient embeddings, and introduce new performance metrics to evaluate the clinical relevance of patient clustering results.
- Temporal Label Smoothing for Early Prediction of Adverse Events (Working Paper)
  arXiv. Yèche, Hugo; Pace, Alizée; Rätsch, Gunnar; et al. (2022)
  Models that can predict adverse events ahead of time with low false-alarm rates are critical to the acceptance of decision support systems in the medical community. This challenging machine learning task is typically treated as simple binary classification, with few bespoke methods proposed to leverage temporal dependency across samples. We propose Temporal Label Smoothing (TLS), a novel learning strategy that modulates smoothing strength as a function of proximity to the event of interest. This regularization technique reduces model confidence at the class boundary, where the signal is often noisy or uninformative, thus allowing training to focus on clinically informative data points away from this boundary region. From a theoretical perspective, we also show that our method can be framed as an extension of multi-horizon prediction, a learning heuristic proposed in other early prediction work. TLS empirically matches or outperforms the considered competing methods on various early prediction benchmark tasks. In particular, our approach significantly improves performance on clinically relevant metrics such as event recall at low false-alarm rates.
- Predictions, Policies, Rewards: Models of Decision-Making from Observational Data (Doctoral Thesis)
  Pace, Alizée (2025)
  While reinforcement learning has achieved success in solving well-defined decision-making problems, its application to optimizing complex human decisions remains a challenge. A promising use case is healthcare, where data-driven models could support the process of diagnosis or treatment. Modeling such decision-making problems is difficult due to the inherent complexity of real-world data, the high stakes of each decision's potential outcomes, and the ill-defined objectives of the tasks considered. In this thesis, we formalize and address these challenges in learning and optimizing models of decision-making.
  We structure our focus around three interdependent modeling paradigms: prediction, policy, and reward models. First, we propose to improve prediction models of real-world environments, specifically focusing on patient trajectories in electronic health records. Deep learning architectures still perform poorly on clinical time-series data, due to high variation across feature types and sampling rates. We address these issues by leveraging the semantic heterogeneity and temporal structure of the data. This results in novel model architectures and objective functions that improve the performance of predictive models for this data modality. Next, we explore how to derive policy models, describing what action to take in a given situation. The major challenge is to learn without direct environment interaction. Offline reinforcement learning and imitation learning are two frameworks for learning decision policies from observational data. We leverage these to obtain actionable policies that could be deployed for decision support – prioritizing reliability and interpretability. To ensure end-user adoption, effective policy models for such high-stakes applications must be robust to causal biases present in the data, and transparent in explaining the decision-making process. We design methods to achieve this and validate them on real and simulated medical tasks. Finally, we consider the task of designing reward functions aligned with human objectives. In healthcare, desirable outcomes could represent patient survival, quality-adjusted life years, or the prevention of specific adverse events. Rather than manually formalizing such complex, multifaceted objectives, we focus on learning reward models based on human feedback. As this data may be expensive to collect, we develop methods that maximize the sample efficiency of the learning process by generating simulated trajectories and synthetic preferences – always in a fully observational setting.
  Our approach allows for general and scalable applications, including reward learning for language model alignment. Motivated by healthcare but broadly applicable across domains, this thesis addresses fundamental challenges in learning models of human decision-making. It takes a step towards advancing the development of safe and effective decision support systems, helping to bridge the gap between machine learning research and real-world impact.
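The reward-learning setting the thesis abstract describes, fitting a reward model to pairwise preferences (possibly synthetic ones), is commonly formalized with a Bradley-Terry likelihood. Below is a minimal sketch in that spirit, assuming a linear reward over trajectory features and fully synthetic preference data; all names, dimensions, and the linear form are illustrative assumptions, not the thesis's actual models:

```python
import numpy as np

# Illustrative sketch of reward learning from pairwise preferences with
# a Bradley-Terry model: P(a preferred over b) = sigmoid(r(a) - r(b)).
# The linear reward, feature dimension, and synthetic data below are
# assumptions for illustration only.
rng = np.random.default_rng(0)
dim, n_pairs = 4, 500

w_true = rng.normal(size=dim)            # hidden "ground-truth" reward weights
X_a = rng.normal(size=(n_pairs, dim))    # features of first item in each pair
X_b = rng.normal(size=(n_pairs, dim))    # features of second item in each pair

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic (noisy) preference labels drawn from the Bradley-Terry model.
prefs = rng.random(n_pairs) < sigmoid((X_a - X_b) @ w_true)

# Fit reward weights by gradient ascent on the Bradley-Terry log-likelihood.
w = np.zeros(dim)
for _ in range(2000):
    p = sigmoid((X_a - X_b) @ w)
    grad = (X_a - X_b).T @ (prefs - p) / n_pairs
    w += 0.5 * grad

# The learned reward should rank pairs the way the generating reward does.
agreement = np.mean(((X_a - X_b) @ w > 0) == ((X_a - X_b) @ w_true > 0))
```

Here the learned reward is only checked for ranking agreement with the generating reward; the sample-efficiency techniques the thesis mentions (simulated trajectories, synthetic preferences) would sit on top of this basic estimator.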
- A comprehensive ML-based Respiratory Monitoring System for Physiological Monitoring & Resource Planning in the ICU (Working Paper)
  medRxiv. Hüser, Matthias; Lyu, Xinrui; Faltys, Martin; et al. (2024)
  Respiratory failure (RF) is a frequent occurrence in critically ill patients and is associated with significant morbidity and mortality as well as resource use. To improve the monitoring and management of RF in intensive care unit (ICU) patients, we used machine learning to develop a monitoring system covering the entire management cycle of RF, from early detection and monitoring to assessment of readiness for extubation and prediction of extubation failure risk. For ICU patients in the study cohort, the system predicts 80% of RF events at a precision of 45%, with 65% identified 10h before the onset of an RF event. This significantly improves upon a standard clinical baseline based on the SpO2/FiO2 ratio. After a careful analysis of ICU differences, the RF alarm system was externally validated, showing similar performance for patients in the external validation cohort. Our system also provides a risk score for extubation failure for patients who are clinically ready to extubate, and we illustrate how such a risk score could be used to extubate patients earlier in certain scenarios. Moreover, we demonstrate that our system, which closely monitors respiratory failure, ventilation need, and extubation readiness for individual patients, can also be used for ICU-level ventilator resource planning. In particular, we predict ventilator use 8-16h into the future, corresponding to the next ICU shift, with a mean absolute error of 0.4 ventilators per 10 patients of effective ICU capacity.
- Temporal Label Smoothing for Early Event Prediction (Conference Paper)
  Proceedings of the 40th International Conference on Machine Learning (PMLR). Yèche, Hugo; Pace, Alizée; Rätsch, Gunnar; et al. (2023)
  Models that can predict the occurrence of events ahead of time with low false-alarm rates are critical to the acceptance of decision support systems in the medical community. This challenging task is typically treated as a simple binary classification, ignoring temporal dependencies between samples, whereas we propose to exploit this structure. We first introduce a common theoretical framework unifying dynamic survival analysis and early event prediction. Following an analysis of objectives from both fields, we propose Temporal Label Smoothing (TLS), a simpler yet best-performing method that preserves prediction monotonicity over time. By focusing the objective on areas with a stronger predictive signal, TLS improves performance over all baselines on two large-scale benchmark tasks. Gains are particularly notable along clinically relevant measures, such as event recall at low false-alarm rates. TLS reduces the number of missed events by up to a factor of two over previously used approaches in early event prediction.
- POETREE: Interpretable Policy Learning with Adaptive Decision Trees (Conference Paper)
  The Tenth International Conference on Learning Representations (ICLR 2022). Pace, Alizée; Chan, Alex; van der Schaar, Mihaela (2022)
  Building models of human decision-making from observed behaviour is critical to better understand, diagnose and support real-world policies such as clinical care. As established policy learning approaches remain focused on imitation performance, they fall short of explaining the demonstrated decision-making process. Policy Extraction through decision Trees (POETREE) is a novel framework for interpretable policy learning, compatible with fully-offline and partially-observable clinical decision environments, which builds probabilistic tree policies determining physician actions based on patients' observations and medical history. Fully-differentiable tree architectures are grown incrementally during optimization to adapt their complexity to the modelling task, and learn a representation of patient history through recurrence, resulting in decision tree policies that adapt over time with patient information. This policy learning method outperforms the state-of-the-art on real and synthetic medical datasets, both in understanding, quantifying and evaluating observed behaviour and in accurately replicating it, with potential to improve future decision support systems.
- On the Importance of Step-wise Embeddings for Heterogeneous Clinical Time-Series (Conference Paper)
  Proceedings of the 3rd Machine Learning for Health Symposium (PMLR). Kuznetsova, Rita; Pace, Alizée; Burger, Manuel; et al. (2023)
  Recent advances in deep learning architectures for sequence modeling have not fully transferred to tasks handling time-series from electronic health records. In particular, in problems related to the Intensive Care Unit (ICU), the state-of-the-art remains to tackle sequence classification in a tabular manner with tree-based methods. Recent findings in deep learning for tabular data are now surpassing these classical methods by better handling the severe heterogeneity of input features. Given the similar level of feature heterogeneity exhibited by ICU time-series, and motivated by these findings, we explore these novel methods' impact on clinical sequence modeling tasks. By jointly using such advances in deep learning for tabular data, we underscore the importance of step-wise embeddings in time-series modeling, which remain unexplored in machine learning methods for clinical data. On a variety of clinically relevant tasks from two large-scale ICU datasets, MIMIC-III and HiRID, our work provides an exhaustive analysis of state-of-the-art methods for tabular time-series as time-step embedding models, showing overall performance improvement. In particular, we evidence the importance of feature grouping in clinical time-series, with significant performance gains when considering features within predefined semantic groups in the step-wise embedding module.
- Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding (Working Paper)
  arXiv. Pace, Alizée; Yèche, Hugo; Schölkopf, Bernhard; et al. (2023)
  A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the observed outcomes. Hidden confounding can compromise the validity of any causal conclusion drawn from data and presents a major obstacle to effective offline RL. In the present paper, we tackle the problem of hidden confounding in the nonidentifiable setting. We propose a definition of uncertainty due to hidden confounding bias, termed delphic uncertainty, which uses variation over world models compatible with the observations, and differentiate it from the well-known epistemic and aleatoric uncertainties. We derive a practical method for estimating the three types of uncertainties, and construct a pessimistic offline RL algorithm to account for them. Our method does not assume identifiability of the unobserved confounders, and attempts to reduce the amount of confounding bias. We demonstrate through extensive experiments and ablations the efficacy of our approach on a sepsis management benchmark, as well as on electronic health records. Our results suggest that nonidentifiable hidden confounding bias can be mitigated to improve offline RL solutions in practice.
- Uncertainty-Penalized Direct Preference Optimization (Conference Paper)
  NeurIPS 2024 Workshop on Fine-Tuning in Modern Machine Learning: Principles and Scalability. Houliston, Sam; Pace, Alizée; Immer, Alexander; et al. (2024)
  Aligning Large Language Models (LLMs) to human preferences in content, style, and presentation is challenging, in part because preferences are varied, context-dependent, and sometimes inherently ambiguous. While successful, Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) are prone to the issue of proxy reward overoptimization. Analysis of the DPO loss reveals a critical need for regularization of mislabeled or ambiguous preference pairs to avoid reward hacking. In this work, we develop a pessimistic framework for DPO by introducing preference uncertainty penalization schemes, inspired by offline reinforcement learning. The penalization serves as a correction to the loss that attenuates the loss gradient for uncertain samples. Evaluation of the methods is performed with GPT2 Medium on the Anthropic-HH dataset, using a model ensemble to obtain uncertainty estimates, and shows improved overall performance compared to vanilla DPO, as well as better completions on prompts from high-uncertainty chosen/rejected responses.
- Reinforcement Learning for Heart Failure Treatment Optimization in the Intensive Care Unit (Conference Paper)
  2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Drudi, Cristian; Fechner, Moritz; Mollura, Maximiliano; et al. (2024)
  Heart failure (HF) is a major public health issue. Despite improvements in treatment, mortality rates among HF patients remain high, especially for those in the intensive care unit (ICU), who experience the highest in-hospital mortality rates. Clinical guidelines for the treatment of HF provide general recommendations that, however, often lack strong evidence derived from randomized controlled trials (RCTs). Furthermore, they can only provide general guidance and fail to determine personalized strategies. Previous literature has shown that reinforcement learning (RL) is effective in determining optimal treatment recommendations in critical care settings. In this study, we used RL to address uncertainty in the administration of vasopressors and diuretics while considering individual patient characteristics. We utilized data from the MIMIC-IV database to demonstrate the potential of RL in improving treatment strategies for HF. The study indicates that RL achieved a significant mortality reduction of ≈20%. However, further research is necessary due to the lack of external validation and limitations in policy evaluation. Clinical relevance: This study adds to the growing body of evidence demonstrating the potential of RL in identifying optimal treatment strategies in critical care settings. Specifically, the policy estimated by RL reduced mortality rates of HF patients in the ICU by ≈20% compared to the observed clinician policy.
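The study above derives a treatment policy from logged ICU data without environment interaction. As an illustration of the generic recipe such work builds on, here is a minimal sketch of tabular fitted Q-iteration over a batch of simulated transitions; the toy MDP, its states, and its actions are invented stand-ins, not the study's MIMIC-IV setup:

```python
import numpy as np

# Illustrative toy MDP standing in for the ICU treatment setting:
# discrete patient states, discrete treatment actions, and a batch of
# logged transitions (s, a, r, s') collected under a behaviour policy.
# Everything here is invented for illustration.
n_states, n_actions, gamma, N = 5, 2, 0.9, 5_000
rng = np.random.default_rng(1)

# Random ground-truth dynamics and rewards, plus a random logging policy.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))
s = rng.integers(n_states, size=N)
a = rng.integers(n_actions, size=N)
r = R[s, a] + 0.1 * rng.normal(size=N)
s_next = np.array([rng.choice(n_states, p=P[si, ai]) for si, ai in zip(s, a)])

# Tabular fitted Q-iteration over the fixed batch: repeatedly regress
# Bellman targets onto each (state, action) cell by averaging.
masks = {(si, ai): (s == si) & (a == ai)
         for si in range(n_states) for ai in range(n_actions)}
Q = np.zeros((n_states, n_actions))
for _ in range(200):
    targets = r + gamma * Q[s_next].max(axis=1)
    for (si, ai), mask in masks.items():
        if mask.any():
            Q[si, ai] = targets[mask].mean()

# Greedy policy: the estimated treatment recommendation per state.
greedy_policy = Q.argmax(axis=1)
```

The greedy policy read off the learned Q-values plays the role of the RL policy compared against the clinician policy in the paper; evaluating such a policy offline is exactly the limitation the abstract flags.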