Yanke Li


Last Name

Li

First Name

Yanke

Organisational unit

03654 - Riener, Robert / Riener, Robert

Search Results

Publications 1 - 6 of 6
  • Li, Haixin; Li, Yanke; Paez-Granados, Diego (2025)
    We introduce KarmaTS, an interactive framework for constructing lag-indexed, executable spatiotemporal causal graphical models for multivariate time series (MTS) simulation. Motivated by the challenge of access-restricted physiological data, KarmaTS generates synthetic MTS with known causal dynamics and augments real-world datasets with expert knowledge. The system constructs a discrete-time structural causal process (DSCP) by combining expert knowledge and algorithmic proposals in a mixed-initiative, human-in-the-loop workflow. The resulting DSCP supports simulation and causal interventions, including those under user-specified distribution shifts. KarmaTS handles mixed variable types, contemporaneous and lagged edges, and modular edge functionals ranging from parameterizable templates to neural network models. Together, these features enable flexible validation and benchmarking of causal discovery algorithms through expert-informed simulation.
  • Fuchs, Bertram; Li, Yanke; Paez-Granados, Diego (2024)
  • Li, Yanke; Scheel-Sailer, Anke; Riener, Robert; et al. (2024)
    Research Square
    Developing machine learning (ML) methods for healthcare predictive modeling requires absolute explainability and transparency to build trust and accountability. Graphical models (GM) are key tools for this but face challenges like small sample sizes, mixed variables, and latent confounders. This paper presents a novel learning framework addressing these challenges by integrating latent variables using fast causal inference (FCI), accommodating mixed variables with predictive permutation conditional independence tests (PPCIT), and employing a systematic graphical embedding approach leveraging expert knowledge. This method ensures a transparent model structure and an explainable approach to feature selection and modeling, achieving competitive prediction performance. For real-world validation, data on hospital-acquired pressure injuries (HAPI) among individuals with spinal cord injuries (SCIs) were used, where the approach achieved a balanced accuracy of 0.941 and an AUC of 0.983, outperforming most benchmarks. The PPCIT method also demonstrated superior accuracy and scalability over other benchmarks in causal discovery validation on synthetic datasets that closely resemble our real dataset. This holistic framework effectively addresses the challenges of mixed variables and explainable predictive modeling for disease onset, which is crucial for enabling transparency and interpretability in ML-based healthcare.
  • Cisnal, Ana; Li, Yanke; Fuchs, Bertram; et al. (2024)
    IEEE Journal of Biomedical and Health Informatics
    Current blood pressure (BP) estimation methods have not achieved an accurate and adaptable approach for ambulatory diagnosis and monitoring applications of populations at risk of cardiovascular disease, generally due to a limited sample size. This paper introduces an algorithm for BP estimation solely reliant on photoplethysmography (PPG) signals and demographic features. It automatically obtains signal features and employs the Markov Blanket (MB) feature selection to discern informative and transmissible features, achieving a robust space adaptable to the population shift. This approach was validated with the Aurora-BP database, comprising ambulatory wearable cuffless BP measurements for over 500 individuals. After evaluating several machine-learning regression methods, Gradient Boosting emerged as the most effective. According to the MB feature selection, temporal, frequency, and demographic features ranked highest in importance, while statistical ones were deemed non-significant. A comparative assessment of a generic model (trained on unclassified BP data) and specialized models (tailored to each distinct BP population) demonstrated a consistent superiority of our proposed MB feature space with a mean absolute error of 10.2 mmHg (0.28) for systolic BP and 6.7 mmHg (0.18) for diastolic BP on the whole dataset. Moreover, we present a first comparison of in-clinic vs. ambulatory models, with performance significantly lower for the latter, with a drop of 2.85 mmHg in systolic (p < 0.0001) and 2.82 mmHg in diastolic (p < 0.0001) estimation errors. This work contributes to a resilient understanding of BP estimation algorithms from PPG signals, providing causal features in the signal and quantifying the disparities between ambulatory and in-clinic measurements.
  • Cisnal, Ana; Li, Yanke; Fuchs, Bertram; et al. (2023)
    TechRxiv
    Current blood pressure (BP) estimation methods have not achieved an accurate and adaptable approach for application in populations at risk of cardiovascular disease, with generally limited sample sizes. Here, we introduce an algorithm for BP estimation solely reliant on photoplethysmography (PPG) signals and demographic features. Our approach automatically obtains signal features and employs the Markov Blanket (MB) feature selection to discern informative and transmissible features, achieving a robust space adaptable to the population shift. We validated our approach with the Aurora-BP database, comprising ambulatory wearable cuffless BP measurements for over 500 individuals. After evaluating several machine-learning regression methods, Gradient Boosting emerged as the most effective. The comparative assessment encompassed both a generic model (trained on unclassified BP data) and specialized models (tailored to each distinct BP population), with the former demonstrating consistent superiority with a mean absolute error (MAE) of 10.2 mmHg (0.28) for systolic BP and 6.7 mmHg (0.18) for diastolic BP on the whole dataset. Moreover, a comparison of in-clinic and ambulatory model performance showed a significant decrease in accuracy for the latter of 2.85 mmHg in systolic (p < 0.0001, F-value = 32764.76) and 2.82 mmHg in diastolic (p < 0.0001, F-value = 65675.36) estimation errors. Our work contributes to a resilient BP estimation algorithm from PPG signals, underscoring the advantages of causal feature selection and quantifying the disparities between ambulatory and in-clinic measurements.
  • Li, Yanke; Scheel-Sailer, Anke; Riener, Robert; et al. (2024)
    Scientific Reports
    Developing machine learning (ML) methods for healthcare predictive modeling requires absolute explainability and transparency to build trust and accountability. Graphical models (GM) are key tools for this but face challenges like small sample sizes, mixed variables, and latent confounders. This paper presents a novel learning framework addressing these challenges by integrating latent variables using fast causal inference (FCI), accommodating mixed variables with predictive permutation conditional independence tests (PPCIT), and employing a systematic graphical embedding approach leveraging expert knowledge. This method ensures a transparent model structure and an explainable approach to feature selection and modeling, achieving competitive prediction performance. For real-world validation, data on hospital-acquired pressure injuries (HAPI) among individuals with spinal cord injury (SCI) were used, where the approach achieved a balanced accuracy of 0.941 and an AUC of 0.983, outperforming most benchmarks. The PPCIT method also demonstrated superior accuracy and scalability over other benchmarks in causal discovery validation on synthetic datasets that closely resemble our real dataset. This holistic framework effectively addresses the challenges of mixed variables and explainable predictive modeling for disease onset, which is crucial for enabling transparency and interpretability in ML-based healthcare.
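
As a rough illustration of the idea in the KarmaTS abstract (2025) above, a discrete-time structural causal process with lagged and contemporaneous edges that supports simulation and interventions, the following is a minimal sketch. It is not the KarmaTS implementation or API: the two-variable graph, the linear edge functions, the coefficients, and the names `simulate`, `intervene_x`, and `noise_scale` are all assumptions chosen only to make the concept concrete.

```python
import random


def simulate(n_steps, seed=0, intervene_x=None, noise_scale=1.0):
    """Simulate a toy two-variable discrete-time structural causal process.

    Hypothetical structural equations (assumed for illustration only):
        x[t] = 0.8 * x[t-1] + e_x[t]               # lagged edge x -> x
        y[t] = 0.5 * x[t] + 0.3 * y[t-1] + e_y[t]  # contemporaneous x -> y, lagged y -> y

    If intervene_x is not None, the equation for x is replaced by the
    constant intervene_x at every step, i.e. do(x[t] = intervene_x).
    """
    rng = random.Random(seed)
    x_prev, y_prev = 0.0, 0.0
    xs, ys = [], []
    for _ in range(n_steps):
        # Draw both noise terms every step so that observational and
        # interventional runs with the same seed share the same noise.
        e_x = rng.gauss(0.0, noise_scale)
        e_y = rng.gauss(0.0, noise_scale)
        if intervene_x is None:
            x = 0.8 * x_prev + e_x
        else:
            x = intervene_x  # the intervention cuts x off from its parents
        y = 0.5 * x + 0.3 * y_prev + e_y
        xs.append(x)
        ys.append(y)
        x_prev, y_prev = x, y
    return xs, ys


if __name__ == "__main__":
    obs_x, obs_y = simulate(500, seed=42)
    do_x, do_y = simulate(500, seed=42, intervene_x=2.0)
    print("mean y (observational):", sum(obs_y) / len(obs_y))
    print("mean y under do(x=2):  ", sum(do_y) / len(do_y))
```

Because the ground-truth edges are known by construction, series generated this way can serve as a benchmark for a causal discovery algorithm, which is the use case the abstract describes. In this toy model the long-run mean of y under do(x = 2) is 0.5 · 2 / (1 − 0.3) ≈ 1.43, which gives a simple check on the simulated output.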