Samuel Ruiperez Campillo


Last Name

Ruiperez Campillo

First Name

Samuel

Organisational unit

09670 - Vogt, Julia / Vogt, Julia

Search Results

Publications 1 - 10 of 35
  • Ors-Quixal, R. Teodoro; Ruiperez Campillo, Samuel; Castells-Ramón, Francisco; et al. (2024)
    2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
    Standard diagnostic methods for evaluating the severity of brain injuries resulting from cardiac arrest, such as the Glasgow Coma Scale, exhibit subjective biases that lead to potentially fatal misclassifications, where life-support systems are prematurely withdrawn from patients who might otherwise recover. This study utilizes an open dataset from the International Cardiac Arrest Research Consortium to develop and evaluate a 3D convolutional neural network (CNN) model for classifying outcomes in comatose patients after cardiac arrest. The electroencephalographic (EEG) signals from the dataset are preprocessed by resampling, filtering, and standardizing signal length (10 seconds) and channel count. The model’s architecture comprises 3D convolutional neural networks with subsequent layers for vectorization, compression, and further automatic feature extraction. Evaluation metrics focus on the area under the receiver operating characteristic curve, confusion matrix, accuracy, and F1 score. Results show that the 3D-CNN model outperforms existing 2D-CNN models in classifying outcomes for comatose patients, exhibiting a higher area under the receiver operating characteristic curve.
  • Feng, Ruibin; Brennan, Kelly A.; Azizi, Zahra; et al. (2025)
    Circulation: Arrhythmia and Electrophysiology
    BACKGROUND: Large language models (LLMs) such as Chat Generative Pre-trained Transformer (ChatGPT) excel at interpreting unstructured data from public sources, yet are limited when responding to queries on private repositories, such as electronic health records (EHRs). We hypothesized that prompt engineering could enhance the accuracy of LLMs for interpreting EHR data without requiring domain knowledge, thus expanding their utility for patients and personalized diagnostics. METHODS: We designed and systematically tested prompt engineering techniques to improve the ability of LLMs to interpret EHRs for nuanced diagnostic questions, referenced to a panel of medical experts. In 490 full-text EHR notes from 125 patients with prior life-threatening heart rhythm disorders, we asked GPT-4-turbo to identify recurrent arrhythmias distinct from prior events and tested 220 563 queries. To provide context, results were compared with rule-based natural language processing and Bidirectional Encoder Representations from Transformer-based language models. Experiments were repeated for 2 additional LLMs. RESULTS: In an independent hold-out set of 389 notes, GPT-4-turbo had a balanced accuracy of 64.3%±4.7% out-of-the-box at baseline. This increased when asking GPT-4-turbo to provide a rationale for its answers, a structured data output, and in-context exemplars, to a balanced accuracy of 91.4%±3.8% (P<0.05). This surpassed the traditional logic-based natural language processing and BERT-based models (P<0.05). Results were consistent for GPT-3.5-turbo and Jurassic-2 LLMs. CONCLUSIONS: The use of prompt engineering strategies enables LLMs to identify clinical end points from EHRs with an accuracy that surpassed natural language processing and approximated experts, yet without the need for expert knowledge. These approaches could be applied to LLM queries for other domains, to facilitate automated analysis of nuanced data sets with high accuracy by nonexperts.
  • Agostini, Andrea; Laguna Cillero, Sonia; Ryser, Alain; et al. (2025)
    ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences
    Building generalizable medical AI systems requires pretraining strategies that are data-efficient and domain-aware. Unlike internet-scale corpora, clinical datasets such as MIMIC-CXR offer limited image counts and scarce annotations, but exhibit rich internal structure through multi-view imaging. We propose a self-supervised framework that leverages the inherent structure of medical datasets. Specifically, we treat paired chest X-rays (i.e., frontal and lateral views) as natural positive pairs, learning to reconstruct each view from sparse patches while aligning their latent embeddings. Our method requires no textual supervision and produces informative representations. Evaluated on MIMIC-CXR, our method shows strong performance compared to supervised objectives and baselines trained without leveraging structure. This work provides a lightweight, modality-agnostic blueprint for domain-specific pretraining where data is structured but scarce.
  • Chen, Boqi; Vincent-Cuaz, Cédric; Schoenpflug, Lydia A.; et al. (2026)
    Lecture Notes in Computer Science ~ Medical Image Computing and Computer Assisted Intervention – MICCAI 2025: 28th International Conference, Daejeon, South Korea, September 23–27, 2025, Proceedings, Part VI
    Vision foundation models (FMs) are accelerating the development of digital pathology algorithms and transforming biomedical research. These models learn, in a self-supervised manner, to represent histological features in highly heterogeneous tiles extracted from whole-slide images (WSIs) of real-world patient samples. The performance of these FMs is significantly influenced by the size, diversity, and balance of the pre-training data. However, data selection has been primarily guided by expert knowledge at the WSI level, focusing on factors such as disease classification and tissue types, while largely overlooking the granular details available at the tile level. In this paper, we investigate the potential of unsupervised automatic data curation at the tile level, taking into account 350 million tiles. Specifically, we apply hierarchical clustering trees to pre-extracted tile embeddings, allowing us to sample balanced datasets uniformly across the embedding space of the pretrained FM. We further identify that these datasets are subject to a trade-off between size and balance, potentially compromising the quality of representations learned by FMs, and propose tailored batch sampling strategies to mitigate this effect. We demonstrate the effectiveness of our method through improved performance on a diverse range of clinically relevant downstream tasks.
  • Ruiperez Campillo, Samuel; Reiss, Michael; Ramírez, Elisa; et al. (2024)
    Biocybernetics and Biomedical Engineering
    Background and motivation: The application of artificial intelligence in medical research, particularly unsupervised learning techniques, has shown promising potential. Medical time series data poses a unique challenge for analysis due to its complexity. Existing unsupervised learning methods often fail to effectively classify these variations, highlighting a gap in current approaches. We introduce a methodological clustering classification framework designed to accurately handle such data, aiming for improved classification tasks in biomedical signals. Methods: To address these challenges, we introduce a novel approach for the analysis and classification of medical time series data. Our method integrates agglomerative hierarchical clustering with Hilbert vector space representations of medical signals and biological sequences. We rigorously define the mathematical principles and conduct evaluations using simulations of cardiac signals, real-world neural signal datasets, open-source protein sequences, and the MNIST dataset for illustrative purposes. Results: The proposed method exhibited a 96% success rate in classifying protein sequences by function and effectively identifying families within a large protein set. In cardiac signal analysis, it retained 0.996 variance in a condensed 6-dimensional space, accurately classifying 87.4% of simulated atrial flutter groups and 99.91% of main groups when excluding conduction direction. For neural signals, it demonstrated near-perfect tracking accuracy of neural activity in mouse brain recordings, as confirmed by expert evaluations. Conclusion: Our proposed method offers a novel, translational approach for the treatment and classification of medical and biological time series, addressing some of the prevalent challenges in the field and paving the way for more reliable and effective biomedical signal analysis.
  • Crespo, Marina; Ruiperez Campillo, Samuel; Casado-Arroyo, Ruben; et al. (2023)
    2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
    The development of high-density multielectrode catheters has significantly advanced cardiac electrophysiology mapping. High-density grid catheters have enabled the creation of a novel technique for reconstructing electrogram (EGM) signals known as "omnipole," which is believed to be more reliable than other methods, especially in terms of orientation independence. This study aims to evaluate how distance affects the omnipolar reconstruction of EGMs by comparing different configurations. Using an animal setup of perfused isolated rabbit hearts, recordings were taken using an ad hoc high-density epicardial multielectrode catheter. Inter-electrode distances ranging from 1 to 4 mm were analysed for their effect on the quality of resulting EGMs. Two biomarkers were computed to evaluate the robustness of the reconstructions: the areas contained within the bipolar loops and the amplitudes of the omnipoles. We hypothesised that both bipolar and omnipolar electrograms would be more robust at shorter inter-electrode distances. The results showed that an increase in distance triggers an increase in loop areas and amplitudes, which supports the hypothesis. This finding provides a more reliable estimate of wavefront propagation for the cross-omnipolar reconstruction method. These results emphasise the importance of distance in cardiac electrophysiology mapping and provide valuable insights into the use of high-density multielectrode catheters for EGM reconstruction.
  • Segarra, Izan; Cebrián, Antonio; Ruiperez Campillo, Samuel; et al. (2023)
    2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
    The present study aims to design and fabricate a system capable of generating heterogeneities on the epicardial surface of an isolated rabbit heart perfused in a Langendorff system. The system consists of thermoelectric modules that can be independently controlled by the developed hardware, thereby allowing for the generation of temperature gradients on the epicardial surface, resulting in conduction slowing akin to heterogeneities of pathological origin. A comprehensive analysis of the system's viability was performed through modeling and thermal simulation, and its practicality was validated through preliminary tests conducted at the experimental cardiac electrophysiology laboratory of the University of Valencia. The design process involved the use of Fusion 360 for 3D designs, MATLAB/Simulink for algorithms and block diagrams, LTSpice and Altium Designer for schematic captures and PCB design, and the integration of specialized equipment for animal experimentation. The objective of the study was to efficiently capture epicardial recordings under varying conditions.
  • Pantelidis, Panteleimon; Dilaveris, Polychronis; Ruiperez Campillo, Samuel; et al. (2025)
    Biomedicines
    Artificial intelligence (AI) is transforming cardiovascular medicine by enabling the analysis of high-dimensional biomedical data with unprecedented precision. Initially employed to automate human tasks such as electrocardiogram (ECG) interpretation and imaging segmentation, AI's true potential lies in uncovering hidden disease data patterns, predicting long-term cardiovascular risk, and personalizing treatments. Unlike human cognition, which excels in certain tasks but is limited by memory and processing constraints, AI integrates multimodal data sources, including ECG, echocardiography, cardiac magnetic resonance (CMR) imaging, genomics, and wearable sensor data, to generate novel clinical insights. AI models have demonstrated remarkable success in early disease detection, such as predicting heart failure from standard ECGs before symptom onset, distinguishing genetic cardiomyopathies, and forecasting arrhythmic events. However, several challenges persist, including AI's lack of contextual understanding in most of these tasks, its "black-box" nature, and biases in training datasets that may contribute to disparities in healthcare delivery. Ethical considerations and regulatory frameworks are evolving, with governing bodies establishing guidelines for AI-driven medical applications. To fully harness the potential of AI, interdisciplinary collaboration among clinicians, data scientists, and engineers is essential, alongside open science initiatives to promote data accessibility and reproducibility. Future AI models must go beyond task automation, focusing instead on augmenting human expertise to enable proactive, precision-driven cardiovascular care. By embracing AI's computational strengths while addressing its limitations, cardiology is poised to enter an era of transformative innovation beyond traditional diagnostic and therapeutic paradigms.
  • Kolk, Maarten Z.H.; Ruiperez Campillo, Samuel; Alvarez Florez, Laura; et al. (2024)
    eBioMedicine
    Background: Risk stratification for ventricular arrhythmias currently relies on static measurements that fail to adequately capture dynamic interactions between arrhythmic substrate and triggers over time. We trained and internally validated a dynamic machine learning (ML) model and neural network that extracted features from longitudinally collected electrocardiograms (ECG), and used these to predict the risk of malignant ventricular arrhythmias. Methods: A multicentre study in patients implanted with an implantable cardioverter-defibrillator (ICD) between 2007 and 2021 in two academic hospitals was performed. Variational autoencoders (VAEs), which combine neural networks with variational inference principles, and can learn patterns and structure in data without explicit labelling, were trained to encode the mean ECG waveforms from the limb leads into 16 variables. Supervised dynamic ML models using these latent ECG representations and clinical baseline information were trained to predict malignant ventricular arrhythmias treated by the ICD. Model performance was evaluated on a hold-out set, using time-dependent receiver operating characteristic (ROC) and calibration curves. Findings: 2942 patients (61.7 ± 13.9 years, 25.5% female) were included, with a total of 32,129 ECG recordings during a mean follow-up of 43.9 ± 35.9 months. The mean time-varying area under the ROC curve for the dynamic model was 0.738 ± 0.07, compared to 0.639 ± 0.03 for a static (i.e. baseline-only) model. Feature analyses indicated dynamic changes in latent ECG representations, particularly those affecting the T-wave morphology, were of highest importance for model predictions. Interpretation: Dynamic ML models and neural networks effectively leverage routinely collected longitudinal ECG recordings for personalised and updated predictions of malignant ventricular arrhythmias, outperforming static models.
    Funding: This publication is part of the project DEEP RISK ICD (with project number 452019308) of the research programme Rubicon which is (partly) financed by the Dutch Research Council (NWO). This research is partly funded by the Amsterdam Cardiovascular Sciences (personal grant F.V.Y.T).
  • Kolk, Maarten Z.H.; Ruiperez Campillo, Samuel; Deb, Brototo; et al. (2023)
    EP Europace
    Aims: Left ventricular ejection fraction (LVEF) is suboptimal as a sole marker for predicting sudden cardiac death (SCD). Machine learning (ML) provides new opportunities for personalized predictions using complex, multimodal data. This study aimed to determine if risk stratification for implantable cardioverter-defibrillator (ICD) implantation can be improved by ML models that combine clinical variables with 12-lead electrocardiogram (ECG) time-series features. Methods and results: A multicentre study of 1010 patients (64.9 ± 10.8 years, 26.8% female) with ischaemic, dilated, or non-ischaemic cardiomyopathy, and LVEF ≤ 35% implanted with an ICD between 2007 and 2021 for primary prevention of SCD in two academic hospitals was performed. For each patient, a raw 12-lead, 10-s ECG was obtained within 90 days before ICD implantation, and clinical details were collected. Supervised ML models were trained and validated on a development cohort (n = 550) from Hospital A to predict ICD non-arrhythmic mortality at three-year follow-up (i.e. mortality without prior appropriate ICD-therapy). Model performance was evaluated on an external patient cohort from Hospital B (n = 460). At three-year follow-up, 16.0% of patients had died, with 72.8% meeting criteria for non-arrhythmic mortality. Extreme gradient boosting models identified patients with non-arrhythmic mortality with an area under the receiver operating characteristic curve (AUROC) of 0.90 [95% confidence intervals (CI) 0.80–1.00] during internal validation. In the external cohort, the AUROC was 0.79 (95% CI 0.75–0.84). Conclusions: ML models combining ECG time-series features and clinical variables were able to predict non-arrhythmic mortality within three years after device implantation in a primary prevention population, with robust performance in an independent cohort.
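
Several of the abstracts listed above report balanced accuracy or the area under the receiver operating characteristic curve (AUROC). For readers unfamiliar with these metrics, the following is a minimal sketch of how each can be computed for binary labels. It is an illustration only and is not taken from any of the listed publications; the function names `balanced_accuracy` and `auroc` and the toy labels and scores are assumptions made here.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    pos = sum(1 for t, _ in pairs if t == 1)
    neg = sum(1 for t, _ in pairs if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return 0.5 * (tp / pos + tn / neg)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win. Quadratic in sample size, so this
    is meant for illustration rather than large cohorts."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for this example.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    predictions = [1, 1, 0, 0, 0, 0, 0, 1]
    print(balanced_accuracy(labels, predictions))
    print(auroc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

Balanced accuracy weights both classes equally, which matters when positive cases are rare. AUROC is threshold-free and depends only on how the scores rank positives against negatives.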