Alexander Marx



Last Name: Marx

First Name: Alexander

Search Results

Publications 1 - 10 of 17
  • Leutheuser, Heike; Bartholet, Marc; Marx, Alexander; et al. (2024)
    Frontiers in Medicine
    Children with type 1 diabetes (T1D) frequently have nocturnal hypoglycemia, with daytime physical activity being the most important risk factor. The risk for late post-exercise hypoglycemia depends on various factors and is difficult to anticipate. The availability of continuous glucose monitoring (CGM) enabled the development of various machine learning approaches for nocturnal hypoglycemia prediction over different prediction horizons. Studies focusing on nocturnal hypoglycemia prediction in children are scarce, and none, to the best of the authors' knowledge, investigates the effect of previous physical activity. The primary objective of this work was to assess the risk of hypoglycemia throughout the night (prediction horizon 9 h) associated with physical activity in children with T1D using data from a structured setting. Continuous glucose and physiological data from a sports day camp for children with T1D were input for logistic regression, random forest, and deep neural network models. Results were evaluated using the F2 score, adding more weight to misclassifications as false negatives. Data of 13 children (4 female, mean age 11.3 years) were analyzed. Nocturnal hypoglycemia occurred in 18 of the 66 included nights. Random forest using only glucose data achieved a sensitivity of 71.1% and a specificity of 75.8% for nocturnal hypoglycemia prediction. Predicting the risk of nocturnal hypoglycemia for the upcoming night at bedtime is clinically highly relevant, as it allows appropriate actions to be taken, lightening the burden for children with T1D and their families.
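The F2 score used above is the F-beta score with beta = 2, which weights recall (missed hypoglycemia nights, i.e. false negatives) more heavily than precision. A minimal sketch of the metric, not taken from the paper, on hypothetical binary night labels:

```python
def fbeta(y_true, y_pred, beta=2.0):
    """F-beta score; beta > 1 penalizes false negatives
    (missed events) more heavily than false positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# toy example: 3 nights with hypoglycemia, one missed, one false alarm
print(fbeta([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]))
```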
  • Czyż, Paweł; Grabowski, Frederic; Vogt, Julia E.; et al. (2023)
    arXiv
    Mutual information quantifies the dependence between two random variables and remains invariant under diffeomorphisms. In this paper, we explore the pointwise mutual information profile, an extension of mutual information that maintains this invariance. We analytically describe the profiles of multivariate normal distributions and introduce the family of fine distributions, for which the profile can be accurately approximated using Monte Carlo methods. We then show how fine distributions can be used to study the limitations of existing mutual information estimators, investigate the behavior of neural critics used in variational estimators, and understand the effect of experimental outliers on mutual information estimation. Finally, we show how fine distributions can be used to obtain model-based Bayesian estimates of mutual information, suitable for problems with available domain expertise in which uncertainty quantification is necessary.
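For the multivariate normal case mentioned above, both the mutual information and the pointwise mutual information (PMI) profile have closed forms; averaging Monte Carlo samples of the PMI recovers the MI. A small illustrative sketch for a standard bivariate normal with correlation rho (the formulas are standard results, not code from the paper):

```python
import math
import random

def gaussian_mi(rho):
    """Closed-form MI (in nats) of a standard bivariate
    normal with correlation rho: -0.5 * log(1 - rho^2)."""
    return -0.5 * math.log(1.0 - rho * rho)

def pmi_profile_sample(rho, n=50_000, seed=0):
    """Monte Carlo sample of the pointwise mutual information
    log p(x, y) / (p(x) p(y)); its mean estimates the MI."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    vals = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + s * rng.gauss(0, 1)
        # PMI of the standard bivariate normal, from its density
        pmi = (-0.5 * math.log(1 - rho * rho)
               - (rho * rho * x * x - 2 * rho * x * y + rho * rho * y * y)
               / (2 * (1 - rho * rho)))
        vals.append(pmi)
    return vals

rho = 0.8
profile = pmi_profile_sample(rho)
print(gaussian_mi(rho), sum(profile) / len(profile))
```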
  • Ryser, Alain; Sutter, Thomas M.; Marx, Alexander; et al. (2025)
    Transactions on Machine Learning Research
    Anomaly detection focuses on identifying samples that deviate from the norm. Discovering informative representations of normal samples is crucial to detecting anomalies effectively. Recent self-supervised methods have successfully learned such representations by employing prior knowledge about anomalies to create synthetic outliers during training. However, we often do not know what to expect from unseen data in specialized real-world applications. In this work, we address this limitation with our new approach, Con2, which leverages prior knowledge about symmetries in normal samples to observe the data in different contexts. Con2 consists of two parts: Context Contrasting clusters representations according to their context, while Content Alignment encourages the model to capture semantic information by aligning the positions of normal samples across clusters. The resulting representation space allows us to detect anomalies as outliers of the learned context clusters. We demonstrate the benefit of this approach in extensive experiments on specialized medical datasets, outperforming competitive baselines based on self-supervised learning and pretrained models and presenting competitive performance on natural imaging benchmarks.
  • Ryser, Alain; Sutter, Thomas M.; Marx, Alexander; et al. (2024)
    NeurIPS 2024 Workshop: Self-Supervised Learning - Theory and Practice
    Anomaly detection focuses on identifying samples that deviate from the norm. When working with high-dimensional data such as images, a crucial requirement for detecting anomalous patterns is learning lower-dimensional representations that capture concepts of normality. Recent advances in self-supervised learning have shown great promise in this regard. However, many successful self-supervised anomaly detection methods assume prior knowledge about anomalies to create synthetic outliers during training. Yet, in real-world applications, we often do not know what to expect from unseen data, and we can solely leverage knowledge about normal data. In this work, we propose Con2, which learns representations through context augmentations that model invariances of normal data while letting us observe samples from two distinct perspectives. At test time, representations of anomalies that do not adhere to these invariances deviate from the representation structure learned during training, allowing us to detect anomalies without relying on prior knowledge about them.
  • Daunhawer, Imant; Bizeul, Alice; Palumbo, Emanuele; et al. (2023)
    The Eleventh International Conference on Learning Representations (ICLR 2023)
    Contrastive learning is a cornerstone underlying recent progress in multi-view and multimodal learning, e.g., in representation learning with image/caption pairs. While its effectiveness is not yet fully understood, a line of recent work reveals that contrastive learning can invert the data generating process and recover ground truth latent factors shared between views. In this work, we present new identifiability results for multimodal contrastive learning, showing that it is possible to recover shared factors in a more general setup than the multi-view setting studied previously. Specifically, we distinguish between the multi-view setting with one generative mechanism (e.g., multiple cameras of the same type) and the multimodal setting that is characterized by distinct mechanisms (e.g., cameras and microphones). Our work generalizes previous identifiability results by redefining the generative process in terms of distinct mechanisms with modality-specific latent variables. We prove that contrastive learning can block-identify latent factors shared between modalities, even when there are nontrivial dependencies between factors. We empirically verify our identifiability results with numerical simulations and corroborate our findings on a complex multimodal dataset of image/text pairs. Zooming out, our work provides a theoretical basis for multimodal representation learning and explains in which settings multimodal contrastive learning can be effective in practice.
  • Marx, Alexander; Di Stefano, Francesco; Leutheuser, Heike; et al. (2023)
    Frontiers in Pediatrics
    Background: The overarching goal of blood glucose forecasting is to assist individuals with type 1 diabetes (T1D) in avoiding hyper- or hypoglycemic conditions. While deep learning approaches have shown promising results for blood glucose forecasting in adults with T1D, it is not known if these results generalize to children. Possible reasons are physical activity (PA), which is often unplanned in children, as well as the age and development of a child, which both have an effect on the blood glucose level. Materials and Methods: In this study, we collected time series measurements of glucose levels, carbohydrate intake, insulin dosing, and physical activity from children with T1D for one week in an ethics-approved prospective observational study, which included daily physical activities. We investigate the performance of state-of-the-art deep learning methods for adult data—(dilated) recurrent neural networks and a transformer—on our dataset for short-term (30 min) and long-term (2 h) prediction. We propose to integrate static patient characteristics, such as age, gender, BMI, and percentage of basal insulin, to account for the heterogeneity of our study group. Results: Integrating static patient characteristics (SPC) proves beneficial, especially for short-term prediction. LSTMs and GRUs with SPC perform best for a prediction horizon of 30 min (RMSE of 1.66 mmol/l), and a vanilla RNN with SPC performs best across different prediction horizons, although performance decays significantly for long-term prediction. For prediction during the night, the best method improves to an RMSE of 1.50 mmol/l. Overall, the results for our baselines and RNN models indicate that blood glucose forecasting for children conducting regular physical activity is more challenging than for previously studied adult data. Conclusion: We find that integrating static data improves the performance of deep-learning architectures for blood glucose forecasting of children with T1D and achieves promising results for short-term prediction. Despite these improvements, additional clinical studies are warranted to extend forecasting to longer-term prediction horizons.
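The RMSE values reported above (e.g. 1.66 mmol/l) are root-mean-square errors between predicted and measured glucose values. A minimal sketch of the metric on hypothetical readings, not code from the study:

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the
    inputs (here glucose readings, e.g. mmol/l)."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

# hypothetical measured vs. forecast glucose values (mmol/l)
print(rmse([5.0, 6.0, 7.0], [5.5, 5.5, 7.5]))
```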
  • Immer, Alexander; Schultheiss, Christoph; Vogt, Julia E.; et al. (2023)
    Proceedings of the 40th International Conference on Machine Learning (PMLR)
    We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect Y can be written as a function of the cause X and a noise source N independent of X, which may be scaled by a positive function g over the cause, i.e., Y = f(X) + g(X)N. Despite the generality of the model class, we show the causal direction is identifiable up to some pathological cases. To empirically validate these theoretical findings, we propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks. Both model the conditional distribution of Y given X as a Gaussian parameterized by its natural parameters. When the feature maps are correctly specified, we prove that our estimator is jointly concave, and a consistent estimator for the cause-effect identification task. Although the neural network does not inherit those guarantees, it can fit functions of arbitrary complexity, and reaches state-of-the-art performance across benchmarks.
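The LSNM defined above, Y = f(X) + g(X)·N with N independent of X, can be simulated directly. A minimal sketch with illustrative choices f(x) = tanh(x) and g(x) = exp(x/2), which are not from the paper:

```python
import math
import random

def sample_lsnm(n=5, seed=1):
    """Draw (x, y) pairs from a location-scale noise model
    Y = f(X) + g(X) * N with N independent of X.
    f and g here are hypothetical, illustrative choices."""
    rng = random.Random(seed)
    f = math.tanh                  # location (mean) function
    g = lambda x: math.exp(x / 2)  # positive scale function
    data = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)    # independent of x
        data.append((x, f(x) + g(x) * noise))
    return data

for x, y in sample_lsnm():
    print(f"x={x:+.3f}  y={y:+.3f}")
```

Note how the spread of Y around f(X) grows with g(X): the noise scale depends on the cause, which is exactly the heteroscedasticity the model class captures.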
  • Ryser, Alain; Sutter, Thomas; Marx, Alexander; et al. (2024)
    NeurIPS 2024 Workshop on Symmetry and Geometry in Neural Representations
    Anomaly detection focuses on identifying samples that deviate from the norm. When working with high-dimensional data such as images, a crucial requirement for detecting anomalous patterns is learning lower-dimensional representations that capture concepts of normality. Recent advances in self-supervised learning have shown great promise in this regard. However, many successful self-supervised anomaly detection methods assume prior knowledge about anomalies to create synthetic outliers during training. Yet, in real-world applications, we often do not know what to expect from unseen data, and we can solely leverage knowledge about normal data. In this work, we propose Con2, which learns representations through context augmentations that model invariances of normal data while letting us observe samples from two distinct perspectives. At test time, representations of anomalies that do not adhere to these invariances deviate from the representation structure learned during training, allowing us to detect anomalies without relying on prior knowledge about them.
  • Mutti, Mirco; De Santi, Riccardo; Restelli, Marcello; et al. (2024)
    The Twelfth International Conference on Learning Representations
    Posterior sampling allows exploitation of prior knowledge on the environment's transition dynamics to improve the sample efficiency of reinforcement learning. The prior is typically specified as a class of parametric distributions, the design of which can be cumbersome in practice, often resulting in the choice of uninformative priors. In this work, we propose a novel posterior sampling approach in which the prior is given as a (partial) causal graph over the environment's variables. The latter is often more natural to design, such as listing known causal dependencies between biometric features in a medical treatment study. Specifically, we propose a hierarchical Bayesian procedure, called C-PSRL, simultaneously learning the full causal graph at the higher level and the parameters of the resulting factored dynamics at the lower level. We provide an analysis of the Bayesian regret of C-PSRL that explicitly connects the regret rate with the degree of prior knowledge. Our numerical evaluation conducted in illustrative domains confirms that C-PSRL strongly improves the efficiency of posterior sampling with an uninformative prior while performing close to posterior sampling with the full causal graph.
  • Xu, Sascha; Mian, Osman A.; Marx, Alexander; et al. (2022)
    Proceedings of the 39th International Conference on Machine Learning (PMLR)
    We study the problem of identifying cause and effect over two univariate continuous variables X and Y from a sample of their joint distribution. Our focus lies on the setting where the variance of the noise may be dependent on the cause. We propose to partition the domain of the cause into multiple segments where the noise is indeed dependent. To this end, we minimize a scale-invariant, penalized regression score, finding the optimal partitioning using dynamic programming. We show under which conditions this allows us to identify the causal direction for the linear setting with heteroscedastic noise and for the non-linear setting with homoscedastic noise, and empirically confirm that these results generalize to the non-linear and heteroscedastic case. Altogether, the ability to model heteroscedasticity translates into an improved performance in telling cause from effect on a wide range of synthetic and real-world datasets.
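Finding an optimal partitioning of a 1-D domain with a penalized score via dynamic programming, as described above, follows a classic segmentation recurrence. A minimal sketch using per-segment squared error plus a per-segment penalty as an illustrative stand-in for the paper's scale-invariant penalized regression score:

```python
def segment_dp(xs, penalty=1.0):
    """Optimal partition of a sorted 1-D sample into contiguous
    segments, minimizing the sum of per-segment squared error
    plus a fixed penalty per segment (illustrative cost, not the
    paper's actual score)."""
    n = len(xs)

    def sse(i, j):  # squared error of xs[i:j] around its mean
        seg = xs[i:j]
        m = sum(seg) / len(seg)
        return sum((v - m) ** 2 for v in seg)

    best = [0.0] + [float("inf")] * n  # best[j]: cost of xs[:j]
    cut = [0] * (n + 1)                # cut[j]: start of last segment
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + sse(i, j) + penalty
            if c < best[j]:
                best[j], cut[j] = c, i
    # backtrack the segment boundaries
    bounds, j = [], n
    while j > 0:
        bounds.append((cut[j], j))
        j = cut[j]
    return bounds[::-1]

# two clearly separated regimes -> two segments
print(segment_dp([0.0, 0.1, 0.2, 5.0, 5.1, 5.2]))
```

The penalty controls the trade-off between fit and the number of segments: with no penalty, every point becomes its own segment; with a large penalty, the whole domain stays in one.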