Journal: Proceedings of Machine Learning Research

Publisher

PMLR

ISSN

2640-3498

Search Results

Publications 1 - 10 of 405
  • Donhauser, Konstantin; Ruggeri, Nicolò; Stojanovic, Stefan; et al. (2022)
    Proceedings of Machine Learning Research ~ Proceedings of the 39th International Conference on Machine Learning
    Good generalization performance on high-dimensional data crucially hinges on a simple structure of the ground truth and a corresponding strong inductive bias of the estimator. Even though this intuition is valid for regularized models, in this paper we caution against a strong inductive bias for interpolation in the presence of noise: While a stronger inductive bias encourages a simpler structure that is more aligned with the ground truth, it also increases the detrimental effect of noise. Specifically, for both linear regression and classification with a sparse ground truth, we prove that minimum ℓp-norm and maximum ℓp-margin interpolators achieve fast polynomial rates close to order 1/n for p > 1, compared to a logarithmic rate for p = 1. Finally, we provide preliminary experimental evidence that this trade-off may also play a crucial role in understanding non-linear interpolating models used in practice.
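The minimum-norm interpolators contrasted in the abstract above have a simple closed form in the ℓ2 case: in overparameterized linear regression, the minimum ℓ2-norm interpolator is given by the Moore-Penrose pseudoinverse. A minimal NumPy sketch (all dimensions and the sparse ground truth are illustrative choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized linear regression: fewer samples (n) than features (d),
# so infinitely many weight vectors interpolate the training data exactly.
n, d = 20, 100
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0               # sparse ground truth, as in the abstract
y = X @ w_true                 # noiseless labels, for illustration only

# The minimum l2-norm interpolator is w = X^+ y (Moore-Penrose pseudoinverse).
w_min_norm = np.linalg.pinv(X) @ y

# It interpolates: X w = y up to numerical precision.
assert np.allclose(X @ w_min_norm, y)

# Any other interpolator differs from it by a null-space direction of X,
# and therefore has a strictly larger l2 norm.
null_dir = np.linalg.svd(X)[2][-1]   # last right singular vector: X @ null_dir ~ 0
w_other = w_min_norm + null_dir      # also interpolates, but with larger norm
```

Since `w_min_norm` lies in the row space of `X` and `null_dir` is orthogonal to it, the norm comparison follows from the Pythagorean theorem.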
  • Curi, Sebastian; Bogunovic, Ilija; Krause, Andreas (2021)
    Proceedings of Machine Learning Research ~ Proceedings of the 38th International Conference on Machine Learning
    In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that are not present during training time. To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations. The robust RL framework addresses this challenge via a worst-case optimization between an agent and an adversary. Previous robust RL algorithms are either sample inefficient, lack robustness guarantees, or do not scale to large problems. We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem while attaining near-optimal sample complexity guarantees. RH-UCRL is a model-based reinforcement learning (MBRL) algorithm that effectively distinguishes between epistemic and aleatoric uncertainty, and efficiently explores both the agent and adversary decision spaces during policy learning. We scale RH-UCRL to complex tasks via neural networks ensemble models as well as neural network policies. Experimentally, we demonstrate that RH-UCRL outperforms other robust deep RL algorithms in a variety of adversarial environments.
  • Language Models as Science Tutors
    Item type: Conference Paper
    Chevalier, Alexis; Geng, Jiayi; Wettig, Alexander; et al. (2024)
    Proceedings of Machine Learning Research ~ Proceedings of the 41st International Conference on Machine Learning
    NLP has recently made exciting progress toward training language models (LMs) with strong scientific problem-solving skills. However, model development has not focused on real-life use-cases of LMs for science, including applications in education that require processing long scientific documents. To address this, we introduce TutorEval and TutorChat. TutorEval is a diverse question-answering benchmark consisting of questions about long chapters from STEM textbooks, written by experts. TutorEval helps measure real-life usability of LMs as scientific assistants, and it is the first benchmark combining long contexts, free-form generation, and multi-disciplinary scientific knowledge. Moreover, we show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval. Therefore, we create TutorChat, a dataset of 80,000 long synthetic dialogues about textbooks. We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These LM tutors specialized in math have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH. Our datasets build on open-source materials, and we release our models, data, and evaluations publicly.
  • Local vs Global continual learning
    Item type: Conference Paper
    Lanzillotta, Giulia; Singh, Sidak Pal; Grewe, Benjamin; et al. (2025)
    Proceedings of Machine Learning Research ~ Proceedings of the 3rd Conference on Lifelong Learning Agents
    Continual learning is the problem of integrating new information in a model while retaining the knowledge acquired in the past. Despite the tangible improvements achieved in recent years, the problem of continual learning is still an open one. A better understanding of the mechanisms behind the successes and failures of existing continual learning algorithms can unlock the development of new successful strategies. In this work, we view continual learning from the perspective of the multi-task loss approximation, and we compare two alternative strategies, namely local and global approximations. We classify existing continual learning algorithms based on the approximation used, and we assess the practical effects of this distinction in common continual learning settings. Additionally, we study optimal continual learning objectives in the case of local polynomial approximations and we provide examples of existing algorithms implementing the optimal objective.
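The "local polynomial approximation" viewpoint in the abstract above can be made concrete with a toy example in the spirit of quadratic-penalty methods such as Elastic Weight Consolidation: the old task's loss is replaced by its second-order expansion around the old optimum. Everything here (losses, curvature values, learning rate) is hypothetical, chosen only to illustrate the idea:

```python
import numpy as np

def loss_new(w):
    # New task: quadratic loss minimized at w = (3, 0). Hypothetical.
    return 0.5 * (w[0] - 3.0) ** 2 + 0.5 * w[1] ** 2

w_star = np.array([1.0, 1.0])    # optimum found on the old task
fisher = np.array([1.0, 1.0])    # diagonal curvature estimate at w_star

def penalized(w, lam=1.0):
    # Local quadratic approximation of the old task's loss around w_star:
    # loss_old(w) ~ const + 0.5 * (w - w_star)^T F (w - w_star).
    return loss_new(w) + 0.5 * lam * np.sum(fisher * (w - w_star) ** 2)

# Gradient descent on the approximated multi-task objective.
w = w_star.copy()
for _ in range(500):
    grad_new = np.array([w[0] - 3.0, w[1]])   # gradient of loss_new
    grad_pen = fisher * (w - w_star)          # gradient of the quadratic penalty
    w -= 0.1 * (grad_new + grad_pen)

# With equal curvatures and lam = 1, the solution balances both tasks and
# converges to the midpoint of the two optima, (2.0, 0.5).
```

The penalty keeps the new solution close to `w_star` along directions the curvature estimate marks as important, which is exactly the "local approximation" strategy the paper contrasts with global ones.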
  • Chen, Le; Ao, Yunke; Tschopp, Florian; et al. (2021)
    Proceedings of Machine Learning Research ~ Proceedings of the 2020 Conference on Robot Learning
    Visual-inertial systems rely on precise calibrations of both camera intrinsics and inter-sensor extrinsics, which typically require manually performing complex motions in front of a calibration target. In this work we present a novel approach to obtain favorable trajectories for visual-inertial system calibration, using model-based deep reinforcement learning. Our key contribution is to model the calibration process as a Markov decision process and then use model-based deep reinforcement learning with particle swarm optimization to establish a sequence of calibration trajectories to be performed by a robot arm. Our experiments show that while maintaining similar or shorter path lengths, the trajectories generated by our learned policy result in lower calibration errors compared to random or handcrafted trajectories. The code is publicly available.
  • Achermann, Florian; Kolobov, Andrey; Dey, Debadeepta; et al. (2020)
    Proceedings of Machine Learning Research ~ Proceedings of the 2020 Conference on Robot Learning
    While optical cameras are ubiquitous in robotics, some robots can sense the world in several sections of the electromagnetic spectrum simultaneously, which can extend their capabilities in fundamental ways. For instance, many fixed-wing UAVs carry both optical and thermal imaging cameras, potentially allowing them to detect temperature difference-induced atmospheric updrafts, map their locations, and adjust their flight path accordingly to increase their time aloft. A key step for unlocking the potential offered by multi-spectral data is generating consistent, multi-spectral maps of the environment. In this work, we introduce MultiPoint, a novel data-driven method for generating interest points and associated descriptors for registering optical and thermal image pairs without knowledge of the relative camera viewpoints. Existing pixel-based alignment methods are accurate but too slow to work in near-real time, while feature-based methods such as SuperPoint are fast but produce poor-quality cross-spectral matches due to interest point instability in thermal images. MultiPoint capitalizes on the strengths of both approaches. An offline mutual information-based procedure is used to align cross-spectral image pairs from a training set, which are then processed by our generalized multi-spectral homographic adaptation stage to generate highly repeatable interest points that are invariant across viewpoint changes in both spectra. These are used to train a MultiPoint deep neural network by exposing this model to both same-spectrum and cross-spectral image pairs. This model is then deployed for fast and accurate online interest point detection. We show that MultiPoint outperforms existing techniques for feature-based image alignment using a dataset of real-world thermal-optical imagery captured by a UAV during flights in different conditions and release this dataset, the first of its kind.
  • Berndt, Marc; Agostini, Andrea; Stocker, Beatrice; et al. (2025)
    Proceedings of Machine Learning Research ~ Proceedings of the 10th Machine Learning for Healthcare Conference
    Accurate extraction of phenotypic information from clinical narratives is essential in diagnostic medicine, yet mapping free-text reports to structured Human Phenotype Ontology (HPO) terms remains challenging. While encoder-only transformer models and small decoder-only generative models are attractive for clinical deployment due to their efficiency and low resource requirements, the former often fail to capture the rich context of clinical texts, and the latter struggle to process lengthy reports effectively. In contrast, larger language models excel at contextual understanding but are impractical for clinical use due to their size, propensity to hallucinate, and privacy concerns associated with non-local inference. To overcome these challenges, we introduce PhenoRAG, a novel retrieval-augmented generation framework that leverages a synthetic database of contextually enriched sentences to augment a lightweight decoder-only model for accurate zero-shot phenotype identification. We demonstrate the capacity of PhenoRAG to capture nuanced contextual clues by 1) evaluating its ability to perform two clinically relevant tasks—guide rare disease diagnosis and facilitate urinary tract infection detection—and 2) validating its performance on a synthetic dataset designed to mimic the challenges of real clinical narratives. Experimental results demonstrate that our lightweight PhenoRAG framework achieves a higher F1-score than both encoder-only transformers and standalone small language models, driven primarily by its high recall. These findings underscore the potential of PhenoRAG as a ready-to-use clinical tool for phenotype identification.
  • Kühne, Marino; Grontas, Panagiotis D.; De Pasquale, Giulia; et al. (2025)
    Proceedings of Machine Learning Research ~ Proceedings of the 42nd International Conference on Machine Learning
    Although social networks have expanded the range of ideas and information accessible to users, they are also criticized for amplifying the polarization of user opinions. Given the inherent complexity of these phenomena, existing approaches to counteract these effects typically rely on handcrafted algorithms and heuristics. We propose an elegant solution: we act on the network weights that model user interactions on social networks (e.g., ranking of users’ shared content in feeds), to optimize a performance metric (e.g., minimize polarization), while users’ opinions follow the classical Friedkin-Johnsen model. Our formulation gives rise to a challenging, large-scale optimization problem with non-convex constraints, for which we develop a gradient-based algorithm. Our scheme is simple, scalable, and versatile, as it can readily integrate different, potentially non-convex, objectives. We demonstrate its merit by: (i) rapidly solving complex social network intervention problems with 4.8 million variables based on the Reddit, LiveJournal, and DBLP datasets; (ii) outperforming competing approaches in terms of both computation time and disagreement reduction.
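The opinion dynamics assumed in the abstract above, the classical Friedkin-Johnsen model, iterate x(t+1) = Λ W x(t) + (I − Λ) s, where s holds innate opinions, W is a row-stochastic interaction matrix, and Λ is a diagonal matrix of susceptibilities. A minimal simulation with toy values (the network and parameters are illustrative, not from the paper):

```python
import numpy as np

# Friedkin-Johnsen opinion dynamics on a toy 3-user network.
n = 3
s = np.array([0.0, 0.5, 1.0])                  # innate opinions
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])                # row-stochastic interaction weights
Lam = np.diag([0.5, 0.5, 0.5])                 # susceptibility to neighbors

# Iterate x(t+1) = Lam W x(t) + (I - Lam) s.
x = s.copy()
for _ in range(200):
    x = Lam @ W @ x + (np.eye(n) - Lam) @ s

# Since the spectral radius of Lam @ W is below 1, the iteration contracts
# and converges to the closed-form equilibrium
#   x* = (I - Lam W)^{-1} (I - Lam) s.
x_star = np.linalg.solve(np.eye(n) - Lam @ W, (np.eye(n) - Lam) @ s)
```

Interventions of the kind the paper studies would act on the entries of `W` (how content is ranked in feeds) to shape the equilibrium `x_star`, e.g. to reduce a polarization metric over it.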
  • Kernelized Synaptic Weight Matrices
    Item type: Conference Paper
    Müller, Lorenz; Martel, Julien; Indiveri, Giacomo (2018)
    Proceedings of Machine Learning Research ~ Proceedings of the 35th International Conference on Machine Learning (ICML 2018)
  • Dirren, Colin; Bianchi, Mattia; Grontas, Panagiotis D.; et al. (2025)
    Proceedings of Machine Learning Research ~ Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
    We study the convex-concave bilinear saddle-point problem min_x max_y f(x) + yᵀAx − g(y), where both, only one, or none of the functions f and g are strongly convex, and suitable rank conditions on the matrix A hold. The solution of this problem is at the core of many machine learning tasks. By employing tools from monotone operator theory, we systematically prove the contractivity (in turn, the linear convergence) of several first-order primal-dual algorithms, including the Chambolle-Pock method. Our approach results in concise proofs, and it yields new convergence guarantees and tighter bounds compared to known results.
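For the saddle-point structure in the abstract above, the Chambolle-Pock method alternates a proximal dual ascent step, a proximal primal descent step, and an extrapolation. A minimal sketch with hypothetical choices f(x) = ½‖x − b‖² and g(y) = ½‖y‖², picked only so that both proximal operators are closed-form and the result can be checked against the optimality conditions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(n)

# Step sizes must satisfy tau * sigma * ||A||^2 < 1.
L = np.linalg.norm(A, 2)
tau = sigma = 0.9 / L

x = np.zeros(n)
y = np.zeros(m)
x_bar = x.copy()

for _ in range(2000):
    # Dual step: y <- prox_{sigma g}(y + sigma A x_bar) with g = 0.5*||y||^2.
    y = (y + sigma * A @ x_bar) / (1.0 + sigma)
    # Primal step: x <- prox_{tau f}(x - tau A^T y) with f = 0.5*||x - b||^2.
    x_new = (x - tau * A.T @ y + tau * b) / (1.0 + tau)
    # Extrapolation with theta = 1.
    x_bar = 2.0 * x_new - x
    x = x_new

# First-order optimality for this choice of f and g:
#   x* = (I + A^T A)^{-1} b   and   y* = A x*.
x_star = np.linalg.solve(np.eye(n) + A.T @ A, b)
```

With both f and g strongly convex, this is the regime where the paper's contractivity results give linear convergence, which is why a few thousand iterations suffice here.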