Benedikt Knüsel



Last Name: Knüsel
First Name: Benedikt
Organisational unit: 02890 - Albert Einstein School of Public Policy

Search Results

Publications 1 - 9 of 9
  • Muratore, Stefano; Müller, Stefanie; Kulla, Henry; et al. (2016)
    USYS TdLab Fallstudie
  • Knüsel, Benedikt (2020)
    Recent years have seen a dramatic increase in the volumes of data that are produced, stored, and analyzed. This advent of big data has led to commercial success stories, for example in recommender systems in online shops. However, scientific research in various disciplines, including environmental and climate science, will likely also benefit from increasing volumes of data, new sources of data, and the increasing use of algorithmic approaches to analyze these large datasets. This thesis uses tools from philosophy of science to conceptually address epistemological questions that arise in the analysis of these increasing volumes of data in environmental science, with a special focus on data-driven modeling in climate research. Data-driven models, here, are defined as models of phenomena that are built with machine learning. While epistemological analyses of machine learning exist, these have mostly been conducted for fields characterized by a lack of hierarchies of theoretical background knowledge. Such knowledge is often available in environmental science, and especially in physical climate science, and it is relevant for the construction, evaluation, and use of data-driven models. This thesis investigates predictions, uncertainty, and understanding from data-driven models in environmental and climate research and engages in in-depth discussions of case studies. These three topics are discussed in three topical chapters.

    The first chapter addresses the term "big data" as well as rationales and conditions for the use of big-data elements for predictions. It uses a framework for classifying case studies from climate research and shows that "big data" can refer to a range of different activities. Based on this classification, it shows that most case studies lie in between classical domain science and pure big data. The chapter specifies necessary conditions for the use of big data and shows that in most scientific applications, background knowledge is essential to argue for the constancy of the identified relationships. This constancy assumption is relevant both for new forms of measurements and for data-driven models. Two rationales for the use of big-data elements are identified. Big-data elements can help to overcome limitations in financial, computational, or time resources, which is referred to as the rationale of efficiency. They can also help to build models when system understanding does not allow for a more theory-guided modeling approach, which is referred to as the epistemic rationale.

    The second chapter addresses the question of predictive uncertainties of data-driven models. It highlights that existing frameworks for understanding and characterizing uncertainty focus on specific locations of uncertainty, which are not informative for the predictive uncertainty of data-driven models; hence, new approaches are needed for this task. A framework is developed that focuses on the justification of the fitness-for-purpose of a model for the specific kind of prediction at hand. This framework uses argument-based tools and distinguishes between first-order and second-order epistemic uncertainty. First-order uncertainty emerges when it cannot be conclusively justified that the model is maximally fit-for-purpose. Second-order uncertainty emerges when it is unclear to what extent the fitness-for-purpose assumption and its underlying assumptions are justified. The application of the framework is illustrated by discussing a case study of data-driven projections of the impact of climate change on global soil selenium concentrations. The chapter also touches upon how the information emerging from the framework can be used in decision-making.

    The third chapter addresses the question of scientific understanding. A framework is developed for assessing the fitness of a model for providing understanding of a phenomenon. For this, the framework draws from the philosophical literature on scientific understanding and focuses on the representational accuracy, the representational depth, and the graspability of a model. Based on the framework, the fitness of data-driven and process-based climate models for providing understanding of phenomena is compared. It is concluded that data-driven models can, under some conditions, be fit to serve as vehicles for understanding to a satisfactory extent. This is specifically the case when sufficient background knowledge is available such that the coherence of the model with background knowledge, which can be assessed e.g. through sensitivity analyses, provides good reasons for the representational accuracy of the data-driven model. This point is illustrated by discussing a case study from atmospheric physics in which data-driven models are used to better understand the drivers of a specific type of clouds.

    The work of this thesis highlights that while big data is no panacea for scientific research, data-driven modeling offers scientists new tools that can be useful for a variety of questions. All three studies emphasize the importance of background knowledge for the construction and evaluation of data-driven models, as this helps to obtain models that are representationally accurate. The importance of domain-specific background knowledge and the technical challenges of implementing data-driven models for complex phenomena highlight the importance of interdisciplinary work. Previous philosophical work on machine learning has stressed that the problem framing makes models theory-laden. This thesis shows that in a field like climate research, model evaluation is strongly guided by theoretical background knowledge, which is also important for the theory-ladenness of data-driven modeling. The results of the thesis are relevant for a range of methodological questions regarding data-driven modeling and for philosophical discussions of models that go beyond data-driven models.
  • Knüsel, Benedikt (2020)
    Ethics, Policy & Environment
  • Knüsel, Benedikt; Zumwald, Marius; Baumberger, Christoph; et al. (2019)
    Nature Climate Change
    Commercial success of big data has led to speculation that big-data-like reasoning could partly replace theory-based approaches in science. Big data typically has been applied to ‘small problems’, which are well-structured cases characterized by repeated evaluation of predictions. Here, we show that in climate research, intermediate categories exist between classical domain science and big data, and that big-data elements have also been applied without the possibility of repeated evaluation. Big-data elements can be useful for climate research beyond small problems if combined with more traditional approaches based on domain-specific knowledge. The biggest potential for big-data elements, we argue, lies in socioeconomic climate research.
  • Knüsel, Benedikt; Baumberger, Christoph; Zumwald, Marius; et al. (2020)
    Environmental Modelling & Software
    Increasing volumes of data allow environmental scientists to use machine learning to construct data-driven models of phenomena. These models can provide decision-relevant predictions, but confident decision-making requires that the involved uncertainties are understood. We argue that existing frameworks for characterizing uncertainties are not appropriate for data-driven models because of their focus on distinct locations of uncertainty. We propose a framework for uncertainty assessment that uses argument analysis to assess the justification of the assumption that the model is fit for the predictive purpose at hand. Its flexibility makes the framework applicable to data-driven models. The framework is illustrated using a case study from environmental science. We show that data-driven models can be subject to substantial second-order uncertainty, i.e., uncertainty in the assessment of the predictive uncertainty, because they are often applied to ill-understood problems. We close by discussing the implications of the predictive uncertainties of data-driven models for decision-making.
  • Knüsel, Benedikt; Baumberger, Christoph (2020)
    Studies in History and Philosophy of Science Part A
    In climate science, climate models are one of the main tools for understanding phenomena. Here, we develop a framework to assess the fitness of a climate model for providing understanding. The framework is based on three dimensions: representational accuracy, representational depth, and graspability. We show that this framework does justice to the intuition that classical process-based climate models give understanding of phenomena. While simple climate models are characterized by a larger graspability, state-of-the-art models have a higher representational accuracy and representational depth. We then compare the fitness-for-providing understanding of process-based to data-driven models that are built with machine learning. We show that at first glance, data-driven models seem either unnecessary or inadequate for understanding. However, a case study from atmospheric research demonstrates that this is a false dilemma. Data-driven models can be useful tools for understanding, specifically for phenomena for which scientists can argue from the coherence of the models with background knowledge to their representational accuracy and for which the model complexity can be reduced such that they are graspable to a satisfactory extent.
  • Zumwald, Marius; Knüsel, Benedikt; Bresch, David N.; et al. (2021)
    Urban Climate
    Understanding the patterns of urban temperature at high spatial and temporal resolution is of great importance for urban heat adaptation and mitigation. Machine learning offers promising tools for high-resolution modeling of urban heat, but it requires large amounts of data. Measurements from official weather stations are too sparse but can be complemented by crowd-sensed measurements from citizen weather stations (CWS). Here we present an approach to model urban temperature using the quantile regression forest algorithm together with CWS, open government, and remote sensing data. The analysis is based on data from 691 sensors in the city of Zurich (Switzerland) during a heat wave, 25-30 June 2019. We trained the model using hourly data from 25-29 June (n = 71,837) and evaluated it using data from 30 June (n = 14,105). Based on the model, spatiotemporal temperature maps at 10 × 10 m resolution were produced. We demonstrate that our approach can accurately map urban heat at high spatial and temporal resolution without additional measurement infrastructure. We furthermore critically discuss and spatially map estimated prediction and extrapolation uncertainty. Our approach can inform highly localized urban policy and decision-making.
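    The quantile regression forest used in the study above can be approximated with scikit-learn by pooling, for each prediction point, the training targets that fall into the same leaf as that point in every tree, then taking empirical quantiles of the pool. This is a minimal illustrative sketch; the function name and data below are hypothetical and not taken from the paper:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def quantile_forest_predict(forest, X_train, y_train, X_test, quantiles):
        """Quantile predictions from a fitted RandomForestRegressor.

        For each test point, pool the training targets that share a leaf
        with it in every tree, then take empirical quantiles of the pool
        (a simplified quantile regression forest).
        """
        y_train = np.asarray(y_train)
        train_leaves = forest.apply(X_train)   # (n_train, n_trees) leaf indices
        test_leaves = forest.apply(X_test)     # (n_test, n_trees) leaf indices
        preds = np.empty((len(test_leaves), len(quantiles)))
        for i, leaves in enumerate(test_leaves):
            # Pool training targets co-located with this test point in each tree
            pooled = np.concatenate([
                y_train[train_leaves[:, t] == leaves[t]]
                for t in range(train_leaves.shape[1])
            ])
            preds[i] = np.quantile(pooled, quantiles)
        return preds
    ```

    Unlike a plain random forest mean, the pooled quantiles give a prediction interval per location (e.g. the 10th-90th percentile range), which is the kind of output that allows prediction uncertainty to be mapped alongside the temperature estimate.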
  • Zumwald, Marius; Knüsel, Benedikt; Baumberger, Christoph; et al. (2020)
    WIREs Climate Change
    In climate science, observational gridded climate datasets that are based on in situ measurements serve as evidence for scientific claims and they are used to both calibrate and evaluate models. However, datasets only represent selected aspects of the real world, so when they are used for a specific purpose they can be a source of uncertainty. Here, we present a framework for understanding this uncertainty of observational datasets which distinguishes three general sources of uncertainty: (1) uncertainty that arises during the generation of the dataset; (2) uncertainty due to biased samples; and (3) uncertainty that arises due to the choice of abstract properties, such as resolution and metric. Based on this framework, we identify four different types of dataset ensembles—parametric, structural, resampling, and property ensembles—as tools to understand and assess uncertainties arising from the use of datasets for a specific purpose. We advocate for a more systematic generation of dataset ensembles by using these sorts of tools. Finally, we discuss the use of dataset ensembles in climate model evaluation. We argue that a more systematic understanding and assessment of dataset uncertainty is needed to allow for a more reliable uncertainty assessment in the context of model evaluation. The more systematic use of such a framework would be beneficial for both scientific reasoning and scientific policy advice based on climate datasets.
  • Knüsel, Benedikt; Baumberger, Christoph; Knutti, Reto (2023)
    Handbook of the Philosophy of Climate Change
    Climate Research and Big Data (Item type: Book Chapter)
    In recent years, the ability to gather and store information has increased dramatically, as has the ability to make use of these increasing volumes of data. This advent of big data has opened up new opportunities for scientific research, including research on climate change. These changes are associated with a number of interesting philosophical questions, to which this chapter provides an introduction. It starts by clarifying terminological issues concerning "big data" and related terms and by giving an overview of big-data elements that can be found in climate research. Second, it discusses data in climate research, with a focus on new developments regarding the increase in the volume and complexity of climate data and on how the uncertainty of climate datasets may be assessed. Finally, the chapter addresses the topic of machine learning in climate research, specifically the use of machine learning for the data-driven modeling of climate phenomena. The focus of this discussion is on the representational accuracy of data-driven models, how it might be assessed, and what this implies for their use for predictions and for understanding.