Journal: Journal of the American Statistical Association
Abbreviation
J. Am. Stat. Assoc
Publisher
Taylor & Francis
17 results
Search Results
Publications 1 - 10 of 17
- Hierarchical Testing in the High-Dimensional Setting With Correlated Variables
Item type: Journal Article
Journal of the American Statistical Association
Mandozzi, Jacopo; Bühlmann, Peter (2016)
- Iterative Methods for Vecchia-Laplace Approximations for Latent Gaussian Process Models
Item type: Journal Article
Journal of the American Statistical Association
Kündig, Pascal; Sigrist, Fabio (2025)
Latent Gaussian process (GP) models are flexible probabilistic nonparametric function models. Vecchia approximations are accurate approximations for GPs to overcome computational bottlenecks for large data, and the Laplace approximation is a fast method with asymptotic convergence guarantees to approximate marginal likelihoods and posterior predictive distributions for non-Gaussian likelihoods. Unfortunately, the computational complexity of combined Vecchia-Laplace approximations grows faster than linearly in the sample size when used in combination with direct solver methods such as the Cholesky decomposition. Computations with Vecchia-Laplace approximations can thus become prohibitively slow precisely when the approximations are usually the most accurate, that is, on large datasets. In this article, we present iterative methods to overcome this drawback. Among other things, we introduce and analyze several preconditioners, derive new convergence results, and propose novel methods for accurately approximating predictive variances. We analyze our proposed methods theoretically and in experiments with simulated and real-world data. In particular, we obtain a speed-up of an order of magnitude compared to Cholesky-based calculations and a 3-fold increase in prediction accuracy in terms of the continuous ranked probability score compared to a state-of-the-art method on a large satellite dataset. All methods are implemented in a free C++ software library with high-level Python and R packages. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
- Space-Time Extremes of Severe U.S. Thunderstorm Environments
Item type: Journal Article
Journal of the American Statistical Association
Koh, Jonathan; Koch, Erwan; Davison, Anthony C. (2025)
Severe thunderstorms cause substantial economic and human losses in the United States. Simultaneous high values of convective available potential energy (CAPE) and storm relative helicity (SRH) are favorable to severe weather, and both they and the composite variable PROD = √CAPE × SRH can be used as indicators of severe thunderstorm activity. Their extremal spatial dependence exhibits temporal non-stationarity due to seasonality and large-scale atmospheric signals such as El Niño-Southern Oscillation (ENSO). In order to investigate this, we introduce a space-time model based on a max-stable, Brown–Resnick, field whose range depends on ENSO and on time through a tensor product spline. We also propose a max-stability test based on empirical likelihood and the bootstrap. The marginal and dependence parameters must be estimated separately owing to the complexity of the model, and we develop a bootstrap-based model selection criterion that accounts for the marginal uncertainty when choosing the dependence model. In the case study, the out-sample performance of our model is good. We find that extremes of PROD, CAPE, and SRH are generally more localized in summer and, in some regions, less localized during El Niño and La Niña events, and give meteorological interpretations of these phenomena. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
- Higher-Order Least Squares: Assessing Partial Goodness of Fit of Linear Causal Models
Item type: Journal Article
Journal of the American Statistical Association
Schultheiss, Christoph; Bühlmann, Peter; Yuan, Ming (2024)
We introduce a simple diagnostic test for assessing the overall or partial goodness of fit of a linear causal model with errors being independent of the covariates. In particular, we consider situations where hidden confounding is potentially present. We develop a method and discuss its capability to distinguish between covariates that are confounded with the response by latent variables and those that are not. Thus, we provide a test and methodology for partial goodness of fit. The test is based on comparing a novel higher-order least squares principle with ordinary least squares. In spite of its simplicity, the proposed method is extremely general and is also proven to be valid for high-dimensional settings. Supplementary materials for this article are available online.
- Toward Causal Inference for Spatio-Temporal Data: Conflict and Forest Loss in Colombia
Item type: Journal Article
Journal of the American Statistical Association
Christiansen, Rune; Baumann, Matthias; Kuemmerle, Tobias; et al. (2022)
How does armed conflict influence tropical forest loss? For Colombia, both enhancing and reducing effect estimates have been reported. However, a lack of causal methodology has prevented establishing clear causal links between these two variables. In this work, we propose a class of causal models for spatio-temporal stochastic processes which allows us to formally define and quantify the causal effect of a vector of covariates X on a real-valued response Y. We introduce a procedure for estimating causal effects and a nonparametric hypothesis test for these effects being zero. Our application is based on geospatial information on conflict events and remote-sensing-based data on forest loss between 2000 and 2018 in Colombia. Across the entire country, we estimate the effect to be slightly negative (conflict reduces forest loss) but insignificant (P = 0.578), while at the provincial level, we find both positive effects (e.g., La Guajira, P = 0.047) and negative effects (e.g., Magdalena, P = 0.004). The proposed methods do not make strong distributional assumptions, and allow for arbitrarily many latent confounders, given that these confounders do not vary across time. Our theoretical findings are supported by simulations, and code is available online.
- Partition MCMC for Inference on Acyclic Digraphs
Item type: Journal Article
Journal of the American Statistical Association
Kuipers, Jack; Moffa, Giusi (2017)
- Covariate-Informed Latent Interaction Models: Addressing Geographic & Taxonomic Bias in Predicting Bird-Plant Interactions
Item type: Journal Article
Journal of the American Statistical Association
Papadogeorgou, Georgia; Bello, Carolina; Ovaskainen, Otso; et al. (2023)
Reductions in natural habitats urge that we better understand species' interconnection and how biological communities respond to environmental changes. However, ecological studies of species' interactions are limited by their geographic and taxonomic focus which can distort our understanding of interaction dynamics. We focus on bird-plant interactions that refer to situations of potential fruit consumption and seed dispersal. We develop an approach for predicting species' interactions that accounts for errors in the recorded interaction networks, addresses the geographic and taxonomic biases of existing studies, is based on latent factors to increase flexibility and borrow information across species, incorporates covariates in a flexible manner to inform the latent factors, and uses a meta-analysis dataset from 85 individual studies. We focus on interactions among 232 birds and 511 plants in the Atlantic Forest, and identify 5% of pairs of species with an unrecorded interaction, but posterior probability that the interaction is possible over 80%. Finally, we develop a permutation-based variable importance procedure for latent factor network models and identify that a bird's body mass and a plant's fruit diameter are important in driving the presence of species interactions, with a multiplicative relationship that exhibits both a thresholding and a matching behavior. Supplementary materials for this article are available online.
- Graphical Model Selection for Gaussian Conditional Random Fields in the Presence of Latent Variables
Item type: Journal Article
Journal of the American Statistical Association
Frot, Benjamin; Jostins, Luke; McVean, Gilean (2019)
We consider the problem of learning a conditional Gaussian graphical model in the presence of latent variables. Building on recent advances in this field, we suggest a method that decomposes the parameters of a conditional Markov random field into the sum of a sparse and a low-rank matrix. We derive convergence bounds for this estimator and show that it is well-behaved in the high-dimensional regime as well as “sparsistent” (i.e., capable of recovering the graph structure). We then show how proximal gradient algorithms and semi-definite programming techniques can be employed to fit the model to thousands of variables. Through extensive simulations, we illustrate the conditions required for identifiability and show that there is a wide range of situations in which this model performs significantly better than its counterparts, for example, by accommodating more latent variables. Finally, the suggested method is applied to two datasets comprising individual level data on genetic variants and metabolites levels. We show our results replicate better than alternative approaches and show enriched biological signal. Supplementary materials for this article are available online.
- Model-Based Causal Feature Selection for General Response Types
Item type: Journal Article
Journal of the American Statistical Association
Kook, Lucas; Saengkyongam, Sorawit; Lundborg, Anton Rask; et al. (2025)
Discovering causal relationships from observational data is a fundamental yet challenging task. Invariant causal prediction (ICP, Peters, Bühlmann, and Meinshausen) is a method for causal feature selection which requires data from heterogeneous settings and exploits that causal models are invariant. ICP has been extended to general additive noise models and to nonparametric settings using conditional independence tests. However, the latter often suffer from low power (or poor Type I error control) and additive noise models are not suitable for applications in which the response is not measured on a continuous scale, but reflects categories or counts. Here, we develop transformation-model (tram) based ICP, allowing for continuous, categorical, count-type, and uninformatively censored responses (these model classes, generally, do not allow for identifiability when there is no exogenous heterogeneity). As an invariance test, we propose tram-GCM based on the expected conditional covariance between environments and score residuals with uniform asymptotic level guarantees. For the special case of linear shift trams, we also consider tram-Wald, which tests invariance based on the Wald statistic. We provide an open-source R package tramicp and evaluate our approach on simulated data and in a case study investigating causal features of survival in critically ill patients. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
- Spatial Modeling and Future Projection of Extreme Precipitation Extents
Item type: Journal Article
Journal of the American Statistical Association
Zhong, Peng; Brunner, Manuela; Opitz, Thomas; et al. (2025)
Extreme precipitation events with large spatial extents may have more severe impacts than localized events as they can lead to widespread flooding. It is debated how climate change may affect the spatial extent of precipitation extremes, whose investigation often directly relies on simulations of precipitation from climate models. Here, we use a different strategy to investigate how future changes in spatial extents of precipitation extremes differ across climate zones and seasons in two river basins (Danube and Mississippi). We rely on observed precipitation extremes while exploiting a physics-based average-temperature covariate, enabling us to project future precipitation extents based on projected temperatures. We include the covariate into newly developed time-varying r-Pareto processes using suitably chosen spatial risk functionals r. This model captures temporal non-stationarity in the spatial dependence structure of precipitation extremes by linking it to the temperature covariate, derived from reanalysis data (ERA5-Land) for model calibration and from bias-corrected climate simulations (CMIP6) for projections. Our results show an increasing trend in the margins, with both significantly positive or negative trend coefficients depending on season and river (sub-)basin. During major rainy seasons, the significant trends indicate that future spatial extreme events will become relatively more intense and localized in several sub-basins. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.