Perturbation-aware representation learning for <i>in vivo</i> genetic screens
OPEN ACCESS
Loading...
Author / Producer
Date
2025-10-15
Publication Type
Working Paper
ETH Bibliography
yes
Citations
Altmetric
OPEN ACCESS
Data
Abstract
CRISPR-based genetic perturbation screens paired with single-cell transcriptomic readouts (Perturb-seq) offer a powerful tool for interrogating biological systems. Yet the resulting datasets are heterogeneous—particularly in vivo—and currently used cell-level perturbation labels reflect only CRISPR guide RNA exposure rather than perturbation state; further, many perturbations have a minimal effect on gene expression. For perturbations that do alter the transcriptomic state of cells, intracellular guide RNA abundance exhibits a dose-response association with perturbation efficacy. We combine (i) per-perturbation, expression-only classifiers trained with non-negative negative–unlabeled (nnNU) risk to yield calibrated scores reflecting the perturbation state of single cells and (ii) a monotone guide abundance prior to yield a per-cell pseudo-posterior that supports both assignment of perturbation probability and selection of affected gene features. To obtain a low-dimensional representation that allows for the accurate reconstruction of gene-level marginals for counterfactual decoding, we train an autoencoder with a quantile–hurdle reconstruction loss and feature-weighted emphasis on perturbation-affected genes. The result is a perturbation-aware latent embedding amenable to downstream trajectory modeling (e.g., optimal transport or flow matching) and a principled probability of perturbation for each non-control cell derived jointly from its guide counts and transcriptome.
Permanent link
Publication status
published
External links
Editor
Book title
Journal / series
Volume
Pages / Article No.
Publisher
Cold Spring Harbor Laboratory
Event
Edition / version
v1
Methods
Software
Geographic location
Date collected
Date created
Subject
Organisational unit
02150 - Dep. Informatik / Dep. of Computer Science