Ultimately, our results reveal a key limitation of current crop-based representation learning models for single-cell analysis, and underscore the need for methods that learn background-invariant single-cell features.
The background sensitivity of models has consequences for reliable single-cell analysis.
When background cells are masked, predicted localization distributions for multi-localized proteins, or proteins where localization changes from cell to cell, differ substantially for ~15% of proteins.
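One way to make "distributions differ" concrete, as a rough sketch only: summarize each protein's per-cell predicted classes as a distribution, then measure the gap between the unaltered and masked conditions. The total variation distance used here is an assumption chosen for illustration, not necessarily the metric used in the paper, and the function names are hypothetical.

```python
import numpy as np

def localization_distribution(predicted_labels, n_classes):
    """Fraction of a protein's cells predicted in each localization class."""
    counts = np.bincount(np.asarray(predicted_labels), minlength=n_classes)
    return counts / counts.sum()

def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

# Hypothetical per-cell predictions for one protein under two conditions.
unaltered = localization_distribution([0, 0, 1, 1], n_classes=3)
masked = localization_distribution([0, 0, 0, 1], n_classes=3)
distance = total_variation(unaltered, masked)
```

A protein would then count as "substantially different" when this distance exceeds some chosen threshold; the threshold itself is left open here.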
Deep learning methods are widely used to analyze microscopy images. But many operate on crops of images that, while centered around a single cell, may include other cells in the background, raising the question:
Are these models truly capturing single-cell features?
We focus on representation learning models trained to capture protein localization, the subcellular component within the cell where a protein resides, in yeast cells.
Arushi Gupta
This matters because understanding single-cell variability is key in biology. If models are altering their behavior depending on background, they are not robust for biological analyses. With @alexijie.bsky.social and Alan Moses, we introduce a framework to systematically test for this.
We find that three leading single-cell models—PIFiA, Paired Cell Inpainting, and DeepLoc—are sensitive to background context.
Classification accuracy drops by up to 15.8% when the background cells' localization differs from the center cell's, and improves when they match.
To assess if background alters model behavior, we evaluate five settings where the center cell is fixed while the background is either unaltered, swapped with backgrounds from the same or different protein localization classes, replaced with a cell-free background, or masked out.
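The five settings can be sketched roughly as follows. This is a minimal illustration, not the released code: it assumes each crop is a NumPy array, that a boolean mask of the center cell is available, and that donor crops for the swaps have the same shape. All names (`make_background_variants`, `center_mask`, and so on) are hypothetical.

```python
import numpy as np

def make_background_variants(crop, center_mask, same_class_crop,
                             diff_class_crop, cell_free_crop):
    """Build five crops that share the same center cell but differ in background."""
    mask = np.asarray(center_mask, dtype=bool)

    def composite(background):
        # Paste the fixed center cell onto a copy of the donor background.
        out = background.copy()
        out[mask] = crop[mask]
        return out

    # Masked setting: keep only the center cell's pixels, zero everything else.
    masked = np.zeros_like(crop)
    masked[mask] = crop[mask]

    return {
        "unaltered": crop.copy(),
        "same_class_swap": composite(same_class_crop),
        "diff_class_swap": composite(diff_class_crop),
        "cell_free": composite(cell_free_crop),
        "masked": masked,
    }

# Toy 4x4 example with a 2x2 center cell.
crop = np.arange(16, dtype=float).reshape(4, 4)
center_mask = np.zeros((4, 4), dtype=bool)
center_mask[1:3, 1:3] = True
variants = make_background_variants(
    crop, center_mask,
    same_class_crop=np.full((4, 4), 100.0),
    diff_class_crop=np.full((4, 4), 200.0),
    cell_free_crop=np.full((4, 4), 5.0),
)
```

Feeding each variant to the same model and comparing predictions for the center cell then isolates the effect of the background.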
This work began as a summer internship at @msftresearch.bsky.social New England. Incredibly thankful to mentors @alexijie.bsky.social and Alan Moses, whose deep involvement and guidance made this possible!
📄: www.biorxiv.org/content/10.1...
💻: tinyurl.com/microscopy-c...
Deep learning models are widely used to extract feature representations from microscopy images. While these models are used for single-cell analyses, such as studying single-cell heterogeneity, they t...