MSCA AI Postdoctoral Fellow @ Visual Recognition Group, Czech Technical University in Prague. Photographer. Crossfit freak.
Prague, CZ. https://billpsomas.github.io/
Bill Psomas @CVPR
🚨 Presenting today at #CVPR2026!
Our highlight ⭐ paper Retrieve and Segment (RNS) asks a simple question:
Can a few examples bridge the supervision gap in open-vocabulary segmentation?
Turns out they can.
Poster #578
⏰ 16:45–18:45
arxiv.org/abs/2602.23339
Bill Psomas @CVPR
@c1rcuslegend.bsky.social is presenting our work at #CVPR2026 Findings. Find the poster with the most [emoji].
Friday, 07:00–08:30
ExHall A; Poster #166
Stop by if you're curious whether MLLMs make good classifiers or to discuss our "Doomed to Reannotate" ImageNet project (preprint coming soon).
Vision Transformers are brittle to variable input resolutions. In dense prediction tasks, the standard fix is sliding-window inference with heavy overlap, which is effective but painfully slow.
SPAR (Single-Pass Any-Resolution ViT) takes a different approach.
The instance-level recognition and generation workshop will be back at @eccv.bsky.social 2026, with excellent keynote speakers, a call for papers, and travel grants for students.
Indexing Multimodal Language Models for Large-scale Image Retrieval - CVPR 2026 Findings paper
A multimodal LLM (like Qwen) can estimate image-to-image similarity remarkably well without any task-specific training.
Main conference: Poster #12, ExHall A
Sun 7/6, 7:30–9:00