MSCA AI Postdoctoral Fellow @ Visual Recognition Group, Czech Technical University in Prague. Photographer. Crossfit freak. 📍Prague, CZ. 🔗 https://billpsomas.github.io/

Bill Psomas @CVPR

🚨 Presenting today at #CVPR2026! Our highlight ⭐ paper Retrieve and Segment (RNS) asks a simple question: Can a few examples bridge the supervision gap in open-vocabulary segmentation? ✅ Turns out they can. 📍 Poster #578 ⏰ 16:45–18:45 📄 arxiv.org/abs/2602.23339

8d

Bill Psomas @CVPR

📍 @c1rcuslegend.bsky.social is presenting our work at #CVPR2026 Findings, find the poster with the most 🐈🐾 🗓 Friday, 07:00–08:30 📌 ExHall A; Poster #166 Stop by if you're curious whether MLLMs make good classifiers or to discuss our "Doomed to Reannotate" ImageNet project (preprint coming soon) 📝

11d

Vision Transformers are brittle to variable input resolutions. In dense prediction tasks, the standard fix is sliding-window inference with heavy overlap, which is effective but painfully slow. SPAR (Single-Pass Any-Resolution ViT) takes a different approach.

The instance-level recognition and generation workshop will be back at @eccv.bsky.social 2026, with excellent keynote speakers, a call for papers, and travel grants for students.

Indexing Multimodal Language Models for Large-scale Image Retrieval (CVPR 2026 Findings paper). A multimodal LLM (like Qwen) can estimate image-to-image similarity remarkably well without any task-specific training. 📍 Main conference, Poster #12, ExHall A 📅 Sun 7/6, 7:30–9:00