

🌎 paulgavrikov.github.io/visualoverload Joint work with Wei Lin, M. Jehanzeb Mirza, Soumya Jahagirdar, Muhammad Huzaifa, Sivan Doveh, Serena Yeung-Levy, James Glass, Hilde Kuehne.

Do Vision-Language Models (VLMs) actually "see" everything in a crowded room? 🔍 Today at #CVPR2026, we are presenting VisualOverload, our work exploring the critical visual perception bottlenecks of VLMs in dense scenes. 📍 Today (Poster Session 6), 5:30 PM - 7:30 PM, Poster 431 (ExHall A)

Visit our #CVPR2026 poster #179 at 11:50-12:30 to learn about issues and solutions for negation in CLIP. Work led by Fawaz Sammani and Tzoulio Chamiti.

This is the first time I fully vibecoded a tool, and it was impressive how far I got in the little time I invested. Claude (Antigravity) did not "one-shot" this, but the few bugs I found were smaller details. Give it a try! github.com/paulgavrikov...

Meet Slurm Manager: a self-hosted web dashboard for Slurm clusters. Connect via SSH, monitor nodes & jobs in real time, submit scripts, view fairshare quotas — all from your browser. Basically, a handy wrapper over Slurm commands via SSH.


The paper introduces VisualOverload, a new visual question answering (VQA) benchmark designed to test vision-language models (VLMs) on densely populated, detail-rich scenes using public-domain paintin...

paulgavrikov.github.io

VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes

Sure, but except for the desk rejects, there’s no feedback to optimize on. You start reviewing (well or badly) and just keep doing what you did. I think it would be great to provide at least some high-level feedback or scores.

That was our inspiration :)

Will it be shared with reviewers? I think some kind of feedback would be great, especially for first-time reviewers.