Vision Transformers are brittle to variable input resolutions. In dense prediction tasks, the standard fix is sliding-window inference with heavy overlap, which is effective but painfully slow. SPAR (Single-Pass Any-Resolution ViT) takes a different approach.

Kevin Kia The Two Doors

The instance-level recognition and generation workshop will be back at @eccv.bsky.social 2026, with excellent keynote speakers, a call for papers, and travel grants for students.


Our work Global-Aware Edge Prioritization for Pose Graph Initialization is a CVPR 2026 oral paper and award candidate.
Oral: Sun, Jun 7, 10:15–11:30 at Bluebird Ballroom (Oral Session 5A: Dynamic Perception)
📎 Poster: Sun, Jun 7, 11:45 AM–1:45 PM at ExHall F (Poster Session 5), Poster #2

Indexing Multimodal Language Models for Large-scale Image Retrieval - CVPR 2026 Findings paper
A multimodal LLM (like Qwen) can estimate image-to-image similarity remarkably well without any task-specific training.
📍 Main conference, Poster #12, ExHall A
📅 Sun 7/6, 7:30–9:00