SEA-VL Phase 2 (Anthropogenic Regional Adaptation in Multimodal Vision-Language Model) is now available on arXiv!
Paper: arxiv.org/abs/2604.11490
Project page: seacrowd.org/projects/202...
While the field of vision-language (VL) has achieved remarkable success in integrating visual and textual information across multiple languages and domains, there is still no dedicated framework for a...