Kaggle.com - Kaggle is the world's largest data science community with powerful tools and resources to help you achieve your data science goals.
$50,000 prize pool
Entry Deadline: August 25, 2026
Learn more: www.kaggle.com/competitions...
The AI Agent Security - Multi-Step Tool Attacks simulation is now live!
In partnership with OpenAI, Google and IEEE, your challenge is to build an attack algorithm that stress-tests tool-using AI agents in a deterministic offline benchmark.
Show us how you'd take an idea and turn it into a working benchmark. We're picking 5 submissions to win exclusive swag and a social shoutout.
Top of the leaderboard:
🥇 Gemini 3.5 Flash: 80.2%
🥈 Gemini 3 Flash Preview: 79.2%
🥉 Gemini 2.5 Pro: 78.9%
Models must process raw frames as native tokens to locate seconds-long events hidden in an hour of video; accuracy scales logarithmically with frame density.
1H-VideoQA is now available on Kaggle Benchmarks!
Developed by Google DeepMind in 2024 (Antoine Yang) and now updated with the latest SOTA models, 1H-VideoQA is a 101-prompt benchmark for long-context video comprehension and temporal episodic reasoning across hour-long YouTube footage.
How to enter:
1️⃣ Build a task locally with the write-kaggle-benchmarks skill
2️⃣ Push it to Kaggle Benchmarks and run it
3️⃣ Post your Task link plus a screenshot or video of how you did it
Must tag @kaggle. Deadline: July 1.
Last call to sign up! 📢
Registration closes on June 12, 11:59pm PT.
Don't miss this no-cost course featuring Google expert-led theory sessions, hands-on labs, a capstone challenge, and a global community of learners.
Register now:
Get started:
www.kaggle.com/benchmarks?t...
Check out the leaderboard here:
www.kaggle.com/benchmarks/d...