Inlay

by @danabra.mov

by @danabra.mov

by @jimpick.com

Post

We expose critical flaws in existing negation benchmarks, introduce a new MLLM-as-a-judge evaluation, and show that simple task vector steering can massively boost negation performance! 🚀