Dylan, on life in your 80s, is incredible.
@nytimes.com
www.nytimes.com/2026/06/14/o...
We don’t always know what problems are hard for LLMs. So devs evaluate on tasks HUMANS find hard or on broad benchmarks. What if we could instead anticipate which scenarios a model will fail on—all without evaluating specific input examples?
🧵 NEW PAPER by @jenniferlumeng.bsky.social
Don’t ban kids using YouTube.
Make YouTube behave like a broadcaster.
Pay creators. Reinvest in the creative industries. Pay fair tax back into creators’ economies. Follow guidelines on fair usage and copyright. Play fair.
It’s bigger than TV now. Make it behave like TV.