Could an AI company lose control of its own agents? To find out, Anthropic, Google, Meta, and OpenAI let us (1) test their best internal models with CoT access, (2) review non-public info about capabilities, alignment, and control. The result: our first Frontier Risk Report.

We surveyed 349 technical researchers, engineers, and managers (in February–April 2026) about how they use AI tools at work. On average, participants self-report that AI use made their work 1.6–2.1x more valuable, and that this multiplier will grow over time.

Cool profile of @metr.org’s work in the NYT today! Particularly like this from my colleague Ajeya: “METR is an organization that asks... what we think would be most valuable for the world to know about A.I. and its risks, and then the answers are what they are.” www.nytimes.com/2026/04/17/t....

More on this idea here: metr.org/blog/2024-11...

Worth reading this in full. I come in skeptical, but this basically is a claim that an AI system at Alibaba attempted autonomous replication without human intervention. This excerpt was found and highlighted by Alexander Long. Full paper here: arxiv.org/abs/2512.24873

We’re correcting a mistake in our modeling that inflated recent 50%-time horizons by 10–20% (and reduced 80%-horizons). We inappropriately penalized steepness in task-length→success curve fits. This most affects the oldest and newest models, whose fits are less data-constrained.

Since early 2025, we've been studying how AI tools impact productivity among developers. Previously, we found a 20% slowdown. That finding is now outdated. Speedups now seem likely, but changes in developer behavior make our new results unreliable. We’re working to address this.

metr.org/careers

Our team is stretched thin at the moment! To continue upper-bounding the autonomy of AI agents, and developing evaluations for monitoring AI systems and their propensity to subvert human control, we need more great engineering and research staff. Please apply below or DM me!

Groundhog Day is a very METR-y holiday. Small animal emerges from a cave for only a moment, shares a forecast about timelines that's somewhat difficult to interpret, and then retreats into his cave for another year.