//
sign in
Post
by @danabra.mov
PostEmbed
by @danabra.mov
Record
by @jimpick.com
Record
by @atsui.org
+ new component
Post
Our team is stretched thin at the moment! To continue upper-bounding the autonomy of AI agents, and developing evaluations for monitoring AI systems and their propensity to subvert human control, we need more great engineering and research staff. Please apply below or DM me!
4mo
Chris Painter
We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.
4mo
METR