jamesbrandecon.github.io/blog/posts_h...
So after having GPT-5.4 agents review 4,800 proofs and GPT-5.5 give a second review to a random subset of them, what did we find? To my absolute surprise, if anything the rate of errors in proofs has increased(!) in 2025 relative to previous years.
2h
Paul Goldsmith-Pinkham