Surprising: frontier models (Claude, ChatGPT, DeepSeek V4) produce the most predictable text of anything I've tested, more predictable than any local AI model or human writing.
Not surprising: Finnegans Wake is off the charts, by far the least predictable (Shannon & Lydia Liu proved right), and Hemingway is the most predictable human writer.
Ryan Heuser
Shannon estimated the information rate of English at ~1 bit per character. A byte-level LLM, used here to measure next-character predictability in LLM & human text (diaries, abstracts, dreams, fiction), finds that aligned models produce sub-English information rates & that LLM text is more predictable than human text.
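To make "bits per character" concrete, here is a rough sketch of the measurement. It is not the byte-level LLM used for these results (the post doesn't name it): as a stand-in, it uses a tiny adaptive byte model that predicts each byte from the previous few and sums the surprisal, -log2(p), of what actually came next. The function name `bits_per_char` and the parameters `order` and `alpha` are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def bits_per_char(text: str, order: int = 2, alpha: float = 1.0) -> float:
    """Average surprisal, in bits per byte, under a toy adaptive byte model.

    Assumption: this order-`order` count model is only a stand-in for a real
    byte-level LLM. Each byte is predicted from the preceding `order` bytes
    using counts seen so far (add-`alpha` smoothing over 256 byte values),
    and the model is updated only after the byte has been scored.
    """
    data = text.encode("utf-8")
    if not data:
        raise ValueError("text must be non-empty")

    counts = defaultdict(Counter)  # context bytes -> counts of the next byte
    totals = Counter()             # context bytes -> times the context was seen
    total_bits = 0.0

    for i, byte in enumerate(data):
        context = data[max(0, i - order):i]
        # Smoothed probability the model assigned to the byte that occurred.
        p = (counts[context][byte] + alpha) / (totals[context] + alpha * 256)
        total_bits += -math.log2(p)
        # Update after scoring, so the model never peeks at the answer.
        counts[context][byte] += 1
        totals[context] += 1

    return total_bits / len(data)


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog "
    print("one sentence:     ", round(bits_per_char(sentence), 2))
    print("same sentence x50:", round(bits_per_char(sentence * 50), 2))
```

Lower numbers mean more predictable text: the repeated sentence scores well below the single one because the model has already seen every context. A toy model like this stays far above Shannon's ~1 bit per character on real prose; getting near that figure takes a much stronger predictor, which is the role the byte-level LLM plays in the measurements above.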