at://
/
app.bsky.feed.post
/
3m52kvdb2os22
sign in
All
4
Record
2
Post
1
PostEmbed
1
Post
by @danabra.mov
PostEmbed
by @danabra.mov
Record
by @jimpick.com
Record
by @atsui.org
+ new component
Post
Can LLMs accurately aggregate information over long, information-dense texts? Not yet… We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
7mo
Amanda Bertsch