Post
Most brain-encoding pipelines pull vision, audio, and language features from separate models, then fuse them late, at the readout. But modern foundation models fuse modalities during pretraining. Which kind of fusion is actually more brain-relevant?