Profile
by @danabra.mov
Profile
by @dansshadow.bsky.social
Profile
by @jimpick.com
AviHandle
by @danabra.mov
AviHandle
by @dansshadow.bsky.social
AviHandle
by @katherine.computer
EventsList
by @katherine.computer
ProfileHeader
by @dansshadow.bsky.social
ProfileHeader
by @danabra.mov
ProfileMedia
by @danabra.mov
ProfilePlays
by @danabra.mov
ProfilePosts
by @danabra.mov
ProfilePosts
by @dansshadow.bsky.social
ProfileReplies
by @danabra.mov
Record
by @atsui.org
Skircle
by @danabra.mov
StreamPlacePlaylist
by @katherine.computer
ProfilePosts






🎧 Listen to the episode! 🎬 YouTube: www.youtube.com/watch?v=3QXH... 🎙️ Spotify: open.spotify.com/episode/1aWC... 🍎 Apple: podcasts.apple.com/ca/podcast/1... 📄 Paper: arxiv.org/pdf/2601.11778 #WiAIR #MultilingualAI #LLMs #MachineTranslation #NLProc
🎙️ New #WiAIR Episode Out! In the new #WiAIRpodcast episode with @neuranna.bsky.social, we talk about the relationship between language, thought, and intelligence, with insights from neuroscience, cognitive science, and AI research. 📷 YouTube: youtu.be/e36ryy0Dsdo
After a break, the #WiAIR Women in AI Research Podcast is back! Our next guest is Anna Ivanova @neuranna.bsky.social from Georgia Tech, whose research tackles a fundamental question in AI and cognitive science: 🧠 What is the relationship between language and thought? Don't miss!
1mo
23h
1d
www.youtube.com
YouTube video by Women in AI Research WiAIR
100% Jailbreak Success? The Hard Truth About AI Safety, with Dr. Saadia Gabriel (Part 2)
The paper evaluates 14 LLMs across 5 model families on 9 multilingual benchmarks spanning knowledge, reading comprehension, NLI, commonsense & mathematical reasoning, truthfulness, and regional knowledge. (2/5 🧵)
Women in AI Research - WiAIR
1mo
Neural MT metrics show the strongest alignment with downstream performance. But the proxy has limits: some specialized benchmarks, including MGSM and INCLUDE, show weaker or more variable correlations. Task-specific evaluation remains necessary. (4/5 🧵)