learning @IFM_MBZUAI, Silicon Valley Lab // 🤘Ph.D. @UTAustin
Hongli Zhan









What does a scientific figure make you wonder? 📊 We introduce MQUD: multimodal Questions Under Discussion for scientific figures. With 1,250 author-annotated questions over 245 figures from 56 papers, MQUD asks what scientific question a figure raises in context.
1mo
In multi-turn conversations, LLMs tend to repeat the same kinds of things over and over. The words may differ, but we found they are the *same discourse moves*! Introducing @hongli-zhan.bsky.social’s new work: novel discourse-level diversity rewards in post-training:
[7/7] This is the last paper of my PhD at UT Austin, wrapping up 5 years of work on emotionally intelligent AI. Huge thanks to my advisor @jessyjli.bsky.social, and co-authors Emma Gueorguieva, Javier Hernandez, Jina Suh, and @desmond-ong.bsky.social!
1mo
Yating Wu
1mo
New paper! 🏁 Last one from my PhD at UT Austin. LLMs sound empathic but repeat the same discourse moves turn after turn, at 2x the rate of humans. We built MINT🌿, the first RL framework for discourse move diversity in empathic dialogue. +25% empathy, −26% repetition. 📄 arxiv.org/abs/2604.11742
[2/7] Ask ChatGPT to comfort someone 10 times. You'll notice it always does the same moves: reflect, validate, suggest. Human counselors don't do this. They adapt -- sometimes they challenge, sometimes they stay quiet, sometimes they share a story.
[3/7] We call this "tactic stickiness" -- when a model locks onto the same empathic moves turn after turn. We formalize it and find: LLMs reuse the same tactic sequences FAR more than human supporters. And standard metrics (BLEU, BERTScore) completely miss it.
[4/7] Our fix: MINT (Multi-turn Inter-tactic Novelty Training). We use GRPO to reward models for diversifying their support tactics across turns, without sacrificing empathy quality.
[5/7] Results: MINT improves empathy by 25% while reducing tactic repetition by 26%. A 4B model trained with MINT surpasses all baselines, including quality-only RL and token-level diversity methods. You need discourse-level signals, not just token-level diversity.
[6/7] Models: huggingface.co/hongli-zhan/... huggingface.co/hongli-zhan/... Code and data: github.com/honglizhan/m...
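For readers who want a concrete picture of "tactic stickiness", here is a toy sketch. It is not the paper's formalization, which this thread does not spell out. It assumes each supporter turn has already been labeled with a list of tactic names, and the function name `tactic_repeat_rate`, the n-gram window, and the example labels are all hypothetical choices made for illustration.

```python
from typing import Sequence


def tactic_repeat_rate(turns: Sequence[Sequence[str]], n: int = 2) -> float:
    """Toy measure (an assumption, not the paper's metric): the fraction of
    tactic n-grams in a dialogue that already appeared in an earlier turn."""
    seen = set()      # n-grams observed in previous turns
    total = 0         # n-grams counted so far
    repeated = 0      # n-grams that were already in `seen`
    for tactics in turns:
        grams = [tuple(tactics[i:i + n]) for i in range(len(tactics) - n + 1)]
        for gram in grams:
            total += 1
            if gram in seen:
                repeated += 1
        # Update only after the turn, so repeats are counted across turns.
        seen.update(grams)
    return repeated / total if total else 0.0


# Hypothetical labeled dialogues: one "sticky", one that varies its moves.
sticky = [["reflect", "validate", "suggest"]] * 3
varied = [
    ["reflect", "validate", "suggest"],
    ["challenge", "share_story"],
    ["validate", "reflect"],
]

print(tactic_repeat_rate(sticky))  # 4 of 6 bigrams repeat an earlier turn
print(tactic_repeat_rate(varied))  # no bigram repeats an earlier turn
```

A score like this only sees tactic labels, not wording, which is the sense in which token-overlap metrics can miss the repetition.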
1k+ downloads each on the MINT empathy models since release 🔥 Encouraging to see the interest in our work! tl;dr: In multi-turn empathic dialogue, LLMs reuse the same discourse moves far more often than humans do; MINT uses RL to diversify them. Give it a try!👇 huggingface.co/hongli-zhan/...
1mo
Jessy Li
Hongli Zhan