learning @IFM_MBZUAI, Silicon Valley Lab // 🤘Ph.D. @UTAustin

Hongli Zhan

What does a scientific figure make you wonder? 📊 We introduce MQUD: multimodal Questions Under Discussion for scientific figures. With 1,250 author-annotated questions over 245 figures from 56 papers, MQUD asks what scientific question a figure raises in context.

1mo

In multi-turn conversation, LLMs tend to repeat the same kinds of things over and over. The words may differ, but we found they are the *same discourse moves*! Introducing @hongli-zhan.bsky.social’s new work: novel discourse-level diversity rewards in post-training:

[7/7] This is the last paper of my PhD at UT Austin, wrapping up 5 years of work on emotionally intelligent AI. Huge thanks to my advisor @jessyjli.bsky.social, and co-authors Emma Gueorguieva, Javier Hernandez, Jina Suh, and @desmond-ong.bsky.social!

Yating Wu

[2/7] Ask ChatGPT to comfort someone 10 times. You'll notice it always makes the same moves: reflect, validate, suggest. Human counselors don't do this. They adapt -- sometimes they challenge, sometimes they stay quiet, sometimes they share a story.

[6/7] Models: huggingface.co/hongli-zhan/... huggingface.co/hongli-zhan/... Code and data: github.com/honglizhan/m...

[3/7] We call this "tactic stickiness" -- when a model locks onto the same empathic moves turn after turn. We formalize it and find: LLMs reuse the same tactic sequences FAR more than human supporters. And standard metrics (BLEU, BERTScore) completely miss it.
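To make the idea of tactic stickiness concrete, here is a minimal sketch of one way to measure reuse of tactic sequences across turns. This is not the paper's formalization: the function name `repetition_rate`, the use of n-grams over tactic labels, and the example labels are all assumptions for illustration.

```python
def repetition_rate(turns, n=2):
    """Fraction of tactic n-grams that already appeared in an earlier turn.

    `turns` is a list of turns; each turn is a list of tactic labels.
    Hypothetical metric for illustration, not the paper's definition.
    """
    seen = set()
    repeated = 0
    total = 0
    for tactics in turns:
        # Contiguous n-grams of tactic labels within this turn.
        grams = [tuple(tactics[i:i + n]) for i in range(len(tactics) - n + 1)]
        for gram in grams:
            total += 1
            if gram in seen:
                repeated += 1
        # Only count reuse across turns, so update after scoring the turn.
        seen.update(grams)
    return repeated / total if total else 0.0


# Assumed example conversations (labels are illustrative).
sticky = [["reflect", "validate", "suggest"]] * 3
varied = [["reflect", "validate"], ["challenge", "story"], ["silence", "reflect"]]

print(repetition_rate(sticky))  # reuses the same bigrams in turns 2 and 3
print(repetition_rate(varied))  # no bigram is reused
```

A surface metric such as BLEU compares words, so two turns with different wording but the same tactic sequence look different to it, while a label-level count like this one treats them as a repeat.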

[4/7] Our fix: MINT (Multi-turn Inter-tactic Novelty Training). We use GRPO to reward models for diversifying their support tactics across turns, without sacrificing empathy quality.
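As a rough sketch of how a diversity signal could be combined with a quality score under GRPO, the snippet below adds a novelty bonus to each sampled response's reward and then normalizes rewards within the sampled group, which is the group-relative advantage step GRPO is known for. The `novelty` definition, the `weight` parameter, and the additive combination are assumptions, not MINT's actual reward.

```python
import statistics


def novelty(prev_turns, new_tactics):
    """Share of tactics in the new turn not used in any earlier turn (assumed definition)."""
    if not new_tactics:
        return 0.0
    used = {t for turn in prev_turns for t in turn}
    return sum(t not in used for t in new_tactics) / len(new_tactics)


def combined_reward(quality, nov, weight=0.5):
    """Quality score plus a weighted novelty bonus (hypothetical combination)."""
    return quality + weight * nov


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within its sampled group.

    Uses the population standard deviation; implementations differ on this detail.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Assumed example: three sampled replies to the same dialogue history.
history = [["reflect", "validate"]]
candidates = [
    (0.8, ["reflect", "validate"]),   # high quality, no new tactics
    (0.8, ["reflect", "challenge"]),  # same quality, one new tactic
    (0.7, ["story", "challenge"]),    # slightly lower quality, all new tactics
]
rewards = [combined_reward(q, novelty(history, t)) for q, t in candidates]
print(group_relative_advantages(rewards))
```

With equal quality scores, the reply that introduces a new tactic ends up with the higher advantage, so the policy update favors it over the repetitive one.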

New paper! 🏁 Last one from my PhD at UT Austin. LLMs sound empathic but repeat the same discourse moves turn after turn — at 2x the rate of humans. We built MINT🌿, the first RL framework for discourse move diversity in empathic dialogue. +25% empathy, −26% repetition. 📄 arxiv.org/abs/2604.11742

[5/7] Results: MINT improves empathy by 25% while reducing tactic repetition by 26%. A 4B model trained with MINT surpasses all baselines, including quality-only RL and token-level diversity methods. You need discourse-level signals, not just token-level diversity.

1k+ downloads each on the MINT empathy models since release 🔥 Encouraging to see the interest in our work! tl;dr: In multi-turn empathic dialogue, LLMs reuse the same discourse moves far more often than humans do; MINT uses RL to diversify them. Give it a try!👇 huggingface.co/hongli-zhan/...

Jessy Li
