I felt honoured and was very happy to get an Outstanding Contribution in NLG award from @siggen.bsky.social!
Took lots of nice pictures at my Retroeval retirement symposium. One of my favourites was me with my first and last PhD students, Sandra Williams and Yujun Wang!
Someone asked me what the highlights of my career were, and I responded with a list of papers which I was proud of. I did not mention grants, awards, jobs, etc. I know some people are proudest of their grants (etc), but for me it was always scientific outputs.
I'm looking forward to my retirement workshop, which starts on Monday 1 June!! Will be great to catch up with former students and colleagues, and also discuss NLG evaluation. retroeval.github.io
I remember seeing very dubious advice from OpenAI a few years ago on evaluation. So I was happy to see quite sensible recent advice from Anthropic on evaluation www.anthropic.com/engineering/...
Really interesting scoping review that points out numerous flaws in LLM-as-Judge evaluation in healthcare, including minimal human oversight, absent bias testing, model monoculture, ignored implicit eval components, no check for consistency over time (etc) arxiv.org/abs/2604.25933
I wrote a paper on "NLG Evaluation: Past, Present, Future" for Retroeval. Eval has changed enormously over my career! In future, I expect more on stuff relevant to real-world usage, including impact, qualitative studies, safety in worst/adversarial case arxiv.org/abs/2605.23715
New blog: I am worried by NLP research culture. NLG and NLP are mostly much better in 2026 than when I got my PhD in 1990. Unfortunately research culture has gotten *worse* in this period, which really worries me as I retire. ehudreiter.com/2026/06/08/n...
On Tuesday at the #RetroEval symposium the SIGGEN board awarded Ehud the Inaugural Prize for Outstanding Contributions to Natural Language Generation, future editions of which will bear his name as the Ehud Reiter Prize for Outstanding Contributions to Natural Language Generation. Thank you, Ehud!
We had a fantastic time last week discussing the current challenges in NLG evaluation and celebrating the career of @ehudreiter.bsky.social. Pictures and a few video clips are now available: retroeval.github.io/_pages/media I would like to thank @uniofaberdeen.bsky.social for their support.
SIGGEN & INLG
Saad Mahamood