CS PhD at CU Boulder · NLP · Narratives and Discourse · Knowledge Discovery and Retrieval
www.rohandas.net
Rohan
Paper: arxiv.org/abs/2604.10368
Code: github.com/blast-cu/str...
This work was done with my brilliant colleagues at the BLAST group at CU Boulder.
with @advaitdeshmukh.com, @alexxandria-l.bsky.social, Zohar Naaman, I-Ta Lee, and @mlpacheco.bsky.social
🧵10/10
Main findings:
1. Structured clustering outperforms k-means on frame prediction and cluster purity, indicating our schemas carry high predictive signal.
2. Narrative features based on induced schemas match black box models in predictive performance, while offering greater interpretability.
🧵6/10
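For readers unfamiliar with cluster purity, here is a minimal sketch of the standard definition: each cluster is credited with its most common gold label, and the credited counts are divided by the total number of items. The function name and the toy labels are illustrative assumptions, not taken from the paper's code.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, frame_labels):
    """Fraction of items whose gold label is the majority label of their cluster."""
    by_cluster = defaultdict(list)
    for cluster, frame in zip(cluster_ids, frame_labels):
        by_cluster[cluster].append(frame)
    # For each cluster, count how many members carry its most frequent label.
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in by_cluster.values()
    )
    return majority_total / len(frame_labels)


# Toy data (hypothetical): two clusters, two frame labels.
clusters = [0, 0, 0, 1, 1]
frames = ["Legality", "Legality", "Economic", "Economic", "Economic"]
purity = cluster_purity(clusters, frames)
```

Higher values mean clusters line up more cleanly with a single frame; a score of 1.0 means every cluster contains only one frame label.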
Existing NLP approaches often build narratives bottom-up from extractable atomic units like predicate-argument structures or entities. While highly scalable, these methods seldom capture the evaluative and ideological dimensions central to how meaning is constructed in the media.
🧵2/10
At the domain level, immigration is driven by character portrayals (who is being portrayed and how) while gun control is driven by judicial conflict and institutional enforcement schemas (what is happening), suggesting the two domains are organized around different narrative dimensions.
🧵9/10
We present a framework for unsupervised, domain-agnostic narrative schema induction that scales to large corpora. Our method extracts causal event chains, assigns character roles (Hero/Threat/Victim) to entities, and uses these as constraints in a structured clustering framework.
🧵3/10
The SHAP analysis operates at three levels: instance, frame, and domain.
At the instance level, narrative schema features capture information equivalent to RoBERTa embeddings, while offering more succinct and interpretable representations.
🧵7/10
We evaluated our method on 4,000 news articles across immigration and gun control. Schema quality was validated by a trained linguist, with 94-96% rated high quality, consistently capturing Entman's framing elements. We also conducted a SHAP analysis using induced schemas as features for frame prediction.
🧵5/10
Computational approaches to media narrative analysis either miss nuanced storytelling patterns through coarse-grained analysis, or require domain-specific taxonomies that limit scalability.
We show joint event and character modeling can address this gap. Details in our #ACL2026 (Main) paper.
🧵1/10
Textual similarity alone can group narratives that look alike but represent different framings of the same events. Our structured clustering method uses character-role configurations as cannot-link constraints to ensure narratives with conflicting framings are appropriately separated.
🧵4/10
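To make the cannot-link idea concrete, here is a minimal sketch of one constrained assignment step, in the spirit of COP-KMeans. It is not the paper's implementation. The conflict rule (a shared entity given different roles), the entity names, the vectors, and the centroids are all assumptions made for illustration.

```python
from itertools import combinations


def conflicting(roles_a, roles_b):
    """Assumed rule: two narratives conflict if a shared entity has different roles."""
    shared = roles_a.keys() & roles_b.keys()
    return any(roles_a[entity] != roles_b[entity] for entity in shared)


def cannot_link_pairs(role_configs):
    """Return index pairs (i, j) with i < j that must not share a cluster."""
    return {
        (i, j)
        for i, j in combinations(range(len(role_configs)), 2)
        if conflicting(role_configs[i], role_configs[j])
    }


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def constrained_assign(vectors, centroids, cannot_link):
    """Give each vector the nearest centroid that breaks no cannot-link constraint."""
    labels = []
    for i, vec in enumerate(vectors):
        # Clusters already taken by earlier items that conflict with item i.
        blocked = {labels[j] for j in range(i) if (j, i) in cannot_link}
        ranked = sorted(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        choice = next((c for c in ranked if c not in blocked), None)
        if choice is None:
            raise ValueError(f"no feasible cluster for item {i}")
        labels.append(choice)
    return labels


# Hypothetical toy data: items 0 and 1 are textually close but frame the
# same entity in opposite roles, so they must be separated.
vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
role_configs = [
    {"border_agency": "Hero"},
    {"border_agency": "Threat"},
    {"migrants": "Victim"},
]
centroids = [(0.0, 0.0), (5.0, 5.0)]
constraints = cannot_link_pairs(role_configs)
labels = constrained_assign(vectors, centroids, constraints)
```

Without the constraint, items 0 and 1 would land in the same cluster because their vectors are nearly identical; with it, item 1 is pushed to its next-nearest centroid. A full clustering loop would alternate this step with centroid updates.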
At the frame level, we found narrative signatures vary meaningfully across policy frames. For example, the Legality/Constitutionality frame in gun control is dominated by judicial conflict schemas, suggesting legal framing emphasizes courtroom battles while political actors remain peripheral.
🧵8/10