📚 Researcher • 💻 Developer • 🇪🇺 European
PhD student for health-related information retrieval at @uni-jena.de × @webis.de
Jan Heinrich Merker
Decades from now, the Covid-19 pandemic will be visible in the historical data of nearly anything measurable today. Here’s an incomplete collection of charts that capture that break — across the economy, health care, education, work, family life and more.
Happy to share that our paper "The Viability of Crowdsourcing for RAG Evaluation" received the Best Paper Honourable Mention at #SIGIR2025! Very grateful to the community for recognizing our work on improving RAG evaluation.
📄 webis.de/publications...
Honored to win the ICTIR Best Paper Honorable Mention Award for "Axioms for Retrieval-Augmented Generation"!
Our new axioms are integrated with ir_axioms: github.com/webis-de/ir_...
Nice to see axiomatic IR gaining momentum.
We presented two papers at ICTIR 2025 today:
- Axioms for Retrieval-Augmented Generation webis.de/publications...
- Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins webis.de/publications...
We just released "German Commons", the largest openly-licensed German text dataset for LLM training: 154B tokens with clear usage rights for research and commercial use.
huggingface.co/datasets/coral-nlp/german-commons
I've already collected my first two stickers for the @ecir2026.eu Collab-a-thon. If you're in Delft today, don't miss out on the first collab-a-thon session at 4pm in LAB.115 🤝
#collab-a-thon #collaboration #research #ecir
🚨 New Pre-Print! 🚨 Reviewer 2 has once again asked for DL’19, what can you say in rebuttal? To help, we have re-annotated DL’19. Work done with @maik_froebe.bsky.social, @hscells.bsky.social, @fschlatt1.bsky.social, Guglielmo Faggioli, Saber Zerhoudi, @macavaney.bsky.social, Eugene Yang 🧵
Let's replace search with "AI" then! Totally logical if you ask me. Even more worth it when you know they're exponentially overtaking the airline industry in their carbon footprint.
Study: www.cjr.org/tow_center/w...
What a team of keynote speakers. I must confess seeing that Steve Robertson will be there is a thrill. One of the legends of information retrieval reflecting on the field. #sigir2025
sigir2025.dei.unipd.it/keynote-spea...
The New York Times
It can be easy to forget, or look away from, the pain and disruption of the pandemic. The numbers will be there to remind us.
New preprint of WSDM demo by @maik_froebe @matthias and Ferdinand Schlatt
Lightning IR: Straightforward Fine-tuning and Inference of Transformer-based Language Models for Information Retrieval https://arxiv.org/abs/2411.04677
https://webis.de/lightning-ir/
Webis Group
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
The SIGIR 2025 keynotes are held by esteemed speakers: Robertson S., Gurevych I. and Frieder O., who will cover topics that range from AI in medical search and recommendation to BM25 and probabilistic ...
A wide range of transformer-based language models have been proposed for information retrieval tasks. However, including transformer-based models in retrieval pipelines is often complex and requires substantial engineering effort. In this paper, we introduce Lightning IR, an easy-to-use PyTorch Lightning-based framework for applying transformer-based language models in retrieval scenarios. Lightning IR provides a modular and extensible architecture that supports all stages of a retrieval pipeline: from fine-tuning and indexing to searching and re-ranking. Designed to be scalable and reproducible, Lightning IR is available as open-source: https://github.com/webis-de/lightning-ir.