Post
We just released "German Commons", the largest openly-licensed German text dataset for LLM training: 154B tokens with clear usage rights for research and commercial use. huggingface.co/datasets/coral-nlp/german-commons
7mo
coral-nlp/german-commons · Datasets at Hugging Face
Webis Group