1. Can you stop companies from training generative AI using your data? No, not currently.
2. Is this dataset meant for training generative AI? 🤷‍♀️ but more likely for research and statistical analysis.
3. Is it ok to duplicate and distribute people’s data without agency to opt out? I’d argue no.
Nov 27, 2024
Maria Antoniak
First dataset for the new @huggingface.bsky.social @bsky.app community organisation: one-million-bluesky-posts 🦋
📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect to experiment with using ML for Bluesky
🤗 huggingface.co/datasets/blu...
Nov 26, 2024
bluesky-community/one-million-bluesky-posts · Datasets at Hugging Face
huggingface.co
Daniel van Strien