Inlay

The method in this paper was designed to find an optimal data mixture. But researchers in the human sciences who are training models *in order to understand the effect of the data* might also consider it a clever way to evaluate hundreds of subsets without training hundreds of models. #MLSky

Building/customizing your own LLM? You'll want to curate training data for it, but how do you know what makes the data good? You can try out recipes 👩‍🍳 and iterate on ✨vibes✨, but we can't actually test all possible combos of tweaks,,, right?? 🙅‍♂️ WRONG! arxiv.org/abs/2410.15661 (1/n) 🧵