Scaling LLM Reasoning with EGGROLL 🥚🧠📝
Finetuning RWKV-7 language models with 🥚 outperforms GRPO on Countdown and GSM8K ❗ On the Countdown task, 🥚 reached 35% validation accuracy compared to GRPO's 23% ❗