
External validation of CoPE’s performance is always cool to see. We stress-test and eval extensively ourselves, of course, but it never quite feels real until it’s in someone else’s hands.

I am NOT an AI engineer or AI researcher, but I tried a little evaluation of CoPE-B vs CoPE-A vs gpt-oss-safeguard: github.com/roostorg/mod... lmk what you think. We'd love for more evaluations to be part of the ROOST Model Community! cc @samidh.bsky.social