Our very comprehensive evaluations show: ✅ Significant improvement on harmful refusal accuracy compared to the abliterated and instruct (IT) models (Table 1) ✅ Minimal compromise on benign compliance & general abilities (see Table 2 in the text).
Jun 12, 2025
Mickel Liu