at:///app.bsky.feed.post/3lrf5wci3fr2z
Post
Our very comprehensive evaluations show:
- Significant improvement on harmful refusal accuracy compared to the abliterated and instruct (IT) models (Table 1)
- Minimal compromise on benign compliance & general abilities (see Table 2 in the text).
Jun 12, 2025
Mickel Liu