Post
They're not really reversing anything so much as making their system straight up refuse ML tasks. But the problem is that Anthropic showed their hand. They showed that they can use steering vectors to degrade outputs on specific subjects.
6h
Key 🗝 🦊✅