So far, it seems like the system is shockingly robust, right? Unfortunately, this is an illusion. First row reconfirms [RT24]: with no SFT poison, it's very hard to poison the PPO'd model (needs >5% poison). But, with just a little SFT poison (0.5%), attack succeeds w/ 3%. 5/n
5d
Gautam Kamath