Chasing digital badness. Senior Researcher at Citizen Lab, but words here are mine.
John Scott-Railton
NEW: malware developers added nuclear & biological weapons text to their spyware.
Goal? To trigger LLM safety refusals... so that their spyware wouldn't be analyzed by an AI security scanner.
Cleanest practical example I can think of for why over-indexing on first-order "safety" is risky. 1/
John Scott-Railton
Does anyone want to check this against Fable 5 and tell me what they get?
I've already seen one refusal...