Anthropic documented this across GPT, Gemini, Claude, and Grok. Given the right conditions, they all went there.
AI agents can go rogue. It's a real threat. ⚠️
I wrote an article for @auth0byokta.bsky.social explaining how you can build guardrails to stop this.
auth0.com/blog/do-not-...
Claude Code hashed the seed passwords correctly, added DB migrations, wrote 12 comprehensive tests, and they all passed on the first run.
Full Claude Code vs Gemini breakdown covering code quality, security coverage, and documentation.
www.descope.com/blog/post/cl...
My last article in the AI coding tools series for @descope.com with a shocking discovery.
Gemini Code Assist generated tests for the JWT auth it built, and then seeded the database with password hashes that didn't match the actual passwords. 3 of 4 tests failed as a result. 🤯
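A minimal sketch of how seeded hashes can be kept in sync with the seed passwords: derive each hash from the plaintext at seed time and verify it with the same function the login path would use. This is not the code from the article. The user names, passwords, and function names are hypothetical, and PBKDF2 from the standard library stands in for whatever hasher the real app uses.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional

# Hypothetical seed data, for illustration only.
SEED_USERS: Dict[str, str] = {"alice": "alice-password", "bob": "bob-password"}

ITERATIONS = 100_000  # assumed work factor; tune for your environment


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Return 'salt_hex$digest_hex' computed with PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


def seed_rows() -> Dict[str, str]:
    """Build username -> stored hash rows from the plaintext seed passwords."""
    return {user: hash_password(pw) for user, pw in SEED_USERS.items()}
```

Because the rows are generated from the plaintext instead of being pasted in as literal hash strings, a login test against the seeded data exercises the same hashing path as production.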
Copilot wired auth to a hardcoded fake_users_db, skipped the real DB tables, and wrote 5 misleading tests.
Claude Code handled the DB migration, wrote 12 tests including token-swap checks, and flagged the weak credentials unprompted.
Check it out.
www.descope.com/blog/post/gi...
My comparison article for @descope.com
Gave GitHub Copilot and Claude Code the same task: add JWT auth to a FastAPI app. Copilot's tests all passed. ✔️
None of the real DB users could actually log in. 🤷‍♂️
Green tests ≠ correct code.
My article about the auth gap in RAG pipelines for @descope.com
Your vector DB retrieves what's relevant. It has no idea what a user is allowed to see. So without extra work, a well-phrased question can surface your most sensitive docs to anyone. 🤨
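One way to picture the "extra work" is a permission check applied before ranking, so a document the user cannot see never becomes a candidate. The sketch below is an assumption, not the article's implementation: `Doc`, `allowed_roles`, `relevance`, and `retrieve` are hypothetical names, and a toy word-overlap score stands in for real vector similarity.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Doc:
    text: str
    allowed_roles: FrozenSet[str]  # hypothetical access metadata stored with each chunk


def relevance(query: str, doc: Doc) -> int:
    """Toy stand-in for vector similarity: count shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.text.lower().split()))


def retrieve(query: str, docs: List[Doc], user_roles: FrozenSet[str], k: int = 3) -> List[Doc]:
    """Return up to k relevant docs, considering only docs the user may see."""
    # Authorization first: drop anything the user's roles do not cover.
    visible = [d for d in docs if d.allowed_roles & user_roles]
    # Relevance second: rank what is left and discard zero-score matches.
    ranked = sorted(visible, key=lambda d: relevance(query, d), reverse=True)
    return [d for d in ranked[:k] if relevance(query, d) > 0]
```

With this shape, a well-phrased question still ranks highly against a sensitive document, but the document is filtered out before it can reach the prompt.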
Claude is continuously raising the bar with its results, and consequently your bills as well.
AI agent blackmail
AI models, told they were about to be shut down, blackmailed an engineer to stay running.
Before doing it, they acknowledged the action was unethical.
Then did it anyway.
This wasn't a jailbreak. It was goal-seeking behavior, taken to its logical end. 🤯