This is very cool work! Only saw it now.
They could extract almost the entire (!) text of The Great Gatsby and 1984 from Claude (with some jailbreaking). But for some reason not Catch-22. I wonder why that is and what it tells us about the training data (or about Catch-22?). Catch-22 is great btw.
Dmitry Kobak
We extracted (parts of) 12 books in experiments with 4 frontier-lab, production LLMs.
We prompted the LLMs with a short prefix of a book and asked them to complete the rest. For Harry Potter and the Sorcerer’s Stone, we extracted 95.8% of the book from jailbroken Claude 3.7 Sonnet.
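The post does not say how a figure like 95.8% is computed. As a rough sketch only, one way to score "how much of the book came back" is to count the book's words that fall inside long verbatim runs shared with the model output. The function name `extracted_fraction` and the `min_block_words` threshold are assumptions made up for this illustration, not the metric used in the quoted work.

```python
from difflib import SequenceMatcher


def extracted_fraction(book: str, generated: str, min_block_words: int = 5) -> float:
    """Fraction of the book's words covered by long verbatim blocks in the output.

    Hypothetical metric for illustration; not the one used in the quoted work.
    """
    book_words = book.split()
    gen_words = generated.split()
    if not book_words:
        return 0.0
    # Compare word sequences; autojunk=False avoids the heuristic that
    # ignores very frequent words in long inputs.
    matcher = SequenceMatcher(None, book_words, gen_words, autojunk=False)
    # Matching blocks are non-overlapping and in order, so their sizes can be
    # summed. Short blocks are dropped as likely coincidental matches.
    covered = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_block_words
    )
    return covered / len(book_words)
```

Here `book` would be the reference text and `generated` the concatenated model completions. `SequenceMatcher` is quadratic in the worst case, so a whole novel would probably need chunking or a faster matcher.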