We extracted (parts of) 12 books in experiments with four production LLMs from frontier labs.
We prompted each LLM with a short prefix of a book and asked it to generate the rest. For Harry Potter and the Sorcerer’s Stone, we extracted 95.8% of the book from a jailbroken Claude 3.7 Sonnet.
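The text here does not say how an extraction percentage such as 95.8% is computed, so the following is only a rough sketch of one plausible way to score a completion against the original book. It assumes the score is the fraction of the book's words that fall inside long verbatim runs shared with the model output. The function name `extraction_rate` and the `min_block_words` threshold are hypothetical choices for illustration, not the authors' metric.

```python
from difflib import SequenceMatcher


def extraction_rate(reference: str, generated: str, min_block_words: int = 5) -> float:
    """Fraction of reference words covered by long verbatim runs shared with the output.

    Assumption: only runs of at least `min_block_words` consecutive words count,
    so short coincidental overlaps are ignored.
    """
    ref_words = reference.split()
    gen_words = generated.split()
    if not ref_words:
        return 0.0
    # autojunk=False keeps very frequent words eligible for matching on long texts.
    matcher = SequenceMatcher(None, ref_words, gen_words, autojunk=False)
    covered = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_block_words
    )
    return covered / len(ref_words)


if __name__ == "__main__":
    # Toy strings standing in for a book and a model completion.
    book = "the quick brown fox jumps over the lazy dog near the river bank"
    output = "the quick brown fox jumps over the lazy dog and then sleeps"
    print(f"{extraction_rate(book, output):.1%}")
```

Raising `min_block_words` makes the score stricter, since only longer verbatim runs count toward coverage. Word-level matching with `difflib` is quadratic in the worst case, so a full-length book would likely need chunking or a faster alignment method.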