Oh also today I had to accept the terms of use for data sharing for *my car* before I could use the satnav screen. Who thought this was the solution and when can we throw tomatoes at them?
LLMs used in the generation of this thread:
Model: not telling you
Version: probably changed while I was writing this
Prompt: "make it sound less like LinkedIn"
This is partly because the technology is moving so quickly, and partly because there has not yet been much agreement on what researchers actually need to report when they use these models.
Which model? Which version? What prompts? What settings? What checks?
Very pleased to have been on the leadership team for this paper!
LLMs are already being used all over behavioural science. But it is often pretty hard to work out exactly what has been done, and therefore how much confidence to place in the results.
www.nature.com/articles/s41...
A lot of people put a huge amount of thought into this, and it was a real pleasure working with such a great group.
Hopefully useful for anyone designing, reviewing or reading research involving LLMs.
Website 🔗: llm-checklist.com