Do the labels actually matter so much? I think we can all agree that being able to do a new thing in the moment, in the lesson when it has just been taught, is distinct from being able to exercise that same skill in a different context at a later date, and that it's the latter ability we should test.
Brett Handley
Soderstrom & Bjork made a sound distinction with problematic labels:
• Measures of learning during teaching are unreliable.
Call this "performance".
• Measures of learning post-teaching are more reliable.
Call this "learning".
This is a methodological definition.