But in most real-world searches, object appearance is uncertain and strongly context-dependent. The same object (an apple) evokes very different visual input in these two scenes, depending on its distance from the observer, the surrounding objects, and so on, and could therefore require very different templates.