Visual object recognition is an extremely difficult computational problem, yet the primate brain has developed mechanisms that perform robust recognition. The fact that primate recognition abilities far surpass those of the best artificial recognition systems makes it clear how little we know about the brain mechanisms that underlie recognition, and understanding these mechanisms is one of the fundamental goals of neuroscience. This knowledge is critical to a deep understanding of human visual perception and long-term memory, and is needed to meaningfully repair disruptions of these brain processes or to create prosthetics that can stand in for the disrupted processes.

Current evidence indicates that the ventral visual stream and its highest cortical stage, the inferotemporal cortex (IT), are critical for visual object recognition, and thus central to understanding the underlying neuronal mechanisms. In non-human primates, IT lesions and temporary disruptions of neuronal activity produce specific deficits in the recognition of complex objects. It is also well known that the responses of single IT neurons are highly sensitive to object shape, and can be highly selective for particular classes of objects such as faces, or for particular, well-learned objects within a class. Indeed, the shape selectivity of IT neuronal responses is often pointed to as the key neuronal property needed to support recognition behavior. However, although shape selectivity is required for recognition, it is not sufficient for robust, real-world recognition. This is because, in the real world, each object can produce an essentially infinite number of images on the retina. This results from, for example, variability in an object's position, size and pose, the illumination conditions, and the other objects present in the scene. Because of this variability, the key computational challenge of object recognition is not the creation of shape selectivity alone (computationally simple), but the creation of shape selectivity that tolerates changes in object position, size, pose, illumination, and scene clutter (computationally very difficult). The goal of this session is to review this key computational problem, outline mechanisms that might be used by the brain to solve it, review existing neurophysiological data that can be brought to bear, and outline some research aimed at tackling this key issue in neuroscience.
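
The gap between selectivity and tolerance described above can be illustrated with a toy sketch. It is not a model of IT neurons or of any mechanism proposed in this session. It assumes a hypothetical "unit" whose response is the pixel-wise dot product between a stored template and the input image. All names (make_image, template_response, tolerant_response) and the max-over-positions pooling are illustrative assumptions. The selective unit responds strongly to the object at its trained position and not at all when the same object is shifted. Pooling over positions restores the response, but only for position; size, pose, illumination, and clutter are left unaddressed.

```python
# Toy illustration only: all functions and the pooling scheme are assumptions,
# not taken from the source text.

def make_image(size, top, left):
    """Return a size x size binary image with a 3x3 'L' shape at (top, left)."""
    shape = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    img = [[0] * size for _ in range(size)]
    for dr, dc in shape:
        img[top + dr][left + dc] = 1
    return img


def template_response(template, image):
    """Pixel-wise dot product: large only where image and template overlap."""
    return sum(
        t * p
        for t_row, p_row in zip(template, image)
        for t, p in zip(t_row, p_row)
    )


def tolerant_response(size, image):
    """Maximum template response over every position of the 'L' shape
    (an assumed pooling scheme that buys tolerance to position only)."""
    return max(
        template_response(make_image(size, r, c), image)
        for r in range(size - 2)
        for c in range(size - 2)
    )


SIZE = 8
template = make_image(SIZE, 0, 0)   # the unit's stored template
same = make_image(SIZE, 0, 0)       # object at the trained position
shifted = make_image(SIZE, 4, 4)    # same object, different retinal position

print(template_response(template, same))     # 5: selective unit responds
print(template_response(template, shifted))  # 0: same object, no response
print(tolerant_response(SIZE, shifted))      # 5: position-pooled unit responds
```

Even in this toy case, tolerance to a single variable (position) requires one template per position. Covering size, pose, illumination, and clutter in the same brute-force way would multiply the number of templates, which hints at why the tolerant form of selectivity is the computationally hard part.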