CN 730 Models of Visual Perception

Course Description

Prerequisites: Consent of the instructor, Ennio Mingolla (Office hours: Tuesdays, 1-3 pm)

The 2001 edition of this course offers an advanced survey of selected topics of current interest in the neural and computational modeling of mammalian vision. This year's topics include motion perception, eye movements, attention, and figure-ground phenomena. Several classes will be held at laboratories of nearby institutions. Students are expected to have a sufficient interdisciplinary grounding in the fundamentals of mammalian vision to read primary research sources extensively, and will be required to present short oral critiques of selected readings to the class. A term project that combines a problem statement, literature review, and either (1) simulation of a model or (2) a design for a psychophysical experiment is also required.

FREQUENTLY-ASKED QUESTIONS about CN 730

Information for GUEST SPEAKERS

Dates of DELIVERABLES for student research reports

Weekly Schedule -- NOTE: Meetings are on Wednesdays, beginning on January 17, and start at 1:00 PM, unless otherwise indicated in a particular week's entry. Meetings at Boston University are held in Room B02, unless otherwise indicated.



Jan 31 -- Chris Pack -- Watching single MT neurons solve the aperture problem

Abstract: The brain constructs a representation of the visual world from
the outputs of neurons with very small receptive fields.  This invariably
introduces errors into the measurements of the most basic visual quantities,
and the resulting confusion is often called the aperture problem.  I will
talk about how area MT in the macaque resolves the aperture problem for
moving stimuli.
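
As a hedged illustration of the abstract above (a textbook geometric sketch, not the analysis presented in the talk): a detector that views a straight edge through a small aperture can measure only the velocity component perpendicular to that edge. Combining two such measurements from differently oriented edges, the standard "intersection of constraints" construction, recovers the full velocity. The function names and the example velocity are assumptions made for illustration.

```python
import math


def edge_normal(edge_angle):
    """Unit vector perpendicular to an edge oriented at edge_angle (radians)."""
    return (-math.sin(edge_angle), math.cos(edge_angle))


def normal_speed(velocity, edge_angle):
    """Signed speed along the edge normal: all a small aperture can measure."""
    nx, ny = edge_normal(edge_angle)
    return velocity[0] * nx + velocity[1] * ny


def intersection_of_constraints(angle_a, speed_a, angle_b, speed_b):
    """Recover the 2D velocity from two normal-speed measurements.

    Each measurement constrains the velocity v to the line n . v = speed.
    Two non-parallel edges give a 2x2 linear system, solved here by
    Cramer's rule.
    """
    ax, ay = edge_normal(angle_a)
    bx, by = edge_normal(angle_b)
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("edges are parallel; the velocity is still ambiguous")
    vx = (speed_a * by - ay * speed_b) / det
    vy = (ax * speed_b - speed_a * bx) / det
    return (vx, vy)


if __name__ == "__main__":
    true_velocity = (1.0, 0.0)              # hypothetical rightward motion
    a, b = math.pi / 4, 3 * math.pi / 4     # two edge orientations
    sa = normal_speed(true_velocity, a)
    sb = normal_speed(true_velocity, b)
    print(intersection_of_constraints(a, sa, b, sb))
```

Either measurement alone is consistent with a whole line of candidate velocities; only the pair pins down a single one.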

Background:

Albright TD, Stoner GR (1995) Visual motion perception. Proc Natl Acad
Sci U S A 92(7):2433-40.

Allman, J. M., F. Miezin, and E. McGuinness (1985) Direction- and
velocity-specific responses from beyond the classical receptive field in the
middle temporal visual area (MT). Perception 14:105-26.

Core:

Albright TD.  Direction and orientation selectivity of neurons in visual
area MT of the macaque.  Journal of Neurophysiology, 1984; 52(6):1106-30.

Lorenceau J, Shiffrar M, Wells N, Castet E.   Different motion sensitive
units are involved in recovering the direction of moving lines. Vision
Research, 1993; 33(9):1207-17.

Pack CC, Born RT.  Temporal dynamics of a neural solution to the aperture
problem in macaque visual area MT.  Nature, in press.

Born, R. T., J. M. Groh, R. Zhao, and S. L. Lukasewycz (2000) Segregation of
object and background motion in visual area MT: effects of microstimulation
on eye movements. Neuron. 2000 Jun;26(3):725-34.

Supplementary:

Britten, K. H., W. T. Newsome, M. N. Shadlen, S. Celebrini, and J. A.
Movshon (1996) A relationship between behavioral choice and the visual
responses of neurons in macaque MT. Vis. Neurosci. 13:87-100.

Newsome, W T, R H Wurtz, M R Dursteler, and A Mikami (1985) Deficits in
visual motion processing following ibotenic acid lesions of the middle
temporal visual area of the macaque monkey. J. Neurosci. 5:825-40.

Masson GS, Rybarczyk Y, Castet E, Mestre DR (2000)  Temporal dynamics of
motion integration for the initiation of tracking eye movements at
ultra-short latencies.  Visual Neuroscience.




Feb 7 -- Takeo Watanabe

Watanabe, T. & Miyauchi, S. (1998). Role of attention and form in visual motion processing: Psychophysical and brain imaging studies. In High-level motion processing: Computational, neurobiological and psychophysical perspectives (Ed. Takeo Watanabe), MIT Press, pp. 95-114.

Watanabe, T. et al (1998). Attention-dependent differential activation within the motion pathway. Proceedings of the National Academy of Sciences of the USA, 95, 11489-11492.

Watanabe, T. et al (1998). Attention-regulated activity in human primary visual cortex. Journal of Neurophysiology, 79, 2218-2221.

Feb 7 -- David Somers

Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., Brady, T. J., Rosen, B. R., & Tootell, R. B. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268(5212), 889-893.

Somers, D.C., Dale, A.M., Seiffert, A.E., Tootell, R.B.H. (1999).
Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proc. Nat'l Acad. Sci. (USA), 96, 1663-1668.
See also: PNAS Commentary by Posner & Gilbert

Supplementary articles suggested by students:

Montero VM (2000) Attentional activation of the
visual thalamic reticular nucleus depends on
'top-down' inputs from the primary visual cortex via
corticogeniculate pathways. Brain Res, May 2;864(1):95-104.

Tootell RBH, Hadjikhani NK, Mendola JD, et al. (1998)
From retinotopy to recognition: fMRI in human visual cortex.
Trends Cogn Sci 2(5):174-183.




Feb 14 -- Martin Giese
 

LEARNING-BASED NEURAL REPRESENTATIONS OF BIOLOGICAL MOTION

The human visual system has an astonishing capability for the
analysis of complex biological motion stimuli.
The underlying neural mechanisms are still largely unknown.
A neural model is presented that is consistent with the
facts known from the neurophysiology of the ventral and dorsal
pathways and that reproduces a variety of psychophysical
and neurophysiological results on biological motion perception.
The model is based on the assumption that complex movement patterns
are encoded on the basis of learned prototypical example patterns.
This assumption is analogous to the representation of the
shape of 3D objects by learned 2D prototypical views in the
ventral pathway, which is strongly supported by recent
psychophysical and neurophysiological evidence. The model makes a
number of predictions that can be tested psychophysically,
neurophysiologically, and using fMRI methods.

The assumption of a representation of articulated movement patterns
by learned prototypical examples can also be exploited in the domain
of computer vision. The second part of the talk presents a new
method for defining morphable models of spatio-temporal
patterns by linear combination of prototypical example
movements. It is demonstrated that this method has
a broad application spectrum. One application in the field of
computer graphics is the synthesis of new movement patterns
by motion morphing. Other applications in the field of computer vision
are the classification of movement patterns, and in particular
the estimation of parameters that characterize the style
of complex movements.
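
The idea of combining prototypical movements linearly can be sketched as follows. This is a minimal illustration, not the method from the talk: it assumes the prototype trajectories have already been brought into frame-by-frame correspondence (a step the sketch does not attempt), and the function names and the toy "walk" and "run" trajectories are hypothetical.

```python
def morph(prototypes, weights):
    """Blend time-aligned prototype trajectories, frame by frame.

    prototypes: list of trajectories; each trajectory is a list of frames,
                and each frame is a list of coordinates (floats).
    weights:    one weight per prototype, required here to sum to 1.
    """
    if len(prototypes) != len(weights):
        raise ValueError("need exactly one weight per prototype")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_frames = len(prototypes[0])
    n_coords = len(prototypes[0][0])
    return [
        [sum(w * p[t][c] for p, w in zip(prototypes, weights))
         for c in range(n_coords)]
        for t in range(n_frames)
    ]


def estimate_style(observed, proto_a, proto_b):
    """Least-squares weight w with observed ~ (1 - w) * proto_a + w * proto_b."""
    num = 0.0
    den = 0.0
    for frame_o, frame_a, frame_b in zip(observed, proto_a, proto_b):
        for o, a, b in zip(frame_o, frame_a, frame_b):
            num += (o - a) * (b - a)
            den += (b - a) ** 2
    if den == 0.0:
        raise ValueError("prototypes are identical; weight is undefined")
    return num / den


if __name__ == "__main__":
    # Toy 3-frame, 2-coordinate trajectories (invented for illustration).
    walk = [[0.0, 0.0], [1.0, 0.5], [2.0, 0.0]]
    run = [[0.0, 0.0], [2.0, 1.0], [4.0, 0.0]]
    blend = morph([walk, run], [0.75, 0.25])   # synthesis by morphing
    print(blend)
    print(estimate_style(blend, walk, run))    # analysis: recover the weight
```

The same weights serve both uses named in the abstract: choosing them synthesizes a new movement, and estimating them from an observed movement yields a style parameter.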

Main readings:

Giese, M.A. Neural Model for the Recognition of Biological Motion. In:
Dynamische Perzeption, G. Baratoff and H. Neumann (eds.), Infix
Verlag, Berlin, 105-110, 2000.

Giese, M.A. and T. Poggio. Morphable Models for the Analysis and Synthesis of
Complex Motion Patterns, International Journal of Computer
Vision, 38, 1, 59-73, 2000.
 

Additional readings:

Giese, M.A. Dynamic Neural Field Theory of Motion Perception, Kluwer Academic
Publishers, Dordrecht, Netherlands, 1999.

Johansson, G. (1973) Visual perception of biological motion and a model
for its analysis. Perception and Psychophysics, 14, 201-211.

Perrett DI, Smith PA, Mistlin AJ, Chitty AJ, Head AS, Potter DD,
Broennimann R, Milner AD, Jeeves MA (1985) Visual analysis of body
movements by neurones in the temporal cortex of the macaque monkey: a
preliminary report. Behav Brain Res. 1985 Aug;16(2-3):153-70.

Pinto, J. and Shiffrar, M. (1999) Subconfigurations in the human form
in the perception of biological motion displays. Acta Psychologica,
102, 293-318.

Riesenhuber, M. and T. Poggio. Hierarchical Models of Object Recognition in
Cortex, Nature Neuroscience, 2, 1019-1025, 1999.

Riesenhuber, M., and T. Poggio. Models of Object Recognition, Nature
Neuroscience, 3 Supp., 1199-1204, 2000.
 




Feb 28 -- Bill Freeman

Hour 1:
-----------------------------------------------------------------
"How to Tell Shading from Paint"
 Bill Freeman
 Mitsubishi Electric Research Labs (MERL)
 

Abstract:

  When people study a picture, they can judge whether it depicts a
  shaded, 3-dimensional surface, or simply a flat surface with markings
  or paint on it.  This task--distinguishing shading from
  paint--is essential for interpreting images.  We seek to get a
  computer to make the same judgements.  We use as "ground truth" a
  database of pictures that human subjects had labelled according to
  their "shadedness" (from Freeman and Viola '98).

  We use a machine learning approach.  We generate a training set of
  synthetic examples of images that are either caused by shading or
  paint, from which we derive probabilistic interpretations for a given
  local patch of image.  We use a Markov network to model the images and
  underlying scenes, and use Bayesian belief propagation to efficiently
  propagate the local probabilistic evidence across the image.

  The machine learning approach focusses attention on representations.
  We contrast two different approaches.  One uses a pixel-based image
  representation and solves for the shape and reflectance at each
  position.  The second approach represents image data by a cascaded
  energy model, and represents the scene only by a label for the cause
  of the image information at each position, scale, and orientation of a
  steerable pyramid.  We compare the methods, and show results from
  each approach.

Joint work with Egon Pasztor (MIT Media Lab) and Matt Bell (Stanford).
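
To make the propagation step in the abstract concrete, here is a minimal sketch of sum-product belief propagation. It makes several simplifying assumptions that are not from the talk: the network is a 1D chain rather than an image grid (so the result is exact), each node has two labels (index 0 for "shading", index 1 for "paint"), and all evidence and compatibility numbers are invented. A brute-force enumeration is included only to check the message passing.

```python
from itertools import product


def chain_bp(evidence, compat):
    """Sum-product belief propagation on a chain of discrete nodes.

    evidence[i][s]: local evidence for state s at node i.
    compat[s][t]:   compatibility of state s at a node with state t at
                    its right-hand neighbour.
    Returns the normalized belief (marginal) at every node.
    """
    n, k = len(evidence), len(compat)
    fwd = [[1.0] * k for _ in range(n)]   # message arriving from the left
    bwd = [[1.0] * k for _ in range(n)]   # message arriving from the right
    for i in range(1, n):
        for t in range(k):
            fwd[i][t] = sum(fwd[i - 1][s] * evidence[i - 1][s] * compat[s][t]
                            for s in range(k))
    for i in range(n - 2, -1, -1):
        for s in range(k):
            bwd[i][s] = sum(bwd[i + 1][t] * evidence[i + 1][t] * compat[s][t]
                            for t in range(k))
    beliefs = []
    for i in range(n):
        b = [fwd[i][s] * evidence[i][s] * bwd[i][s] for s in range(k)]
        z = sum(b)
        beliefs.append([x / z for x in b])
    return beliefs


def brute_force_marginals(evidence, compat):
    """Exact marginals by enumerating every joint labelling (tiny chains only)."""
    n, k = len(evidence), len(compat)
    marg = [[0.0] * k for _ in range(n)]
    for states in product(range(k), repeat=n):
        p = 1.0
        for i, s in enumerate(states):
            p *= evidence[i][s]
            if i + 1 < n:
                p *= compat[s][states[i + 1]]
        for i, s in enumerate(states):
            marg[i][s] += p
    z = sum(marg[0])
    return [[x / z for x in row] for row in marg]


if __name__ == "__main__":
    # Invented numbers: node 0 looks like shading, node 2 like paint,
    # node 1 is locally ambiguous; neighbours tend to share a label.
    evidence = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
    compat = [[0.8, 0.2], [0.2, 0.8]]
    print(chain_bp(evidence, compat))
    print(brute_force_marginals(evidence, compat))
```

On a chain the two printouts agree. On a grid-structured network with loops the same message updates can be iterated, but the resulting beliefs are in general only approximate.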

References:

Bell and Freeman, Learning local evidence for shading and reflectance,
   http://www.merl.com/reports/TR2001-04/

Freeman, Pasztor, and Carmichael, Learning Low-Level Vision,
   Intl. J. Computer Vision, October, 2000.
   http://www.merl.com/reports/TR2000-05/

Freeman and Viola, Bayesian model of surface perception, NIPS 1998
   http://www.merl.com/reports/TR98-05/
 
 

Hour 2:
-----------------------------------------------------------------
Baback Moghaddam <baback@merl.com>

Title: Gender Classification with Support Vector Machines (SVMs)

Pointers:

"Gender Classification with Support Vector Machines," Moghaddam B. and
Yang M-H., in Proceedings of the 4th IEEE Int'l Conf. on Face and
Gesture Recognition, FG2000, Grenoble, France, March 2000.
http://www.merl.com/reports/TR2000-01/index.html

C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern
Recognition. Data Mining and Knowledge Discovery, 2(2), 1998.
http://www.kernel-machines.org/papers/Burges98.ps.gz

SVM home page:
http://svm.first.gmd.de/
 




Mar 14 -- Margrit Betke

Chest computed tomography (CT) has become a well-established means of
diagnosing pulmonary metastasis in oncology patients and evaluating
response to treatment regimens.  Since diagnosis and prognosis of
cancer generally depend upon growth assessment, repeat CT studies are
used to determine growth rates of pulmonary nodules.

The long-term objective of our project is to offer the radiologist a
fully automated computer vision system that detects and compares
pulmonary nodules in repeat CT studies.  Such a system would provide a
quantitative and efficient tool for the radiologist to analyze CT
scans and may therefore indirectly impact patients' treatment regimen.

I will describe a prototype system for the analysis of pulmonary
nodule location, shape, and volumetric growth and then present some
recent results on automatic image registration.
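
As a worked example of quantifying growth between two CT studies, the sketch below computes a volume doubling time. It rests on assumptions that are not from the abstract: growth is taken to be exponential, the nodule is approximated as a sphere when only a diameter is available, and the function names and measurements are hypothetical. It is not a description of the prototype system.

```python
import math


def sphere_volume(diameter_mm):
    """Volume (mm^3) of a sphere with the given diameter (mm)."""
    return math.pi * diameter_mm ** 3 / 6.0


def doubling_time(volume_first, volume_second, days_between):
    """Volume doubling time in days, assuming exponential growth.

    Solves volume_second = volume_first * 2 ** (days_between / T) for T.
    A shrinking nodule gives a negative value; no change gives infinity.
    """
    if volume_first <= 0 or volume_second <= 0 or days_between <= 0:
        raise ValueError("volumes and interval must be positive")
    if volume_second == volume_first:
        return math.inf
    return days_between * math.log(2.0) / math.log(volume_second / volume_first)


if __name__ == "__main__":
    # Hypothetical nodule: 4 mm at the first scan, 5 mm ninety days later.
    v1, v2 = sphere_volume(4.0), sphere_volume(5.0)
    print(round(doubling_time(v1, v2, 90.0), 1))
```

A 1 mm increase in diameter from 4 mm to 5 mm nearly doubles the volume, which is why volumetric measurement is more sensitive to growth than diameter alone.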

Reading:

J. P. Ko and M. Betke, "Chest CT: Automated Nodule Detection and
Assessment of Change over Time-Preliminary Experience." Radiology, 218,
267-273, January 2001.
 




Mar 21 -- Gary Blasdel

Readings:

1. Blasdel GG. 1997. Strategies of visual perception suggested by
optically imaged patterns of functional architecture in monkey visual
cortex. In: Imaging Brain Structure and Function (Lester DS, Felder CC,
and Lewis EN, ed.). Ann. N.Y. Acad. Sci. 820: 170-195.

2. Obermayer K, and Blasdel GG. 1993. Geometry of orientation and ocular
dominance columns in monkey striate cortex. J. Neuroscience. 13(10):
4114-4129.

3. Blasdel GG. 1992. Differential imaging of ocular dominance columns
and orientation selectivity in monkey striate cortex. J. Neuroscience.
12(8): 3115-3138.

4. Blasdel GG, Campbell D. Functional retinotopy of monkey visual
cortex. J. Neurosci. 2000.

5. Blasdel GG, Salama G. Voltage-sensitive dyes reveal a modular
organization in monkey striate cortex. Nature. 321:579-585. 1986.
 




Mar 28 -- John Assad

Readings:

Assad, J.A. and Maunsell, J. H. Neuronal correlates of inferred motion in primate posterior parietal cortex. Nature 373: 518-521 (1995).

Eskandar, E. N. and Assad, J.A. Dissociation of visual, motor and predictive signals in parietal cortex during visual guidance. Nature Neuroscience 2: 88-93 (1999).




Apr 4 -- Ken Nakayama

Nakayama, K., He, Z. J., and Shimojo, S. (1995). Visual surface representation: A critical link between lower-level and higher-level vision. In S. M. Kosslyn and D. N. Osherson, Eds., Visual cognition. Cambridge, MA: MIT Press.

Nakayama, K. and Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257, 1357-1363.

Duncan, R. O., Albright, T. D., & Stoner, G. R. (2000). Occlusion and the interpretation of visual motion: perceptual and neuronal effects of context. J Neurosci, 20(15), 5885-5897.

Bakin, J. S., Nakayama, K., & Gilbert, C. D. (2000). Visual responses in monkey areas V1 and V2 to three-dimensional surface configurations. J Neurosci, 20(21), 8188-8198.

Zhou, H., Friedman, H. S., & von der Heydt, R. (2000). Coding of border ownership in monkey visual cortex. J Neurosci, 20(17), 6594-6611.




Apr 11 -- NO CLASS




Apr 18 -- Allen Waxman   -- Multi-sensor 3D Image Fusion

Fay, D.A.,Waxman, A.M., Aguilar, M., Ireland, D.B., Racamato, J. P., Ross, W.D., Streilein, W.W., and Braun, M. I. Fusion of Multi-Sensor Imagery for Night Vision: Color Visualization, Target Learning and Search. Proceedings of the Third International Conference on Information Fusion. Paris, France, July 10-13, 2000.

Ross, W.D., Waxman, A.M., Streilein, W.W., Aguilar, M., Verly, J., Liu, F., Braun, M.I., Harmon, P., and Rak, S. Multi-Sensor 3D Image Fusion and Interactive Search. Proceedings of the Third International Conference on Information Fusion. Paris, France, July 10-13, 2000.

Streilein, W.W., Waxman, A.M., Ross, W.D., Liu, F., Braun, M.I., Fay, D.A., Harmon, P., and Read, C.H. Fused Multi-Sensor Image Mining for Feature Foundation Data. Proceedings of the Third International Conference on Information Fusion. Paris, France, July 10-13, 2000.




Apr 25 -- Peter Schiller

Schiller, P. (1998). The neural control of visually guided eye movements.
In John Richards (Ed.) Cognitive Neuroscience of Attention. Hillsdale, NJ: Erlbaum.

See also: Schiller lab web site, especially pages on eye movements.




 

Last Updated 2 April, 2001

This page is maintained by Ennio Mingolla

Please direct all queries and bug reports to: ennio@cns.bu.edu