CNS Speech Lab

Department of
Cognitive & Neural Systems

DIVA

DIVA (Directions Into Velocities of Articulators) is a neural network model of speech motor skill acquisition and speech production. In computer simulations, the model learns to control the movements of a computer-simulated vocal tract in order to produce sequences of phoneme strings. The model's neural mappings are tuned during a babbling phase in which auditory feedback from self-generated speech sounds is used to learn the relationship between motor actions and their acoustic consequences. After learning, the model can produce arbitrary combinations of phonemes, even in the presence of constraints on the articulators.
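
The idea of learning the relationship between motor actions and their acoustic consequences can be illustrated with a toy sketch. This is not the DIVA learning rule: the function toy_vocal_tract, its coefficients, and the finite-difference estimate in babble_jacobian are all assumptions made for illustration, standing in for the simulated vocal tract and the neural network learning used in the model.

```python
import random


def toy_vocal_tract(articulators):
    """Hypothetical stand-in for a vocal tract.

    Maps 3 articulator positions to 2 acoustic values. The linear form
    and the coefficients are invented for this sketch.
    """
    a, b, c = articulators
    return [1.0 * a + 0.5 * b - 0.2 * c,
            0.3 * a - 0.8 * b + 0.6 * c]


def babble_jacobian(position, step=1e-3):
    """Estimate d(acoustic)/d(articulator) from small exploratory movements.

    Each articulator is nudged in turn and the resulting change in the
    "heard" output is recorded (a finite-difference estimate).
    """
    baseline = toy_vocal_tract(position)
    n_out, n_in = len(baseline), len(position)
    jacobian = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        moved = list(position)
        moved[j] += step                      # small self-generated movement
        heard = toy_vocal_tract(moved)        # auditory feedback
        for i in range(n_out):
            jacobian[i][j] = (heard[i] - baseline[i]) / step
    return jacobian


random.seed(0)
start = [random.uniform(-1.0, 1.0) for _ in range(3)]
J_est = babble_jacobian(start)
```

Because the toy vocal tract is linear, the estimate in J_est recovers its coefficients up to rounding error; a real vocal tract is nonlinear, so such an estimate would only hold locally around the starting position.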

DIVA provides unified explanations for a number of long-studied speech production phenomena including motor equivalence, contextual variability, speaking rate effects, anticipatory coarticulation, and carryover coarticulation. The model is schematized in the figure below. Each block in the diagram corresponds to a hypothesized set of neurons in the human speech system.


Figure 1: Schematic of the DIVA Model

The first set of synaptic weights (the filled semi-circle between the speech sound map and the planning direction vector) encodes auditory and orosensory targets for phonemes learned during babbling. The learned speech sound targets take the form of multi-dimensional regions, rather than points, in auditory and orosensory spaces. The second set of weights (labeled directional mapping) transforms the desired movement direction in auditory space into movement directions in articulator space. The problem of mapping from auditory or orosensory space to articulator space is ill-posed, since many different articulator movements can produce the same sensory result; the model's mapping is related to the Moore-Penrose pseudoinverse of the Jacobian matrix relating the auditory, somatosensory, and articulatory spaces. The final mapping, labeled forward model, transforms orosensory feedback from the vocal tract and an efference copy of the motor commands into an auditory representation that can be used to control speech movements without relying on auditory feedback.
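
A minimal sketch of two ideas from the paragraph above, under stated assumptions: a target expressed as a region (so no movement is requested once the current auditory state is inside it), and a directional mapping built from the Moore-Penrose pseudoinverse of a Jacobian. The 2 x 3 Jacobian values, the region bounds, and all function names are hypothetical; the model's learned mapping is only described as being related to the pseudoinverse, not identical to this computation.

```python
def direction_to_region(current, lower, upper):
    """Desired change per auditory dimension toward a box-shaped target.

    Returns 0.0 for any dimension already inside [lower, upper].
    """
    direction = []
    for value, lo, hi in zip(current, lower, upper):
        if value < lo:
            direction.append(lo - value)
        elif value > hi:
            direction.append(hi - value)
        else:
            direction.append(0.0)
    return direction


def pseudoinverse_2xn(J):
    """Moore-Penrose pseudoinverse J^T (J J^T)^-1 of a full-row-rank 2 x n matrix."""
    n = len(J[0])
    a = sum(J[0][k] * J[0][k] for k in range(n))
    b = sum(J[0][k] * J[1][k] for k in range(n))
    d = sum(J[1][k] * J[1][k] for k in range(n))
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("Jacobian rows are (nearly) linearly dependent")
    inv = [[d / det, -b / det],
           [-b / det, a / det]]               # inverse of the 2 x 2 matrix J J^T
    return [[J[0][k] * inv[0][c] + J[1][k] * inv[1][c] for c in range(2)]
            for k in range(n)]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Hypothetical Jacobian: 2 auditory dimensions, 3 articulators.
J = [[1.0, 0.5, -0.2],
     [0.3, -0.8, 0.6]]
auditory_now = [0.2, 0.9]
target_lower = [0.5, 0.0]
target_upper = [0.7, 0.5]

auditory_dir = direction_to_region(auditory_now, target_lower, target_upper)
articulator_dir = matvec(pseudoinverse_2xn(J), auditory_dir)
achieved = matvec(J, articulator_dir)         # auditory change this movement produces
```

With more articulators than auditory dimensions, infinitely many articulator movements yield the same auditory change; the pseudoinverse selects the one with the smallest Euclidean norm, which is one way to resolve the ill-posed mapping.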

DIVA Papers

DIVA Demos
Hypothesized Neural Correlates of DIVA