Modelling Intonation in Speech: Communication as Efficient Embodied Dynamics

Description

The hallmark of speech communication is the ubiquitous variation of its prosodic characteristics, such as speaking rate, stress patterns and intonation. The underlying durational, articulatory and tonal characteristics used to signal prominence and to disambiguate syntactic structure are deployed and coordinated in highly language-specific ways.
The project aims to develop an embodied dynamical modelling platform capable of accounting for the intonational aspects of prosodically rich speech and for the links between articulation and intonation. The proposed programme is based on an existing optimisation model of articulatory sequencing patterns. Coordination arises as an optimal solution that simultaneously satisfies the complementary requirements of efficient and effective communication. Intonation is incorporated through a set of well-motivated correlates of articulatory effort and perceptually relevant characteristics of pitch modulation. Intentional local adjustments of the demands on perceptual clarity elicit variations corresponding to prominence and domain-boundary phenomena: the greater the premium on perceptual constraints, the longer and more precise the articulatory movements, and the greater the accompanying pitch excursions. Language-specificity is in turn achieved by customising the premiums placed on the perceptual relevance of the various aspects of speech articulation and intonation.
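The trade-off described above can be illustrated with a deliberately simplified sketch. It is not the project's model: the cost terms, the function names `total_cost` and `optimise`, and the single `premium` weight are all assumptions chosen only to show how raising the premium on perceptual clarity can lengthen a gesture and enlarge its pitch excursion at the optimum.

```python
# Toy illustration only. The cost terms are assumptions made for this
# sketch and are not taken from the project's optimisation model.

def total_cost(duration, excursion, premium):
    """Cost of one gesture with a given duration and pitch excursion."""
    effort = excursion ** 2 / duration              # large, fast movements cost more
    perceptual = 1.0 / excursion + 1.0 / duration   # small or short gestures are less clear
    return effort + premium * perceptual + duration  # final term penalises slow speech


def optimise(premium, iterations=200):
    """Minimise total_cost by exact coordinate descent.

    Setting each partial derivative to zero gives closed-form updates:
      d(cost)/d(duration)  = 0  ->  duration**2  = excursion**2 + premium
      d(cost)/d(excursion) = 0  ->  excursion**3 = premium * duration / 2
    """
    duration, excursion = 1.0, 1.0
    for _ in range(iterations):
        duration = (excursion ** 2 + premium) ** 0.5
        excursion = (premium * duration / 2.0) ** (1.0 / 3.0)
    return duration, excursion


if __name__ == "__main__":
    for premium in (1.0, 2.0, 4.0):
        d, a = optimise(premium)
        print(f"premium={premium:.1f}  duration={d:.3f}  excursion={a:.3f}")
```

In this toy setting, both the optimal duration and the optimal excursion grow as `premium` increases, which mirrors the qualitative behaviour described for prominence. A fuller model would replace the hypothetical cost terms with empirically motivated correlates of effort and perception.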
Combining intonational and articulatory aspects in a single model will provide a much-needed platform for accounting for the known interactions between various aspects of prosody. Discrete phonological patterns emerge within the continuous dynamical platform in the form of attractors in the optimality landscape. In addition, the results of the optimisation procedure can be interpreted in stochastic terms, providing fine-grained predictions of the model.
The development of the model entails a broad multi-disciplinary investigation of articulatory and auditory apparatuses and their quantifiable characteristics. To test the predictions and fine-tune the model's parameters, this work must inevitably include a series of targeted experiments. The most important experimental and evaluation platform will be developed in parallel with the theoretical model in the form of a speech synthesis system. This novel synthesis framework, grounded in functional characteristics of embodied speech interaction, is one of the deliverables of the project.
Status: Finished
Effective start/end date: 01/09/2013 - 31/08/2016

Fields of Science

  • 6161 Phonetics
  • 6162 Cognitive science