From image to text to speech: The effects of speech prosody on information sequencing in audio description

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Given the extensive body of research in audio description – the verbal-vocal description of visual or audiovisual content for visually impaired audiences – it is striking how little attention has been paid thus far to the spoken dimension of audio description and its paralinguistic, prosodic aspects. This article complements previous research into how audio description speech is received by partially sighted audiences by analyzing how it is performed vocally. We study the audio description of pictorial art and examine one aspect of prosody in detail: pitch, and the segmentation of information in relation to it. We analyze this relation in a corpus of audio-described pictorial art in Finnish by combining phonetic measurements of pitch with discourse analysis of the information segmentation. Previous studies have already shown that a sentence-initial high pitch acts as a discourse-structuring device in interpreting. Our study shows that the same applies to audio description. In addition, our study suggests that there is a relationship between the size of the rise in pitch and the size of the topical transition. That is, when the topical transition is clear, the rise in pitch level between the beginnings of two consecutive spoken sentences is large. Analogously, when the topical transition is small, the change in the sentence-initial pitch level is also rather small.
Original language: English
Journal: Text & Talk
Volume: 41
Issue number: 3
Pages (from-to): 309-334
Number of pages: 26
ISSN: 1860-7330
Publication status: Published - Feb 2021
MoE publication type: A1 Journal article-refereed

Fields of Science

  • 6121 Languages
  • intonation
  • prosody
  • audio description
  • information sequencing
  • pitch
  • art
  • paragraph intonation
  • speech paragraph
