Analysis of phonation onsets in vowel production, using information from glottal area and flow estimate

Tiina Murtola, Jarmo Malinen, Ahmed Geneid, Paavo Alku

Research output: Contribution to journal › Article › Scientific › peer-reviewed

Abstract

A multichannel dataset comprising high-speed videoendoscopy images, and electroglottography and free-field microphone signals, was used to investigate phonation onsets in vowel production. Use of the multichannel data enabled simultaneous analysis of the two main aspects of phonation: glottal area, extracted from the high-speed videoendoscopy images, and glottal flow, estimated from the microphone signal using glottal inverse filtering. Pulse-wise parameterization of the glottal area and glottal flow indicates that there is no single dominant way to initiate quasi-stable phonation. The trajectories of fundamental frequency and normalized amplitude quotient, extracted from glottal area and estimated flow, may differ markedly during onsets. The location and steepness of the amplitude envelopes of the two signals were observed to be closely related, and quantitative analysis supported the hypothesis that glottal area and flow do not carry essentially different amplitude information during vowel onsets. Linear models were used to predict the phonation onset times from the characteristics of the subsequent steady phonation. The phonation onset time of glottal area was found to have good predictability from a combination of the fundamental frequency and the normalized amplitude quotient of the glottal flow, as well as the gender of the speaker. For the phonation onset time of glottal flow, the best linear model was obtained using the fundamental frequency and the normalized amplitude quotient of the glottal flow as predictors.
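The abstract refers to pulse-wise parameterization with the normalized amplitude quotient (NAQ) but does not define it. As a minimal sketch, assuming the definition commonly used in the voice-analysis literature (peak-to-peak pulse amplitude divided by the product of the magnitude of the steepest negative slope and the fundamental period), one pulse could be parameterized as below. The function name `naq`, the sampling rate, and the synthetic triangular pulse are illustrative assumptions, not the authors' code or data.

```python
def naq(pulse, fs):
    """Normalized amplitude quotient of a single glottal pulse (sketch).

    pulse: samples covering exactly one fundamental period (assumption)
    fs:    sampling rate in Hz
    """
    if len(pulse) < 3:
        raise ValueError("need at least three samples")
    period = len(pulse) / fs                  # fundamental period T in seconds
    ac_amplitude = max(pulse) - min(pulse)    # peak-to-peak amplitude
    # First difference scaled by fs approximates the time derivative.
    derivative = [(b - a) * fs for a, b in zip(pulse, pulse[1:])]
    d_peak = -min(derivative)                 # magnitude of the steepest decline
    if d_peak <= 0:
        raise ValueError("pulse has no closing phase")
    return ac_amplitude / (d_peak * period)


# Synthetic triangular pulse (hypothetical, not from the study):
# 60 samples opening, 20 samples closing, 20 samples closed.
fs = 10_000
pulse = (
    [i / 60 for i in range(60)]
    + [1 - i / 20 for i in range(20)]
    + [0.0] * 20
)
print(round(naq(pulse, fs), 3))
```

For a triangular pulse this quotient reduces to the closing-phase duration divided by the period (here 20 of 100 samples), which is why NAQ is often read as a time-normalized measure of how abruptly the pulse closes.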
Original language: English
Journal: Speech Communication
ISSN: 0167-6393
DOI: 10.1016/j.specom.2019.03.007
Status: Published - 1 April 2019
MoE (OKM) publication type: A1 Original article in a scientific journal, peer-reviewed


    Cite this

    Murtola, Tiina; Malinen, Jarmo; Geneid, Ahmed; Alku, Paavo. Analysis of phonation onsets in vowel production, using information from glottal area and flow estimate. In: Speech Communication. 2019.
    @article{509d9f2d54724f639cf1487d715f5ce9,
    title = "Analysis of phonation onsets in vowel production, using information from glottal area and flow estimate",
    keywords = "Phonation onset, Vowel production, High-speed videoendoscopy, Glottal inverse filtering",
    author = "Tiina Murtola and Jarmo Malinen and Ahmed Geneid and Paavo Alku",
    year = "2019",
    month = "4",
    day = "1",
    doi = "10.1016/j.specom.2019.03.007",
    language = "English",
    journal = "Speech Communication",
    issn = "0167-6393",
    publisher = "Elsevier Scientific Publ. Co",

    }

