Description
Talk: Consensus in an intuitive chunking task
Authors: Aleksandra Dobrego, Alena Konina, Svetlana Vetchinnikova, Nitin Williams, Anna Mauranen
Language arranges itself along a continuous line, either in time (speech) or in space (text). As human working memory is finite and presumably limited to approx. 4 units of information (Cowan 2001), it is by now largely uncontested that language processing must proceed in some kind of units or chunks (see Christiansen and Chater 2016). What these online chunks are is a more debatable issue.

We report on an experiment investigating the segmentation of language into large high-level units, or chunking. Participants (45 students of the University of Helsinki, 32 female, aged 20-39) with no previous linguistic training listened to short (approx. 30 seconds) audio extracts of authentic data while simultaneously marking what they intuitively felt were natural boundaries on broad orthographic transcripts (see Vetchinnikova et al. 2017). The perceived boundaries resulted in ‘chunks’. The experiment was carried out via a web-based application presented on a tablet. Participants marked boundaries by tapping an interactive tilde sign shown between all orthographic words in the transcript; they could remove a mark by tapping the symbol again.

To assess inter-participant agreement in chunking behaviour, we first used a conventional measure of inter-rater reliability: the proportion of all pairs of individual participants who agreed on the presence (1) or absence (0) of a boundary. We computed this proportion for every space between two orthographic words and treated the mean value across all spaces as the overall measure of agreement. Values close to 0 indicate weak agreement, while values close to 1 indicate strong agreement. Since this measure does not account for chance agreement, we also calculated Fleiss’ kappa, a measure of inter-rater reliability that corrects the proportion described above for chance agreement. The total number of possible boundaries in our dataset is 4692.
On average, a participant marked 455 boundaries (SD = 211.2). Agreement according to the first measure was 0.9, indicating strong agreement in chunking behaviour across participants. The value dropped to 0.45 for Fleiss’ kappa, indicating moderate agreement. We hypothesised that this reduction is due to the highly unequal distribution of 0s and 1s in the dataset. Given that the difference between the proportion measure and Fleiss’ kappa can be explained by the unequal distribution of boundaries and non-boundaries in the data (agreement on a non-boundary occurs much more often than agreement on a boundary), we claim that the online chunking task used in the study allows us to capture consensus in segmentation strategies.
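
The two agreement measures described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names (`pairwise_agreement`, `fleiss_kappa`, `kappa_from_summary`) and the toy data are assumptions made for this example, and the sketch assumes that every participant rated every slot between words.

```python
from math import comb

def pairwise_agreement(marks):
    """Mean proportion of agreeing participant pairs per slot.

    marks: one list of 0/1 values per participant (hypothetical layout),
    with one entry per space between two orthographic words.
    """
    n = len(marks)
    n_slots = len(marks[0])
    total_pairs = comb(n, 2)
    per_slot = []
    for j in range(n_slots):
        k = sum(m[j] for m in marks)  # participants who marked a boundary
        # Pairs agreeing on "boundary" plus pairs agreeing on "no boundary".
        per_slot.append((comb(k, 2) + comb(n - k, 2)) / total_pairs)
    return sum(per_slot) / n_slots

def kappa_from_summary(p_bar, p1):
    """Chance-corrected agreement from the observed agreement p_bar and
    the overall proportion p1 of 'boundary' marks (two categories)."""
    p_e = p1 ** 2 + (1 - p1) ** 2  # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

def fleiss_kappa(marks):
    """Fleiss' kappa for binary marks with the same raters on every slot."""
    n = len(marks)
    n_slots = len(marks[0])
    p1 = sum(sum(m) for m in marks) / (n * n_slots)
    return kappa_from_summary(pairwise_agreement(marks), p1)

# Toy data (invented): 3 participants, 4 slots.
toy = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
print(pairwise_agreement(toy))  # raw agreement on the toy data
print(fleiss_kappa(toy))        # chance-corrected agreement, much lower

# Rough check against the figures reported in the abstract: 455 marks on
# average out of 4692 slots, observed agreement rounded to 0.9.
approx = kappa_from_summary(0.9, 455 / 4692)
print(round(approx, 2))  # about 0.43, in line with the reported 0.45
```

Because only about one slot in ten carries a boundary mark, chance agreement alone is already above 0.8, so a raw agreement of 0.9 leaves limited room above chance. The rough check uses the rounded value 0.9, which is why it lands near, but not exactly on, the reported kappa.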
| Period | 27 Jul 2020 → 29 Jul 2020 |
|---|---|
| Event title | UK Cognitive Linguistics conference |
| Event type | Conference |
| Location | Birmingham, United Kingdom |
| Degree of recognition | International |
Related content
Projects
- Chunking in language: units of meaning and processing
  Project: Research project
Prizes
- Finnish Academy of Science and Letters grant
  Prize: Prizes and awards