Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › peer-review

Abstract

Estimating the frequency of linguistic variables is a fundamental task in the analysis of linguistic data. However, as it is often the case that the amount of material available from different people or text categories may vary, the simplest methods of calculating frequencies are not always appropriate. In this article, we discuss different approaches, including bootstrap methods and a Bayesian approach, and compare the results they yield with those given by some of the simple measures in common use, such as pooling and averaging. We also study the effect of sample size on the accuracy of the estimates.
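The contrast the abstract draws between pooling and averaging can be illustrated with a small sketch. It is not the authors' implementation: the per-speaker counts are invented for illustration, the function names `pooled`, `averaged` and `bootstrap_interval` are hypothetical, and the percentile bootstrap over speakers is only one of several possible bootstrap schemes. The Bayesian approach mentioned in the abstract is not shown.

```python
import random
from statistics import mean

# Hypothetical data (an assumption, not from the chapter):
# (occurrences of the variant, total eligible tokens) for each speaker.
counts = [(2, 4), (30, 100), (1, 10), (45, 300)]

def pooled(data):
    """Pooling: sum all hits and all tokens, then divide once."""
    return sum(k for k, _ in data) / sum(n for _, n in data)

def averaged(data):
    """Averaging: compute each speaker's proportion, then take the unweighted mean."""
    return mean(k / n for k, n in data)

def bootstrap_interval(data, stat, reps=2000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample speakers with replacement, recompute stat."""
    rng = random.Random(seed)
    estimates = sorted(
        stat([rng.choice(data) for _ in data]) for _ in range(reps)
    )
    lower = estimates[int(reps * alpha / 2)]
    upper = estimates[int(reps * (1 - alpha / 2)) - 1]
    return lower, upper

print("pooled:", pooled(counts))
print("averaged:", averaged(counts))
print("95% bootstrap interval (pooled):", bootstrap_interval(counts, pooled))
```

With these invented counts the pooled estimate is dominated by the speaker with 300 tokens, while the averaged estimate gives the speaker with only 4 tokens the same weight as everyone else, which is why the two figures diverge when sample sizes are unequal.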
Original language: English
Title of host publication: Research Methods in Language Variation and Change
Editors: Manfred Krug, Julia Schlüter
Number of pages: 24
Place of Publication: Cambridge
Publisher: Cambridge University Press
Publication date: 2013
Pages: 337-360
ISBN (Print): 9780521181860
Publication status: Published - 2013
MoE publication type: A3 Book chapter

Fields of Science

  • 113 Computer and information sciences
  • 6121 Languages
  • bootstrap
  • Bayesian analysis
  • corpus linguistics
  • linguistic variable

Cite this

Mannila, H., Nevalainen, T., & Raumolin-Brunberg, H. (2013). Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables. In M. Krug, & J. Schlüter (Eds.), Research Methods in Language Variation and Change (pp. 337-360). Cambridge: Cambridge University Press.
Mannila, Heikki ; Nevalainen, Terttu ; Raumolin-Brunberg, Helena. / Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables. Research Methods in Language Variation and Change. editor / Manfred Krug ; Julia Schlüter. Cambridge : Cambridge University Press, 2013. pp. 337-360
@inbook{c81d0f4d9280481b83a8a6cc929f3151,
title = "Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables",
abstract = "Estimating the frequency of linguistic variables is a fundamental task in the analysis of linguistic data. However, as it is often the case that the amount of material available from different people or text categories may vary, the simplest methods of calculating frequencies are not always appropriate. In this article, we discuss different approaches, including bootstrap methods and a Bayesian approach, and compare the results they yield with those given by some of the simple measures in common use, such as pooling and averaging. We also study the effect of sample size on the accuracy of the estimates.",
keywords = "113 Computer and information sciences, 6121 Languages, bootstrap, Bayesian analysis, corpus linguistics, linguistic variable",
author = "Heikki Mannila and Terttu Nevalainen and Helena Raumolin-Brunberg",
year = "2013",
language = "English",
isbn = "9780521181860",
pages = "337--360",
editor = "Manfred Krug and Julia Schl{\"u}ter",
booktitle = "Research Methods in Language Variation and Change",
publisher = "Cambridge University Press",
address = "United Kingdom",
}

Mannila, H, Nevalainen, T & Raumolin-Brunberg, H 2013, Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables. in M Krug & J Schlüter (eds), Research Methods in Language Variation and Change. Cambridge University Press, Cambridge, pp. 337-360.

Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables. / Mannila, Heikki; Nevalainen, Terttu; Raumolin-Brunberg, Helena.

Research Methods in Language Variation and Change. ed. / Manfred Krug; Julia Schlüter. Cambridge : Cambridge University Press, 2013. p. 337-360.

TY - CHAP
T1 - Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables
AU - Mannila, Heikki
AU - Nevalainen, Terttu
AU - Raumolin-Brunberg, Helena
PY - 2013
Y1 - 2013
N2 - Estimating the frequency of linguistic variables is a fundamental task in the analysis of linguistic data. However, as it is often the case that the amount of material available from different people or text categories may vary, the simplest methods of calculating frequencies are not always appropriate. In this article, we discuss different approaches, including bootstrap methods and a Bayesian approach, and compare the results they yield with those given by some of the simple measures in common use, such as pooling and averaging. We also study the effect of sample size on the accuracy of the estimates.
AB - Estimating the frequency of linguistic variables is a fundamental task in the analysis of linguistic data. However, as it is often the case that the amount of material available from different people or text categories may vary, the simplest methods of calculating frequencies are not always appropriate. In this article, we discuss different approaches, including bootstrap methods and a Bayesian approach, and compare the results they yield with those given by some of the simple measures in common use, such as pooling and averaging. We also study the effect of sample size on the accuracy of the estimates.
KW - 113 Computer and information sciences
KW - 6121 Languages
KW - bootstrap
KW - Bayesian analysis
KW - corpus linguistics
KW - linguistic variable
M3 - Chapter
SN - 9780521181860
SP - 337
EP - 360
BT - Research Methods in Language Variation and Change
A2 - Krug, Manfred
A2 - Schlüter, Julia
PB - Cambridge University Press
CY - Cambridge
ER -

Mannila H, Nevalainen T, Raumolin-Brunberg H. Quantifying variation and estimating the effects of sample size on the frequencies of linguistic variables. In Krug M, Schlüter J, editors, Research Methods in Language Variation and Change. Cambridge: Cambridge University Press. 2013. p. 337-360