The Longitudinal Corpus of Finnish Spoken in Helsinki (1970s, 1990s and 2010s)

  • Lappalainen, Hanna (Project manager)
  • Syrjänen, Suvi (Participant)
  • Marttila, Saila (Participant)
  • Surkka, Sanni (Participant)
  • Latvala, Pauliina (Participant)
  • Paunonen, Heikki (Principal Investigator)
  • Sorjonen, Jenna (Participant)
  • Mustanoja, Liisa (Participant)
  • O'Dell, Michael (Participant)

    Project: Research project

    Project Details

    Description (abstract)

    The corpus contains interviews with people of different ages born in Helsinki. The data were collected over three time periods: 1972–74, 1991–92 and 2013. The material consists of hour-long audio recordings of individual interviews.

    The project is a continuation of a study begun in the early 1970s by Terho Itkonen and Heikki Paunonen at the University of Helsinki. This survey was the first major sociolinguistic project in Finland. Altogether 149 native male and female Helsinkians, representing three different age groups (b. 1900–1907, 1927–32, 1952–57), two districts and four social groups, were interviewed. Later a sample of 96 speakers was chosen for linguistic analysis. The project was funded by the Academy of Finland.

    Twenty years later, in 1991–92, new data were collected. The aim of the follow-up project was to re-interview as many informants as possible from the groups that were young or middle-aged in the 1970s, and to interview a new group of young people. Twenty-nine informants were reached for a re-interview. In addition, a new group of 16 young people (b. 1971–75) was interviewed. Half of them were high school students and the other half studied in vocational schools, as in the 1970s data. This project was funded by the Academy of Finland and headed by Heikki Paunonen.

    The third round of data collection was carried out in 2013, as suggested by Heikki Paunonen. Twenty-seven of the informants from the 1990s were interviewed again. This was the third interview for 13 of them (9 b. 1952–57, 4 b. 1927–32); as many as 14 of the 16 informants who were adolescents in the 1990s (b. 1971–75) were successfully contacted and interviewed again. In addition, a new group of young people (b. 1994–97) was interviewed. They were chosen based on the same criteria as in the previous data collections. The old and middle-aged informants were chosen using population registers; the young interviewees were found mainly through schools with the aid of their teachers. Informants from previous interviews were found using multiple channels.

    In addition to collecting new data, the aim of this latest stage was to improve the availability of the whole corpus. The older data were digitized in their entirety, the most important parts were coded thematically, and the transcriptions were aligned with the audio files using Praat. The third stage of the project in 2013 was funded by the Kone Foundation and led by Hanna Lappalainen. The corpus was constructed in cooperation with Fin-Clarin and the Institute for the Languages of Finland.

    The longitudinal corpus of Finnish spoken in Helsinki consists of 239 interviews; over 200 of them have been at least partly transcribed. In most cases the transcription covers about half an hour of the interview; all the interviews of the 13 informants interviewed three times have been transcribed in their entirety. Work on the transcription, alignment and thematic coding of the corpus is planned to continue. The interviews were conducted and transcribed by research assistants who were students of Finnish. Pauliina Latvala planned and carried out the thematic coding.

    The corpus is available in Kielipankki, the Language Bank of Finland. Interested parties can apply for access rights through the Language Bank of Finland. Researchers need a personal account to access the corpus. To gain access, applicants must provide a research plan outlining the purpose for which the resource will be used. Access rights are limited due to personal data protection issues.

    Although not all of the interviews contain exactly the same questions, they deal with the same topics: issues related to school, work and spare time, as well as what it is like to live in Helsinki. In addition, the interviews contain questions about the interviewees’ perceptions of the languages and language forms spoken in Helsinki. The corpus can be used, and has been used, in various research areas and is not restricted to linguistics; it also offers possibilities for sociological, folkloristic and historical research.

    The databases from the 1970s and 1990s have been studied mainly by Heikki Paunonen, who has concentrated on phonological and morphological variation; in addition, the data have been used for several MA theses. The newest data, collected in 2013, have been utilized in BA and MA theses, and more research is in progress. The work completed so far shows Helsinki changing dynamically from the late 19th century onwards as a linguistic melting pot shaped by different ideological and social trends.
    Effective start/end date: 01/01/2013 → …

    Fields of Science

    • 6121 Languages
    • real-time study
    • sociolinguistics
    • variation analysis