Abstract
This paper presents the methodological, theoretical and practical aspects of the CoLaGe corpus (Corpus for the Study of Language and Gender in Spanish), an oral bi-dialectal corpus of Spanish collected in Valencia (Spain) and Guadalajara (Mexico). The corpus consists of three sub-corpora (one of them CoLaGe-GD), each containing three types of linguistic data: sociolinguistic interviews, role-plays and phonetic data elicited through picture description tasks. The linguistic data is complemented by a social psychological database published separately. Whilst the corpus was designed for a research project studying the inter-relations between speakers' gender, sexuality and language use in two societies that share the same language (but arguably differ in gender norms and roles), it can be used in many research areas, ranging from gender studies to discourse analysis. The structure of the data permits quantitative comparisons across dialects, age groups and genders.
| Original language | English |
|---|---|
| Journal | Corpora |
| Volume | 20 |
| Issue number | 2 |
| Pages (from-to) | 269-285 |
| Number of pages | 17 |
| ISSN | 1749-5032 |
| Publication status | Published - 9 Sept 2025 |
| MoE publication type | A1 Journal article-refereed |
Fields of Science
- 6121 Languages
- Spanish corpus
- Corpus design
- Gender
- Linguistic variation
- Orality