Modeling language evolution with codes that utilize context and phonetic features

Javad Nouri, Roman Yangarber

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


We present methods for investigating processes of evolution in a language family by modeling relationships among the observed languages.

The models aim to find regularities, that is, regular correspondences, in lexical data. We present an algorithm that codes the data using phonetic features of sounds and learns long-range contextual rules that condition recurrent sound correspondences between languages. This coding gives us a measure of model quality: better models find more regularity in the data. We also present a procedure for imputing unseen data, which provides a second method of model comparison. Our experiments demonstrate improvements in performance compared to prior work.
Original language: English
Title of host publication: The 20th SIGNLL Conference on Computational Natural Language Learning (CoNLL): Proceedings of the Conference
Number of pages: 10
Place of Publication: Stroudsburg, PA
Publisher: The Association for Computational Linguistics
Publication date: 2016
ISBN (Print): 978-1-945626-19-7
Publication status: Published - 2016
MoE publication type: A4 Article in conference proceedings
Event: Conference on Computational Natural Language Learning - Berlin, Germany
Duration: 11 Aug 2016 to 12 Aug 2016
Conference number: 20

Bibliographical note

CoNLL 2016

Fields of Science

  • 113 Computer and information sciences
