Complexity control in a mixture model by the Hardy-Weinberg equilibrium

Ella Bingham, Heikki Mannila

    Research output: Contribution to journal, Article, Scientific, peer-reviewed

    Abstract

    A method of complexity control in multinomial mixture modeling of multiple-marker genotype data, imposing the Hardy-Weinberg equilibrium (HWE) between the genotype values, is studied. This is a very natural restriction, and is known to hold at the population level under modest assumptions. The hypothesis under study is that imposing this restriction will prevent overfitting and lead to a better model. This is shown to indeed be the case. Experimental results on chromosomes 1 and 17 of the HapMap data demonstrate that the restricted model generalizes better to unseen data, and also finds clusters that correspond better to the ethnic groups of the HapMap, when compared with a model without the HWE restriction. (C) 2008 Elsevier B.V. All rights reserved.
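
As a rough illustration of why the HWE restriction acts as complexity control, the sketch below assumes biallelic markers with three genotype values per marker. The function names `hwe_genotype_probs` and `free_parameters` are hypothetical and are not taken from the article; the parameter count ignores the mixing weights and is not the authors' implementation.

```python
def hwe_genotype_probs(p):
    """Genotype probabilities (AA, Aa, aa) under Hardy-Weinberg equilibrium.

    p is the frequency of allele A at one biallelic marker (an assumption
    of this sketch); the frequency of allele a is then q = 1 - p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def free_parameters(n_markers, n_components, hwe):
    """Count genotype parameters per mixture, excluding mixing weights.

    Without the restriction, each marker in each component has a free
    multinomial over 3 genotypes (2 free parameters). Under HWE a single
    allele frequency determines all three probabilities (1 free parameter).
    """
    per_marker = 1 if hwe else 2
    return n_components * n_markers * per_marker


# Example: one marker with allele frequency 0.5, and a 3-component
# mixture over 10 markers with and without the restriction.
probs = hwe_genotype_probs(0.5)
restricted = free_parameters(10, 3, hwe=True)
unrestricted = free_parameters(10, 3, hwe=False)
```

Under these assumptions the restricted model has half as many genotype parameters as the unrestricted one, which is the sense in which the restriction limits model complexity.
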
    Original language: English
    Journal: Computational Statistics & Data Analysis
    Volume: 53
    Issue: 5
    Pages: 1711-1719
    Number of pages: 9
    ISSN: 0167-9473
    Status: Published - 2009
    Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal, peer-reviewed

    Fields of science

    • 113 Computer and information sciences
