A usable finite-state model for adequate grammatical complexity

    Project: Research project

    Project information

    Description (abstract)

    I hope to be the first to present a compact finite-state automaton (FSA) that assigns adequate, non-projective dependency analyses to natural language sentences. I have recently discovered FSAs that can be represented compactly, whereas the equivalent conventional FSAs would be too large to fit into a physical computer. Conventional automata work on a single input tape, but these new automata have additional hidden tapes that store the derivation of a dependency-syntactic analysis (Yli-Jyrä 2012). When these tapes are factored and separated, the immense size of the automaton's representation collapses. The result is a practical and highly efficient system that retains the excellent algebraic properties that characterise finite automata.

    The purpose of the proposed project is to turn this idea into a new methodology by demonstrating a finite-state dependency-syntactic grammar and a parser (analyser) for it that work linguistically adequately in practice. That is, the grammar covers the observed non-projective structures and generalises them to unseen data through language-specific knowledge and linguistically universal statistical learning.

    The new methodology will be excellent owing to its combinatorial characteristics and its faithfulness to the observed complexity of natural language. Unlike state-of-the-art methods, it does not trivialise the structural ambiguity of natural language; rather, it uses its compact representation to factor and store that ambiguity. The stored ambiguity could then be resolved with highly accurate constraint grammars or language models by exploiting the closure properties of finite-state languages. Nor does the methodology trivialise lexical features; it models full lexical frames instead. It also seems possible that the empirical studies will motivate a rather low characterisation of the parser's descriptive complexity, which would advance Occamist elegance in linguistic methodology. Moreover, the existence of a fast, structurally accurate and language-independent finite-state parser would be a significant contribution to the debate on the adequacy of finite-state grammars and the practical relevance of the generative, idealistic view of language.
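
    The closure property mentioned above can be illustrated with a minimal sketch: if the stored readings and a constraint are both finite-state, their intersection is again finite-state and keeps only the readings that satisfy the constraint. The code below is not the project's grammar or parser. The DFA class, the intersect function, and the toy tag alphabet N/V are hypothetical names chosen for illustration, and the construction shown is the standard product construction on ordinary single-tape automata.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """A deterministic automaton with a partial transition function."""
    start: object
    accepting: frozenset
    delta: dict  # (state, symbol) -> state; a missing entry means "reject"

    def accepts(self, symbols):
        state = self.start
        for sym in symbols:
            key = (state, sym)
            if key not in self.delta:
                return False
            state = self.delta[key]
        return state in self.accepting


def intersect(a, b):
    """Product construction: accepts exactly the strings accepted by both a and b."""
    start = (a.start, b.start)
    delta, accepting = {}, set()
    seen, stack = {start}, [start]
    while stack:
        p, q = stack.pop()
        if p in a.accepting and q in b.accepting:
            accepting.add((p, q))
        for (src, sym), p_next in a.delta.items():
            # Follow a symbol only if both automata can read it from (p, q).
            if src != p or (q, sym) not in b.delta:
                continue
            target = (p_next, b.delta[(q, sym)])
            delta[((p, q), sym)] = target
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return DFA(start, frozenset(accepting), delta)


# Hypothetical "stored ambiguity": every non-empty tag sequence over {N, V}.
readings = DFA(0, frozenset({1}),
               {(0, "N"): 1, (0, "V"): 1, (1, "N"): 1, (1, "V"): 1})

# Hypothetical constraint: no two consecutive V tags.
constraint = DFA("ok", frozenset({"ok", "v"}),
                 {("ok", "N"): "ok", ("ok", "V"): "v", ("v", "N"): "ok"})

# Disambiguation as intersection: only readings that obey the constraint survive.
both = intersect(readings, constraint)
```

    Because the result is itself an automaton, further constraints could be applied in the same way, one intersection at a time.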

    General description

    The project will develop methodological ideas needed to realise the notion of finite-state syntax in practice.
    Acronym: ADEQSYNTAX
    Status: Finished
    Effective start/end date: 01/09/2013 to 30/04/2019

    Funding

    • Suomen tietokirjailijat: 3 000,00 €

    Fields of science

    • 113 Computer and information sciences
    • 6121 Languages
    • 111 Mathematics