Efficient Parsing with Finite-State Constraint Satisfaction



Finite-state automata are efficient and, therefore, widely employed in human language technology. Koskenniemi's Finite-State Intersection Grammar (FSIG) is a language-independent method for shallow parsing. Surprisingly, the state complexity of FSIG is so large that even linear parsing complexity is not sufficiently efficient for parsing.

By allowing the grammar presentation to grow with the sentence length as if a new grammar were chosen for each input sentence, we define an intractable problem called Finite-State Constraint Satisfaction Problem (FSCSP). FSCSP can, however, be reduced into a Constraint Satisfaction Problem (CSP), whose restrictions are tractable under certain known conditions.

The hypothesis is that such efficient CSP fragments coincide with grammars that build explicit trees, such as context-free grammar and the tree-adjoining grammar. This will imply that the lately found model-theoretic encoding of many grammar formalisms can be expressed within the FSCSP framework. In my work, this theoretical possibility is applied and a new finite-state parser is developed.
Gällande start-/slutdatum01/01/200213/09/2005


  • 612 Språk och litteratur
  • 113 Data- och informationsvetenskap