Word Associations as a Language Model for Generative and Creative Tasks
Hannu Toivonen, University of Helsinki, Finland

Tony Veale, University College Dublin, Ireland
Krista Lagus, University of Helsinki, Finland

Timo Honkela, University of Helsinki, Finland

To analyse natural language and gain a better understanding of documents, a common approach is to build a language model, that is,
a structured representation of language that can then be used
for further analysis or generation. This thesis focuses on a fairly simple
language model based on associations between words that appear together
in the same sentence. We revisit the classic idea of analysing word co-occurrences
statistically and propose a simple parameter-free method for
extracting common word associations, i.e. associations between words that
are often used in the same context (e.g., Batman and Robin). Additionally, we propose a method for extracting associations that are specific to a
document or a set of documents. The idea behind the method is to take
the common word associations into account and highlight those associations
whose words co-occur in the document unexpectedly often. ...
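
As a rough illustration of sentence-level co-occurrence analysis, the following minimal sketch counts how often word pairs share a sentence and scores each pair against what independence would predict. The abstract does not name the statistic used, so the log-likelihood ratio (Dunning's G^2) is an assumption chosen here only because it needs no tuning parameters. The function names, the whitespace tokenisation, and the toy sentences are likewise hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count the sentences containing each word and each unordered word pair."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        # Assumption: naive lowercase whitespace tokenisation; each word
        # is counted at most once per sentence.
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts


def _xlogx(x):
    return x * math.log(x) if x > 0 else 0.0


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of sentence counts."""
    n = k11 + k12 + k21 + k22
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    return max(0.0, 2.0 * (cells + _xlogx(n) - rows - cols))


def score_associations(sentences):
    """Score word pairs that co-occur more often than independence predicts."""
    n = len(sentences)
    word_counts, pair_counts = cooccurrence_counts(sentences)
    scores = {}
    for (a, b), both in pair_counts.items():
        # Keep only positive associations: observed count above expected count.
        if both * n <= word_counts[a] * word_counts[b]:
            continue
        only_a = word_counts[a] - both
        only_b = word_counts[b] - both
        neither = n - both - only_a - only_b
        scores[(a, b)] = log_likelihood_ratio(both, only_a, only_b, neither)
    return scores


if __name__ == "__main__":
    toy_sentences = [
        "batman and robin fight crime",
        "batman and robin return",
        "robin and batman win",
        "the weather is nice",
        "the cat sat",
        "the dog ran",
    ]
    ranked = sorted(score_associations(toy_sentences).items(),
                    key=lambda item: item[1], reverse=True)
    for pair, score in ranked[:5]:
        print(pair, round(score, 3))
```

In this sketch a pair is kept only when it co-occurs more often than expected, since the statistic by itself is also large for pairs that avoid each other. Extracting document-specific associations would additionally require comparing a document's counts against such background counts, which is not shown here.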

