Significance of Patterns in Data Visualisations

Research output: Chapter in book/report/conference proceeding; Conference contribution; Scientific; Peer-reviewed

Abstract

In this paper we consider the following important problem: when we explore data visually and observe patterns, how can we determine their statistical significance? Patterns observed in exploratory analysis are traditionally met with scepticism, since the hypotheses are formulated while viewing the data rather than beforehand. Contrary to this belief, we show that it is in fact possible to evaluate the significance of patterns during exploratory analysis as well, and that the analyst's knowledge can be leveraged to improve statistical power by reducing the number of simultaneous comparisons. We develop a principled framework for determining the statistical significance of visually observed patterns. Furthermore, we show how the significance of visual patterns observed during iterative data exploration can be determined. We perform an empirical investigation on real and synthetic tabular data and time series, using different test statistics and methods for generating surrogate data. We conclude that the proposed framework allows the significance of visual patterns to be determined during exploratory analysis.
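
The abstract mentions test statistics and surrogate data without giving details. As a rough illustration of that general idea only, and not the framework proposed in the paper, the following sketch computes an empirical p-value by comparing a test statistic on the observed data with its values on surrogate datasets. All names (`empirical_p_value`, `abs_correlation`, `permute_columns`), the choice of absolute Pearson correlation as the statistic, and the column-wise permutation scheme are assumptions made for this example.

```python
import numpy as np


def empirical_p_value(data, statistic, make_surrogate, n_surrogates=999, seed=0):
    """Empirical p-value of `statistic(data)` against surrogate datasets.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    observed = statistic(data)
    surrogate_stats = np.array(
        [statistic(make_surrogate(data, rng)) for _ in range(n_surrogates)]
    )
    # Add-one correction: the observed data counts as one sample,
    # so the p-value is never exactly zero.
    return (1 + np.sum(surrogate_stats >= observed)) / (1 + n_surrogates)


def abs_correlation(data):
    """Assumed test statistic: absolute Pearson correlation of columns 0 and 1."""
    return abs(np.corrcoef(data[:, 0], data[:, 1])[0, 1])


def permute_columns(data, rng):
    """Assumed surrogate generator: permute each column independently.

    This keeps every column's marginal distribution and breaks the
    dependence between columns.
    """
    return np.column_stack(
        [rng.permutation(data[:, j]) for j in range(data.shape[1])]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=200)
    data = np.column_stack([x, x + 0.5 * rng.normal(size=200)])
    print(empirical_p_value(data, abs_correlation, permute_columns))
```

The sketch tests a single pattern. When several patterns are tested at once, the resulting p-values would additionally need a multiple-comparison adjustment, which is omitted here.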

Original language: English
Title of host publication: KDD'19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Number of pages: 9
Place of publication: New York, NY
Publisher: ACM
Publication date: 2019
Pages: 1509-1517
ISBN (print): 978-1-4503-6201-6
DOI
Status: Published - 2019
MoE publication type: A4 Article in conference proceedings
Event: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Anchorage, United States (USA)
Duration: 4 Aug 2019 to 8 Aug 2019
Conference number: 25

Fields of science

  • 113 Computer and information sciences
