Identification of Programmers from Typing Patterns

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


Being able to identify the user of a computer solely based on their typing patterns can lead to improvements in plagiarism detection, provide new opportunities for authentication, and enable novel guidance methods in tutoring systems. However, at the same time, if such identification is possible, new privacy and ethical concerns arise. In our work, we explore methods for identifying individuals from typing data captured by a programming environment as these individuals are learning to program. We compare the identification accuracy of automatically generated user profiles, ranging from the average amount of time that a user needs between keystrokes to the amount of time that it takes for the user to press specific pairs of keys (digraphs). We also explore the effect of data quantity and different acceptance thresholds on the identification accuracy, and analyze how the accuracy changes when identifying individuals across courses. Our results show that, while the identification accuracy varies depending on data quantity and the method, identification of users based on their programming data is possible. These results indicate that there is potential in using this method, for example, in identification of students taking exams, and that such data raises privacy concerns that should be addressed.
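The profile types named in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation: the event format `(key, timestamp_ms)`, the function names, the mean-absolute-difference distance, and the threshold rule are all assumptions chosen for illustration. It builds a digraph profile (mean latency per key pair) and accepts the closest known user only if the distance falls under an acceptance threshold.

```python
from collections import defaultdict
from statistics import mean


def digraph_profile(events):
    """Build a profile from (key, timestamp_ms) events in typing order.

    Returns {(first_key, second_key): mean latency in ms}.
    The event format is a hypothetical one chosen for this sketch.
    """
    latencies = defaultdict(list)
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        latencies[(k1, k2)].append(t2 - t1)
    return {pair: mean(values) for pair, values in latencies.items()}


def profile_distance(p, q):
    """Mean absolute latency difference over digraphs present in both profiles.

    Returns infinity when the profiles share no digraph.
    This distance measure is an assumption, not the paper's method.
    """
    shared = p.keys() & q.keys()
    if not shared:
        return float("inf")
    return mean(abs(p[d] - q[d]) for d in shared)


def identify(sample, known_profiles, threshold_ms):
    """Return the closest known user, or None if no one is close enough."""
    best_user, best_dist = None, float("inf")
    for user, profile in known_profiles.items():
        dist = profile_distance(sample, profile)
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user if best_dist <= threshold_ms else None


# Toy data: two enrolled users and one unlabeled typing sample.
known = {
    "alice": digraph_profile([("i", 0), ("f", 100), (" ", 220)]),
    "bob": digraph_profile([("i", 0), ("f", 200), (" ", 450)]),
}
sample = digraph_profile([("i", 0), ("f", 110), (" ", 225)])
match = identify(sample, known, threshold_ms=20)
```

In this toy setup a looser threshold accepts more samples but risks false matches, while a stricter one rejects more genuine users, which is the trade-off the abstract refers to when it mentions acceptance thresholds.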
Original language: English
Title of host publication: Proceedings of the 15th Koli Calling Conference on Computing Education Research
Number of pages: 8
Place of Publication: New York
Publication date: 19 Nov 2015
ISBN (Electronic): 978-1-4503-4020-5
Publication status: Published - 19 Nov 2015
MoE publication type: A4 Article in conference proceedings
Event: Koli Calling International Conference on Computing Education Research - Lieksa, Finland
Duration: 19 Nov 2015 to 22 Nov 2015
Conference number: 15

Publication series

Name: Koli Calling '15

Fields of Science

  • biometric feedback, educational data mining, keystroke analysis, programming data, source code snapshots, student identification
  • 113 Computer and information sciences
