Abstract
For decades, New Testament textual critics have calculated similarities between manuscripts in essentially the same way, using collations and variation units. This conventional methodology requires enormous amounts of time and manual work. This article proposes a new method that does not require these preprocessing steps, making it possible to establish quantitative relationships from manuscript transcriptions alone. This is achieved through a technique called shingling, in which the transcriptions are computationally split into smaller pieces called tokens or k-grams. A string metric is then used to calculate the similarities between the tokenized strings. The method is efficient, which allows critics to consider all textual evidence in each manuscript tradition. At the same time, it returns similarity values that are compatible with those of conventional approaches.
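The two steps named in the abstract, shingling followed by a string metric, can be illustrated with a minimal sketch. The abstract does not say which string metric, which value of k, or which normalization the article uses. The Jaccard coefficient, character-level shingles, k = 3, and the lowercasing and whitespace collapsing below are therefore assumptions chosen only for illustration, and the function names `shingle`, `jaccard`, and `similarity` are hypothetical. The sample strings are toy input, not actual manuscript transcriptions.

```python
def shingle(text: str, k: int = 3) -> set[str]:
    """Split a transcription into its set of overlapping character k-grams.

    Normalization (lowercasing, collapsing whitespace) is an assumption;
    the abstract does not describe how transcriptions are prepared.
    """
    normalized = " ".join(text.lower().split())
    if len(normalized) < k:
        # Text shorter than k: treat the whole string as a single shingle.
        return {normalized} if normalized else set()
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| (an assumed choice of metric)."""
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def similarity(text_a: str, text_b: str, k: int = 3) -> float:
    """Similarity of two transcriptions without collation or variation units."""
    return jaccard(shingle(text_a, k), shingle(text_b, k))


if __name__ == "__main__":
    # Toy strings for illustration only.
    witness_a = "en arche en ho logos"
    witness_b = "en arche en logos"
    print(f"similarity = {similarity(witness_a, witness_b):.3f}")
```

Because each transcription is reduced to a set of k-grams, no alignment step is needed before comparison, which is the property the abstract highlights. A different metric or a word-level tokenization would slot into the same structure.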
Original language | English |
---|---|
Journal | Digital Scholarship in the Humanities |
Volume | 38 |
Issue number | 1 |
Pages (from-to) | 151-166 |
Number of pages | 16 |
ISSN | 2055-7671 |
Publication status | Published - 3 Apr 2023 |
MoE publication type | A1 Journal article-refereed |
Fields of Science
- 614 Theology
- 113 Computer and information sciences