Handwritten Text Recognition as a digital perspective of Archival Science


Salvatore Spina, Università degli Studi di Catania

Keywords:

Transkribus, Biscari Archive, Italian Administrative model, FileMaker, Digitization


The digital divide in the Humanities is the subject of a heated debate between traditionalists (analogue scholars) and digital humanists. Although some progress has been made towards a dialogue between the two sides, friction persists. Technology, including artificial intelligence tools and algorithms, is not a threat to humanities research but a means of solving its problems. Society has changed dramatically, and new generations require new communicative products and systems. The debate grows increasingly wearying, yet the “analogues” disregard the change in the patterns through which reality is interpreted and prefer to invoke the “death of traditional Humanities statutes”. Such closed-mindedness towards Information and Communication Technology (ICT) tools, such as Handwritten Text Recognition (HTR), makes no sense. This study aims to demonstrate the potential of automatic transcription as a helpful tool for archival practice and research.
