The humanities meet computer science to create new synergies using computer vision and natural language processing.
Aim & Scope
Historians are increasingly using technologies to evaluate digitised texts in machine-readable form, as well as techniques from the field of natural language processing (NLP) to analyse the content and context of language in written artefacts. These techniques can be used to analyse large corpora and identify patterns. However, such methods are often trained on contemporary rather than historical data. Their use can lead to biases in the historical record, incurring the risk of false inferences about history. The methods used should therefore be thoroughly investigated to account for any such biases. This Data Linking (DL) workshop presents the challenges of applying computer vision and NLP techniques in the humanities, together with first solutions to them.
This entry includes the following presentations from the first Data Linking Workshop 2023: Computer Vision and Natural Language Processing – Challenges in the Humanities
Pepper, Welcome
Eva Wilden, Charles Li: Tamilex -- Digital Lexicography
Stefan Baums, Stephen White: Computer Vision and Kharoṣṭhī Paleography
Oskar von Hinüber, Haiyan Hu-von Hinüber, Sylvia Melzer: What the Buddhological Epigraphy can expect from the AI: The Information System "Buddhist Bronzes Inscriptions"
Kathrin Holz: The Proto-Śāradā Project: Towards the edition of a new collection of administrative letters and documents from pre-modern South Asia
Ines Konczak-Nagel, Erik Radisch: The Kucha Mural Information System: Taxonomy and Semi-Automated Image Recognition
Ralf Möller: Aligned AI and the role of the humanities: Training AI systems using human feedback
Isabelle Marthot-Santaniello: The application of NLP in combination with Computer Vision for analysing ancient Greek handwritings on papyri
Olga Serbaeva: Some features of the 17th century Newārī script: READ-based statistical approach to palaeography
Lena Hinrichsen: OCR technologies in research practice
Oliver Hellwig: Web-based information systems for Indian scripts and texts
Simon Schiff, Ralf Möller: Persistent Data, Sustainable Information
Hamid Reza Hakimi, Lisa Mischer, Tariq Yousef, Maxim Romanov: Finding and Linking Information in Arabic Historical Texts
Sylvia Melzer: Building Information Systems on Demand with ChatGPT?
Martin Braun, Hannes Fellner, Bernhard Koller: A Digital Paleography of Tarim Brahmi
Hussein Mohammed: Computer vision beyond OCR: potentials and challenges for the study of written artefacts
This upload includes those submitted presentations for which permission to publish has been granted.
The KI2021 workshop – Humanities-Centred AI was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796.