Guest lecture 19 March: Alessandro Papini, ‘When the future meets the past: AI-assisted normalisation of historical Late Latin (ChLA) texts with AI_Stripertus’

“The Classics Research Seminar will host a special guest lecture by Dr. Alessandro Papini (Ca’ Foscari University of Venice) titled “When the future meets the past: AI-assisted normalisation of historical Late Latin (ChLA) texts with AI_Stripertus” on Thursday, March 19, from 4 to 6 pm. The lecture room is Metsätalo 4 (B214). All welcome.

Abstract

When the future meets the past: AI-assisted normalisation of historical Late Latin (ChLA) texts with AI_Stripertus

Following the publication of Attention is All You Need (Vaswani et al. 2017) and the establishment of the Transformer architecture, there has been a rapid increase in the adoption of generative AI and large language models (LLMs) in scientific research (Haber et al. 2025), including the study of Latin and Greek heritage (Assael et al. 2022; 2025). However, thus far, AI’s capabilities have primarily been used to help scholars date and establish the provenance of ancient texts (mostly inscriptions) and fill gaps in historical documents, as part of large-scale projects such as Ithaca and Aeneas (Tupman 2025). This paper aims to demonstrate how Transformers can also fulfil complex natural language processing (NLP) tasks for closed-corpus languages such as Latin. To this end, the paper will discuss AI_Stripertus, an NLP tool leveraging agentic AI to achieve text normalisation/style transfer (i.e., “translation” from “Vulgar” to “Classical” Latin; cf. Herman 2000 and Clackson 2011) in the parchment and papyrus documents of the Chartae Latinae Antiquiores (Bruckner and Marichal 1954-1985). Specifically, the paper will illustrate how the flexibility of Transformers makes it possible to achieve this objective satisfactorily with limited “training” and evaluation data, in contrast to more traditional NLP frameworks. Furthermore, the paper will discuss the contemporary linguistic/AI engineering techniques necessary for text normalisation in Latin, including the definition of a “gold standard” for evaluation (benchmark), the creation of a knowledge base for the model (Retrieval Augmented Generation), and the implementation of effective prompting methods (structured prompt with few shots). Finally, the paper will compare the performance of different models on the same normalisation task (reasoning vs. non-reasoning) and discuss the future developments and applications of AI_Stripertus in the context of the ERC project Digital Latin Dialectology (DiLaDi) at Ca’ Foscari University of Venice.

Bibliography

Assael, Y. et al. (2022). Restoring and attributing ancient texts using deep neural networks. Nature, 603, 280-299.

Assael, Y. et al. (2025). Contextualizing ancient texts with generative neural networks. Nature, 645, 141-164.

Bruckner, A. and Marichal, R. (1954-1985). Chartae Latinae Antiquiores. Facsimile-edition of the Latin charters prior to the Ninth Century. Olten.

Clackson, J. (2011). Classical Latin. In Clackson, J. (ed.), A companion to the Latin language. Chichester, West Sussex and Malden, MA, pp. 236-256.

Haber, E. et al. (2025). Generative AI for Research. In Haber, E. et al. (eds.), Using AI in Academic Writing and Research. Cham, pp. 27-38.

Herman, J. (2000). Vulgar Latin. University Park, Pa.

Tupman, C. (2025). AI for Latin inscriptions supplies missing text and predicts date and location. Nature, 645, 43-44.

Vaswani, A. et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.”

Source: Timo Korkiakangas and Marja Vierros
