Enriching Historical Records: An OCR and AI-Driven Approach for Database Integration
By: Zahra Abedi, Richard M.K. van Dijk, Gijs Wijnholds, Tessa Verhoef
Published: 2026-01-01
Category: cs.AI
Abstract
This paper introduces a pipeline that combines Optical Character Recognition (OCR) with AI to digitize historical documents and integrate them into databases. It addresses challenges such as layout variability and differences in terminology, making large volumes of historical data more accessible and usable for research and the digital humanities.
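
To make the terminology and integration steps concrete, here is a minimal Python sketch of how text that has already been through OCR might be normalized and loaded into a database. It is not the authors' method: the abstract does not describe the implementation, so the `name; occupation` line layout, the vocabulary, the `person` table, and the sample lines are all invented for illustration. Standard-library fuzzy matching (`difflib`) stands in for whatever AI component the paper uses.

```python
import difflib
import sqlite3

# Hypothetical controlled vocabulary; a real target database defines its own.
CANONICAL_TERMS = ["merchant", "carpenter", "farmer"]


def normalize_term(raw, vocabulary=CANONICAL_TERMS, cutoff=0.75):
    """Return the closest canonical term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        raw.strip().lower(), vocabulary, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


def parse_line(line):
    """Split a 'name; occupation' line into fields (assumed layout)."""
    name, _, occupation = line.partition(";")
    return name.strip(), occupation.strip()


def integrate(lines, conn):
    """Insert parsed lines, keeping the raw OCR term next to the normalized one."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person "
        "(name TEXT, occupation_raw TEXT, occupation TEXT)"
    )
    rows = []
    for line in lines:
        name, raw = parse_line(line)
        rows.append((name, raw, normalize_term(raw)))
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Invented sample lines standing in for OCR output, including one OCR-style
# misspelling and one term with no counterpart in the vocabulary.
SAMPLE_OCR_LINES = [
    "Jan Jansen; merchnt",
    "Piet de Vries; Carpenter",
    "Klaas Bakker; xyz",
]

conn = sqlite3.connect(":memory:")
inserted = integrate(SAMPLE_OCR_LINES, conn)
```

Storing the raw OCR string alongside the normalized value keeps the original reading available for later review, and terms that match nothing are stored as NULL rather than being forced onto the nearest vocabulary entry.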