RASM 2018
Recognition of Historical Arabic Scientific Manuscripts Competition
“RASM 2018 was a competition launched by the British Library in collaboration with PRImA Research Lab and the Alan Turing Institute.
The Library has an extensive collection of Arabic manuscripts, comprising of almost 15,000 works. Several hundred manuscripts have been digitized as part of the British Library/Qatar Foundation Partnership, making them available on Qatar Digital Library.”
The competition was held to find an optimal solution for accurately and automatically transcribing the vast and growing digital archive of historical Arabic scientific handwritten manuscripts in the Qatar Digital Library. The aim is to make this rich content more accessible by enabling full-text search and discovery as well as large-scale text analysis.
The competition was divided into three different challenges: page segmentation, text line detection and Optical Character Recognition (OCR).
RDI WAS ANNOUNCED WINNER OF THIS COMPETITION (1st PLACE)
RDI’s submitted methods were compared against established systems used in industry and academia: Tesseract 4.0, ABBYY FineReader Engine 12 (FRE12), and Google Cloud Vision API.
We believe that we achieved impressive scores, given the very challenging nature of the documents.
We reached an accuracy of 81.60% in “Challenge 2 – Text Line Segmentation”, more than 13% ahead of second place. We also achieved an accuracy of 85.44% in “Challenge 3 – Text Recognition”, more than 20% ahead of second place.
Challenge 2 – Text Line Segmentation
Challenge 3 – Text Recognition