The Engineering Company for the Development of Digital Systems

RASM 2019

Competition on Recognition of Historical Arabic Scientific Manuscripts

“RASM 2019 was a competition launched by the British Library in collaboration with PRImA Research Lab and the Alan Turing Institute. This competition was held in the context of the 15th International Conference on Document Analysis and Recognition (ICDAR 2019).

The Library has an extensive collection of Arabic manuscripts, comprising of almost 15,000 works. Several hundred manuscripts have been digitized as part of the British Library/Qatar Foundation Partnership, making them available on Qatar Digital Library.”

The competition was held in the hope of finding an optimal solution for accurately and automatically transcribing the vast and growing digital archive of historical Arabic scientific handwritten manuscripts within the Qatar Digital Library. The aim was to make this rich content more accessible by enabling full-text search and discovery, as well as large-scale text analysis.

The competition was divided into three challenges: page segmentation, text line detection, and Optical Character Recognition (OCR).

RDI WAS ANNOUNCED WINNER OF THIS COMPETITION (1st PLACE)

RDI competed against established systems used in industry and academia: Tesseract 4.0, ABBYY FineReader Engine 12 (FRE12), and Google Cloud Vision API.

Given the very challenging nature of the documents, reaching an accuracy of 77.60% in “Challenge 2 – Text Line Segmentation”, more than 24% ahead of second place, was a big win for us.

We also achieved an accuracy of 77.58% in “Challenge 3 – Text Recognition”, more than 14% ahead of second place.

Challenge 2 – Text Line Segmentation

Challenge 3 – OCR Accuracy