(MGB-5) 2019
Multi-Genre Broadcast Challenge
The Multi-Genre Broadcast Challenge (MGB-5) was a challenge for Arabic speech recognition and dialect identification.
MGB-5 focused on Moroccan Arabic speech recognition and on identifying different Arabic dialects in YouTube videos from 17 Arabic-speaking countries.
RDI WAS ANNOUNCED WINNER OF THIS COMPETITION (1st PLACE)
The challenge was divided into two tasks. The first task was to recognize and transcribe 13 hours of YouTube videos with Moroccan speech content.
The videos belonged to a range of different genres such as comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx).
The second task was to identify the Arabic dialect spoken in various YouTube videos from 17 Arabic-speaking countries.
The dialect identification data was divided into three sub-categories based on the segment duration: short (under 5 s), medium (5–20 s), and long (>20 s).
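The duration split above can be sketched in a few lines of Python. This is only an illustration, not the challenge's own tooling: the function name `duration_bucket` and the sample durations are hypothetical, and treating exactly 5 s and exactly 20 s as "medium" is an assumption read off the ranges as stated.

```python
def duration_bucket(seconds: float) -> str:
    """Map a segment duration in seconds to 'short', 'medium', or 'long'."""
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    if seconds < 5:        # short: under 5 s
        return "short"
    if seconds <= 20:      # medium: 5-20 s (both boundaries assumed inclusive)
        return "medium"
    return "long"          # long: over 20 s


# Hypothetical segment durations, in seconds.
segments = [2.4, 5.0, 12.7, 20.0, 31.5]
for seconds in segments:
    print(f"{seconds:5.1f} s -> {duration_bucket(seconds)}")
```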
Our submission achieved the lowest error rates in the speech-to-text task, marking RDI-CU as the winner of the first task in the challenge.
It is also worth noting that RDI ranked third in the MGB-2 Arabic speech transcription challenge, exceeding the score of the MIT system and tying with Johns Hopkins University (JHU).