Mokgadi Setshekgamollo, Mohohlo Samuel Tšoeu, Robyn Verrinder
Deaf and hard of hearing (DHH) people are estimated to exceed 5% of the population of South Africa and of the world. Moreover, the majority of hearing people are not proficient in sign language. This creates a communication divide between the DHH and hearing communities, to the disadvantage of the DHH. Although some provisions are made to cater to the needs of DHH people, in the form of sign language education and interpreters, such interpreters are severely scarce and often costly. In this research, we developed the first vision-based neural sign language translation (NSLT) model for South African Sign Language (SASL) as an initial step towards bridging this communication barrier. To this end, we recorded a parallel SASL and English corpus with the help of six sign language interpreters. The dataset comprises 5,047 sentences in the domain of government and politics. We conducted comprehensive experiments with several visual feature extraction architectures as well as translation architectures. The best results were obtained using a 3D-CNN with recurrent models (0.51 BLEU-4), followed by transformer models (0.43 BLEU-4). These scores are lower than those obtained on the RWTH-PHOENIX-Weather 2014T dataset (11.73 BLEU-4) and comparable to those on How2Sign (1.73 BLEU-4). We also investigated the impact of pretraining the model on a dataset with gloss annotations and then fine-tuning on the SASL dataset, which increased the BLEU-4 score to 1.3. Our results on text-based translation (8.52 BLEU-4) suggest a need for better feature extraction methods for sign language translation.
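For readers unfamiliar with the evaluation metric quoted above, the following is a minimal, self-contained sketch of sentence-level BLEU-4 in pure Python. The example sentences and the add-one smoothing choice are illustrative assumptions, not the paper's evaluation pipeline (published results typically use a standard toolkit such as sacreBLEU).

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 with add-one smoothing on each n-gram precision."""
    precisions = []
    for n in range(1, 5):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it appears in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        # Add-one smoothing keeps the geometric mean finite when a
        # higher-order n-gram has no match.
        precisions.append((clipped + 1) / (total + 1))
    # Brevity penalty punishes hypotheses shorter than the reference.
    c, r = len(hypothesis), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / max(c, 1))
    # BLEU-4 uses uniform weights over the 1- to 4-gram precisions.
    return bp * math.exp(sum(0.25 * math.log(p) for p in precisions))

ref = "the minister addressed the national assembly today".split()
hyp = "the minister spoke to the national assembly today".split()
print(f"BLEU-4: {bleu4(ref, hyp):.3f}")
```

Note that BLEU-4 scores in papers are usually reported scaled by 100, so the 0.51 BLEU-4 reported above corresponds to a raw score of about 0.0051, which indicates how difficult video-to-text sign language translation currently is.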