Department of Computer Science, Bahir Dar Institute of Technology, Bahir Dar University, Bahir Dar, Ethiopia
Research Article
Large Scale Speech Recognition for Low Resource Language Amharic, an End-to-End Approach
Author(s): Yohannes Ayana Ejigu* and Tesfa Tegegne Asfaw
Automatic Speech Recognition (ASR) is a technology that converts spoken language into text using software.
Conventional ASR methods rely on several distinct components: an acoustic model, a language model, and a
pronunciation model with its dictionary. Building and tuning these components separately can be time-consuming and
may affect overall performance. In this study, we propose a method that streamlines the speech recognition process by
incorporating a unified Recurrent Neural Network (RNN) architecture. Our architecture integrates a Convolutional
Neural Network (CNN) with an RNN and employs a Connectionist Temporal Classification (CTC) loss function.
Key experiments were carried out on a dataset of 576,656 valid sentences, using erosion techniques.
Evaluation of the model performance, measured by the Word Error Rate (WER) metric, demonstrated re..
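
To make two terms from the abstract concrete, the following is a minimal Python sketch of (a) the standard collapsing rule used to turn frame-level CTC outputs into a label sequence and (b) the usual word-level edit-distance definition of WER. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of index 0 for the CTC blank, and the greedy (best-path) decoding strategy are all hypothetical, and the sample sentences are English placeholders rather than data from the study.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_collapse(frame_ids: Sequence[int], blank: int = BLANK) -> List[int]:
    """Apply the CTC collapsing rule: merge consecutive repeats, then drop blanks.

    `frame_ids` is assumed to hold the most likely label index per audio frame.
    """
    return [label for label, _ in groupby(frame_ids) if label != blank]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # A blank between two identical labels keeps them as separate outputs.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("the cat sat down", "the cat sit"))
```

Because WER is normalised by the reference length, it can exceed 1.0 when the hypothesis contains many inserted words; it is reported as a percentage by multiplying by 100.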