An Attention Long Short-Term Memory based system for automatic classification of speech intelligibility

Author:
Miguel Fernández-Díaz, Ascensión Gallardo-Antolín
Keyword:
Electrical Engineering and Systems Science, Audio and Speech Processing (eess.AS), Machine Learning (cs.LG)
journal:
Miguel Fernández-Díaz and Ascensión Gallardo-Antolín, Engineering Applications of Artificial Intelligence 96 (2020) 103976
date:
2024-02-05 00:00:00
Abstract
Speech intelligibility can be degraded by multiple factors, such as noisy environments, technical difficulties or biological conditions. This work focuses on the development of an automatic, non-intrusive system for predicting the speech intelligibility level in the last of these cases. The main contribution of our research on this topic is the use of Long Short-Term Memory (LSTM) networks with log-mel spectrograms as input features for this purpose. In addition, this LSTM-based system is further enhanced by incorporating a simple attention mechanism that determines which frames are most relevant to the task. The proposed models are evaluated on the UA-Speech database, which contains dysarthric speech with different degrees of severity. Results show that the attention LSTM architecture outperforms both a reference Support Vector Machine (SVM)-based system with hand-crafted features and an LSTM-based system with mean pooling.
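The difference between mean pooling and attention pooling over per-frame LSTM outputs can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the attention mechanism, so the sketch assumes a common simple variant in which a single vector scores each frame and a softmax turns the scores into weights. The array sizes, the random stand-in for LSTM outputs, and the names `mean_pooling`, `attention_pooling`, `H` and `w` are all hypothetical.

```python
import numpy as np


def mean_pooling(hidden_states):
    """Summarise an utterance by averaging all frames with equal weight."""
    return hidden_states.mean(axis=0)


def attention_pooling(hidden_states, w):
    """Summarise an utterance as a weighted average of its frames.

    Assumed scoring rule (not taken from the paper): score_t = h_t . w,
    followed by a softmax over the time axis.
    """
    scores = hidden_states @ w                 # one scalar per frame, shape (T,)
    scores = scores - scores.max()             # numerical stability for exp
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ hidden_states, weights    # pooled vector (D,), weights (T,)


rng = np.random.default_rng(0)
T, D = 50, 8                                   # hypothetical: 50 frames, 8 hidden units
H = rng.standard_normal((T, D))                # stand-in for per-frame LSTM outputs
w = rng.standard_normal(D)                     # stand-in for a learned attention vector

utterance_mean = mean_pooling(H)
utterance_att, alpha = attention_pooling(H, w)
```

In this sketch, `alpha` holds one non-negative weight per frame and sums to 1, so frames with larger scores contribute more to the utterance-level vector. Setting `w` to all zeros gives uniform weights, in which case attention pooling reduces to mean pooling.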