Deep learning for spoken language identification: Can we visualize speech signal patterns?
Neural Computing and Applications
Speech recognition-based applications are widely used in Western countries, but adoption in East Asia lags well behind; language complexity is potentially one of the primary reasons for this gap. Moreover, multilingual countries such as India need language identification (of words and phrases) to be possible through speech signals. Unlike previous works, in this paper we propose to use speech signal patterns for spoken language identification, where image-based features are employed. The concept is primarily inspired by the fact that a speech signal can be read/visualized. In our experiments, we use spectrograms (as image data) and deep learning for spoken language classification. Using the IIIT-H Indic speech database for Indic languages, we achieve a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results. Furthermore, for a relative decrease of 4018.60% in the signal-to-noise ratio, a decrease of only 0.50% in accuracy indicates that our approach is fairly robust.
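The core idea above, converting a speech signal into a spectrogram image that a deep classifier can consume, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses a synthetic tone in place of an utterance from the IIIT-H Indic speech database, and `scipy.signal.spectrogram` with assumed window parameters.

```python
import numpy as np
from scipy.signal import spectrogram

# Synthetic 1-second signal at 16 kHz (a stand-in for a real utterance;
# the paper works with recordings from the IIIT-H Indic speech database).
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

# Short-time Fourier analysis yields a (frequency bins x time frames)
# power matrix; window length and overlap here are illustrative choices.
freqs, times, Sxx = spectrogram(signal, fs=sr, nperseg=512, noverlap=256)

# Log-scale and normalize to [0, 1] so the spectrogram can be treated as
# a grayscale image and fed to an image-based deep learning classifier.
log_S = np.log1p(Sxx)
img = (log_S - log_S.min()) / (log_S.max() - log_S.min())

print(img.shape)
```

Each utterance thus becomes a fixed-format 2-D array, which is what allows image-oriented deep networks to "visualize" speech signal patterns for language classification.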
Mukherjee, Himadri; Ghosh, Subhankar; Sen, Shibaprasad; Sk Md, Obaidullah; Santosh, K. C.; Phadikar, Santanu; and Roy, Kaushik, "Deep learning for spoken language identification: Can we visualize speech signal patterns?" (2019). Journal Articles. 591.