A Speech Emotion Recognition Model Based on Multi-Level Local Binary and Local Ternary Patterns
Date
2020
Publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
Access Rights
info:eu-repo/semantics/openAccess
Abstract
Interpreting a speech signal is challenging because it consists of different frequencies and features that vary with emotion. Although various algorithms have been developed in the speech emotion recognition (SER) domain, success rates vary according to the spoken language, the emotions, and the database. In this study, a new, lightweight, and effective SER method with low computational complexity has been developed. This method, called 1BTPDN, is applied to the RAVDESS, EMO-DB, SAVEE, and EMOVO databases. First, low-pass filter coefficients are obtained by applying a one-dimensional discrete wavelet transform to the raw audio data. Features are then extracted by applying two textural analysis methods, a one-dimensional local binary pattern and a one-dimensional local ternary pattern, to each filter output. Using neighborhood component analysis, the 1024 most dominant of the 7680 features are selected and the rest are discarded. These 1024 features serve as the input to the classifier, a support vector machine with a third-degree polynomial kernel. The success rates of 1BTPDN reached 95.16%, 89.16%, 76.67%, and 74.31% on the RAVDESS, EMO-DB, SAVEE, and EMOVO databases, respectively. These recognition rates are higher than those of many state-of-the-art textural, acoustic, and deep learning SER methods.
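To illustrate the textural feature-extraction step named in the abstract, the sketch below shows standard one-dimensional local binary pattern (1D-LBP) and local ternary pattern (1D-LTP) histograms computed with NumPy. This is a minimal re-implementation from the textbook definitions, not the authors' code: the function names, the neighborhood radius of 4 (giving 8-bit codes), and the LTP threshold `t` are illustrative assumptions, and the paper's full pipeline additionally applies these operators to wavelet low-pass coefficients at multiple decomposition levels to build its 7680-dimensional feature vector.

```python
import numpy as np

def lbp_1d(signal, radius=4):
    """1D local binary pattern (illustrative sketch, not the paper's code).
    Each sample is compared with its `radius` left and `radius` right
    neighbours; the 8 comparison bits form a code in [0, 255], and the
    256-bin histogram of codes is the feature vector."""
    s = np.asarray(signal, dtype=float)
    w = 2 ** np.arange(2 * radius)          # bit weights for the code
    codes = []
    for i in range(radius, len(s) - radius):
        nb = np.concatenate([s[i - radius:i], s[i + 1:i + 1 + radius]])
        bits = (nb >= s[i]).astype(int)     # neighbour >= centre -> 1
        codes.append(int(bits.dot(w)))
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist

def ltp_1d(signal, radius=4, t=0.05):
    """1D local ternary pattern (illustrative; threshold t is an assumption).
    Neighbours within +/- t of the centre map to 0, above to +1, below
    to -1; the ternary code is split into 'upper' and 'lower' binary
    codes, giving two concatenated 256-bin histograms (512 features)."""
    s = np.asarray(signal, dtype=float)
    w = 2 ** np.arange(2 * radius)
    upper, lower = [], []
    for i in range(radius, len(s) - radius):
        nb = np.concatenate([s[i - radius:i], s[i + 1:i + 1 + radius]])
        upper.append(int((nb >= s[i] + t).astype(int).dot(w)))
        lower.append(int((nb <= s[i] - t).astype(int).dot(w)))
    h_up, _ = np.histogram(upper, bins=256, range=(0, 256))
    h_lo, _ = np.histogram(lower, bins=256, range=(0, 256))
    return np.concatenate([h_up, h_lo])
```

In the pipeline the abstract describes, histograms like these would be computed per wavelet low-pass band and concatenated, reduced to 1024 features via neighborhood component analysis, and classified with a third-degree polynomial-kernel SVM; those later stages are omitted here.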
Keywords
Feature Extraction, Time-Frequency Analysis, Classification Algorithms, Databases, Transforms, Support Vector Machines, Discrete Wavelet Transform, Local Binary Pattern, Local Ternary Pattern, Neighborhood Component Analysis, Speech Emotion Recognition
Journal or Series
IEEE Access
WoS Q Value
Q2
Scopus Q Value
Q1
Volume
8