Abstract
This work investigates the significance of Hilbert-domain characterization of wavelet packets for classifying the emotions expressed in speech signals. The goal of this paper is to create a new emotional speech database and to introduce a new feature extraction approach that can recognize various emotions. The proposed features, wavelet cepstral coefficients (WCC), are based on Hilbert spectrum analysis of the wavelet packets of the speech signal. Speaker-independent machine learning models are developed using multiclass support vector machine (SVM) and k-nearest neighbour (KNN) classifiers. The approach is tested on a newly developed Telugu (Indian) database and on the EMOVO (Italian emotional speech) database. The proposed wavelet features achieve a peak accuracy of 73.5%. NCA feature selection improves this by a further 3–5%, giving an unweighted average recall (UAR) of 78% for database 1 and 87.50% for database 2 when the optimal wavelet features are used with SVM classification. The proposed features outperform the baseline Mel-frequency cepstral coefficient (MFCC) features. They also perform better than existing methods evaluated on databases in other languages.
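Because the results above are reported as unweighted average recall, it may help to recall how that metric differs from plain accuracy. UAR is the mean of the per-class recalls, so each emotion class contributes equally regardless of how many utterances it contains. The following is a minimal sketch of that definition only. The function name and the toy labels are hypothetical illustrations and are not taken from the paper's data or code.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR).

    Each class present in y_true counts equally, however many samples it has.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels for illustration (not data from the paper).
y_true = ["anger", "anger", "anger", "anger", "joy", "joy"]
y_pred = ["anger", "anger", "anger", "joy", "joy", "anger"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the per-class recalls are 3/4 and 1/2, so the UAR (0.625) is lower than the accuracy (about 0.667), because the smaller class is recognized less reliably.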