
Emotion Recognition by Speech Signals

A murb'ed feed, posted almost 14 years ago under communication, emotion, speech, imported, citeulike, automatic & recognition.

For emotion recognition, we selected pitch, log energy, formants, mel-band energies, and mel frequency cepstral coefficients (MFCCs) as the base features, and added the velocity and acceleration of pitch and MFCCs to form feature streams. We extracted statistics for use with discriminative classifiers, assuming that each stream is a one-dimensional signal. The extracted features were analyzed using quadratic discriminant analysis (QDA) and a support vector machine (SVM). Experimental results showed that pitch and energy were the most important factors. Using two different kinds of databases, we compared the emotion recognition performance of various classifiers: SVM, linear discriminant analysis (LDA), QDA and hidden Markov model (HMM). With the text-independent SUSAS database, we achieved the best accuracy of 96.3% for stressed/neutral style classification and 70.1% for 4-class speaking style classification using Gaussian SVM, which is superior to the previous results. With the speaker-independe…
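To make the idea of "statistics over one-dimensional feature streams" concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not say which statistics were used or how velocity and acceleration were computed, so the first-difference `delta` function, the five statistics chosen, and the example pitch values are all assumptions for illustration.

```python
from statistics import mean, pstdev

STAT_NAMES = ("mean", "std", "min", "max", "range")


def delta(stream):
    """First-order difference, used here as a simple stand-in for 'velocity'.

    Assumption: the abstract does not specify the delta formula.
    """
    return [b - a for a, b in zip(stream, stream[1:])]


def stream_stats(stream):
    """Summary statistics of a one-dimensional feature stream (assumed set)."""
    return {
        "mean": mean(stream),
        "std": pstdev(stream),
        "min": min(stream),
        "max": max(stream),
        "range": max(stream) - min(stream),
    }


def utterance_features(base_stream):
    """Concatenate the statistics of a stream, its velocity and its acceleration."""
    velocity = delta(base_stream)
    acceleration = delta(velocity)
    features = []
    for stream in (base_stream, velocity, acceleration):
        stats = stream_stats(stream)
        features.extend(stats[name] for name in STAT_NAMES)
    return features


# Hypothetical pitch contour in Hz for one utterance (made-up values).
pitch_hz = [120.0, 125.0, 133.0, 140.0, 138.0, 130.0]
vector = utterance_features(pitch_hz)
```

Each utterance ends up as one fixed-length vector (here 15 values for a single base stream), which is the kind of input a discriminative classifier such as an SVM, LDA or QDA expects.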

Read more at the original source.
