
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations

Two metrics are proposed to evaluate automatic emotion recognition (AER) performance under automatic segmentation, based on time-weighted emotion and speaker classification errors.

Pricing Type

  • Pricing Type: Free
GitHub Link

The GitHub repository is available at https://github.com/w-wu/steer.

Introduction

The repository “W-Wu/sTEER” contains the code accompanying the paper “Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations”. The paper introduces a system that combines emotion recognition, speech recognition, and speaker diarisation in a jointly trained model. Two evaluation metrics are proposed: the Time-weighted Emotion Error Rate (TEER) and the speaker-attributed Time-weighted Emotion Error Rate (sTEER). The repository provides instructions and tools for data preparation, training, testing, and evaluation using Python, PyTorch, and SpeechBrain; the paper details these processes and includes references for citation. Note that results may differ slightly due to the behaviour of PyTorch’s CTC loss function.
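To illustrate the idea behind a time-weighted error rate, here is a minimal sketch of scoring an emotion hypothesis against a reference by the fraction of reference time that is labelled incorrectly. The segment representation, scoring resolution, and function name are illustrative assumptions, not the repository’s actual TEER/sTEER implementation.

```python
# Hedged sketch of a time-weighted emotion error.
# Segments are (start, end, emotion) tuples with times in seconds.
# This is an illustrative assumption, not the W-Wu/sTEER code.

def time_weighted_emotion_error(ref_segments, hyp_segments):
    """Fraction of reference time whose hypothesised emotion label is wrong."""

    def label_at(segments, t):
        # Return the emotion label covering time t, or None if unsegmented.
        for start, end, emotion in segments:
            if start <= t < end:
                return emotion
        return None

    step = 0.01  # 10 ms scoring resolution (an arbitrary choice here)
    total, errors = 0.0, 0.0
    for start, end, emotion in ref_segments:
        t = start
        while t < end:
            total += step
            if label_at(hyp_segments, t) != emotion:
                errors += step
            t += step
    return errors / total if total else 0.0


ref = [(0.0, 2.0, "happy"), (2.0, 4.0, "neutral")]
hyp = [(0.0, 1.0, "happy"), (1.0, 4.0, "neutral")]
print(time_weighted_emotion_error(ref, hyp))  # ~0.25: 1 s wrong out of 4 s
```

A speaker-attributed variant would additionally require the hypothesised speaker at each instant to match the reference speaker before the emotion label is counted as correct.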

Content

Two metrics are proposed to evaluate emotion classification performance with automatic segmentation: the Time-weighted Emotion Error Rate (TEER) and the speaker-attributed Time-weighted Emotion Error Rate (sTEER).
