WIP Research

Emotional-Aware-Reasoning

Audio-augmented emotion-aware language models with cross-speaker generalization

// DESCRIPTION

Investigating how emotional audio cues can improve language model reasoning and empathy.

Research Focus

  • Multi-modal emotion recognition (text + audio); a fusion sketch follows this list
  • Emotion-conditioned response generation
  • Cross-speaker generalization
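
A minimal sketch of what the text + audio recognition setup could look like, assuming pre-extracted audio features and pooled text embeddings; the dimensions, names, and late-concatenation fusion below are illustrative assumptions, not the project's actual architecture.

```python
# Minimal late-fusion emotion classifier: project text and audio
# representations separately, concatenate, then classify.
import torch
import torch.nn as nn

TEXT_DIM, AUDIO_DIM, N_EMOTIONS = 768, 128, 7  # illustrative sizes

class LateFusionClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.text_proj = nn.Linear(TEXT_DIM, 256)
        self.audio_proj = nn.Sequential(
            nn.Linear(AUDIO_DIM, 256), nn.ReLU(), nn.Linear(256, 256)
        )
        self.head = nn.Linear(512, N_EMOTIONS)  # concat -> logits

    def forward(self, text_emb, audio_feats):
        fused = torch.cat([self.text_proj(text_emb),
                           self.audio_proj(audio_feats)], dim=-1)
        return self.head(fused)

# Smoke test with random tensors standing in for real encoder outputs.
model = LateFusionClassifier()
logits = model(torch.randn(4, TEXT_DIM), torch.randn(4, AUDIO_DIM))
print(logits.shape)  # torch.Size([4, 7])
```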

Key Finding

Models trained on real emotional speech show 48% better cross-speaker generalization than models trained on synthetic (TTS) speech.
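
Cross-speaker generalization is typically measured by holding out entire speakers from training so the model cannot exploit speaker identity. A pure-Python sketch of such a split; the 80/20 speaker fraction and the helper itself are illustrative assumptions.

```python
# Hold out whole speakers so evaluation measures generalization to
# unseen voices rather than speaker memorization.
import random

def speaker_holdout_split(items, holdout_frac=0.2, seed=0):
    """items: list of (speaker_id, sample) pairs."""
    speakers = sorted({spk for spk, _ in items})
    rng = random.Random(seed)
    rng.shuffle(speakers)
    n_test = max(1, int(len(speakers) * holdout_frac))
    test_speakers = set(speakers[:n_test])
    train = [s for spk, s in items if spk not in test_speakers]
    test = [s for spk, s in items if spk in test_speakers]
    return train, test

# 24 speakers, matching RAVDESS; clip names are placeholders.
data = [(spk, f"clip_{spk}_{i}.wav") for spk in range(1, 25) for i in range(3)]
train, test = speaker_holdout_split(data)
print(len(train), len(test))  # no speaker appears in both splits
```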

Datasets

  • RAVDESS: 24 speakers, 7 emotions
  • Synthetic: TTS-generated emotional speech
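
For the RAVDESS side, labels can be recovered from the standard dash-separated filename convention, where the third field encodes the emotion and the last the actor. A minimal parsing sketch; note the full RAVDESS label set has eight codes, so which seven-emotion subset the project uses is left open here.

```python
# Parse emotion and speaker from a RAVDESS filename such as
# "03-01-06-01-02-01-12.wav" (third field = emotion, last = actor).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def parse_ravdess(filename):
    fields = filename.removesuffix(".wav").split("-")
    return {"emotion": RAVDESS_EMOTIONS[fields[2]], "speaker": int(fields[6])}

print(parse_ravdess("03-01-06-01-02-01-12.wav"))
# {'emotion': 'fearful', 'speaker': 12}
```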

Experiments

  • Audio feature extraction with librosa (see the extraction sketch below)
  • Multi-modal fusion architectures
  • Emotion-guided decoding (see the decoding sketch below)
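
A minimal librosa sketch of utterance-level feature extraction (MFCC statistics plus pitch and energy); the exact feature recipe is an assumption, not necessarily the project's pipeline.

```python
# Utterance-level audio features: MFCC means/stds plus pitch (YIN) and
# RMS energy statistics, a common recipe for emotional speech.
import numpy as np
import librosa

def extract_features(y, sr, n_mfcc=13):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    f0 = librosa.yin(y, fmin=65, fmax=400, sr=sr)  # pitch track
    rms = librosa.feature.rms(y=y)                 # frame-level energy
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [np.nanmean(f0), np.nanstd(f0), rms.mean(), rms.std()],
    ])

# Demo on a synthetic tone; in practice, y, sr = librosa.load("clip.wav").
y = librosa.tone(220, sr=16000, duration=1.0)
print(extract_features(y, sr=16000).shape)  # (30,)
```

And one plausible shape for emotion-guided decoding: a transformers LogitsProcessor that nudges generation toward an emotion lexicon. The lexicon token IDs and bias value here are illustrative assumptions.

```python
# Bias decoding toward emotion-associated tokens without forcing them.
import torch
from transformers import LogitsProcessor

class EmotionBiasProcessor(LogitsProcessor):
    def __init__(self, emotion_token_ids, bias=2.0):
        self.emotion_token_ids = torch.tensor(emotion_token_ids)
        self.bias = bias

    def __call__(self, input_ids, scores):
        scores[:, self.emotion_token_ids] += self.bias  # nudge, don't force
        return scores

# Dummy demonstration; with a real model this would be passed via
# model.generate(..., logits_processor=LogitsProcessorList([proc])).
proc = EmotionBiasProcessor(emotion_token_ids=[5, 17], bias=2.0)
scores = torch.zeros(1, 32)
print(proc(torch.tensor([[0]]), scores)[0, [5, 17]])  # tensor([2., 2.])
```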

// HIGHLIGHTS

  • 48% cross-speaker generalization gain from real emotional speech
  • Multi-modal (text + audio) fusion research
  • Multiple experiments completed

TECH_STACK

Python PyTorch librosa Transformers Whisper

PROJECT_INFO

started: 2024-09-01
status: WIP
type: Research