A Path Signature Approach for Speech Emotion Recognition
Citation
Bo Wang, Maria Liakata, Hao Ni, Terry Lyons, Alejo J Nevado-Holgado, Kate Saunders. A Path Signature Approach for Speech Emotion Recognition. INTERSPEECH 2019, September 15–19, 2019.
Abstract
Automatic speech emotion recognition (SER) remains a
difficult task within human-computer interaction, despite increasing
interest in the research community. One key challenge
is how to effectively integrate short-term characterisation
of speech segments with long-term information such as temporal
variations. Motivated by the numerical approximation theory
of stochastic differential equations (SDEs), we propose the
novel use of path signatures for the integration of short speech frames;
signatures provide a pathwise definition of solutions to SDEs.
Furthermore, we propose a hierarchical tree structure of path signatures
to capture both global and local information. A simple
tree-based convolutional neural network (TBCNN) is used
for learning the structural information stemming from dyadic
path-tree signatures. Our experimental results on a widely
used benchmark dataset demonstrate performance comparable
to that of complex neural-network-based systems.
Index Terms: speech emotion recognition, path signature feature,
convolutional neural network
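The two ideas named in the abstract, path signatures and a dyadic hierarchy of them, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a sequence of frame-level feature vectors treated as a piecewise-linear path, truncates the signature at depth 2 (an arbitrary choice for brevity), and uses hypothetical function names (signature_level2, dyadic_signatures). The tree-based convolutional network that would consume these features is not shown.

```python
import numpy as np

def signature_level2(path):
    """Depth-2 truncated signature of a piecewise-linear path.

    path: array of shape (n_points, d), e.g. one feature vector per frame.
    Returns (level1, level2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)              # one increment per linear piece
    level1 = increments.sum(axis=0)                 # total displacement
    # Displacement accumulated strictly before each piece.
    before = np.cumsum(increments, axis=0) - increments
    # Chen's identity for concatenated linear pieces: cross terms between
    # earlier and later pieces, plus half the self outer product of each piece.
    level2 = before.T @ increments + 0.5 * (increments.T @ increments)
    return level1, level2

def dyadic_signatures(path, depth):
    """Signatures over dyadic sub-intervals (an assumed tree layout).

    Level k of the returned list holds 2**k signatures, one per sub-path;
    adjacent sub-paths share their boundary point.
    """
    path = np.asarray(path, dtype=float)
    n = len(path)
    tree = []
    for k in range(depth + 1):
        bounds = np.linspace(0, n - 1, 2**k + 1).round().astype(int)
        tree.append([signature_level2(path[a:b + 1])
                     for a, b in zip(bounds[:-1], bounds[1:])])
    return tree

# Usage: a toy 2-D path with five points, split down to quarters.
toy_path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [2.0, 3.0]])
tree = dyadic_signatures(toy_path, depth=2)
root_level1, root_level2 = tree[0][0]
```

The root of the tree summarises the whole utterance, while deeper levels summarise progressively shorter segments, which is one way to read the abstract's "global and local information".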