A generative framework for conversational laughter: Its 'language model' and laughter sound synthesis

Author:
Hiroki Mori, Shunya Kimura
Keyword:
Electrical Engineering and Systems Science, Audio and Speech Processing (eess.AS)
journal:
Proc. Interspeech 2023 (2023) 3372-3376
date:
2023-06-05 16:00:00
Abstract
As the phonetic and acoustic manifestations of laughter in conversation are highly diverse, laughter synthesis should be capable of accommodating such diversity while maintaining high controllability. This paper proposes a generative model of laughter in conversation that can produce a wide variety of laughter by utilizing the emotion dimension as a conversational context. The model comprises two parts: the laughter "phones generator," which generates various, but realistic, combinations of laughter components for a given speaker ID and emotional state, and the laughter "sound synthesizer," which receives the laughter phone sequence and produces acoustic features that reflect the speaker's individuality and emotional state. The results of a listening experiment indicated that conditioning both the phones generator and the sound synthesizer on emotion dimensions resulted in the most effective control of the perceived emotion in synthesized laughter.