Gender-ambiguous voice generation through feminine speaking style transfer in male voices

Maria Koutsogiannaki, Shafel Mc Dowall, Ioannis Agiomyrgiannakis
Electrical Engineering and Systems Science, Audio and Speech Processing (eess.AS), Sound (cs.SD)
2024-03-12
Recently, under the umbrella of Responsible AI, efforts have been made to develop gender-ambiguous synthetic speech that represents, with a single voice, all individuals across the gender spectrum. However, these efforts have overlooked speaking style, despite differences found among binary and non-binary populations. In this work, we synthesise gender-ambiguous speech by combining the timbre of a male speaker with the manner of speech of a female speaker, using voice morphing and pitch shifting towards the male-female boundary. Subjective evaluations indicate that morphed samples conveying the female speech style are perceived as more ambiguous than samples that undergo pure pitch transformations, suggesting that speaking style can be a contributing factor in creating gender-ambiguous speech. To our knowledge, this is the first study that explicitly uses speaking-style transfer to create gender-ambiguous voices.
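
As a rough illustration of the pitch-shifting step only, the sketch below computes the semitone offset needed to move a speaker's mean fundamental frequency (F0) to a target value and applies it to an F0 contour. It does not implement the voice morphing or style transfer described above. The function names and the numeric values (a 110 Hz male mean F0 and a 155 Hz boundary) are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def semitone_shift(source_f0_hz: float, target_f0_hz: float) -> float:
    """Return the shift, in semitones, that maps source_f0_hz onto target_f0_hz."""
    if source_f0_hz <= 0 or target_f0_hz <= 0:
        raise ValueError("F0 values must be positive")
    # One octave (a doubling of frequency) corresponds to 12 semitones.
    return 12.0 * math.log2(target_f0_hz / source_f0_hz)


def shift_f0_contour(f0_hz: list[float], semitones: float) -> list[float]:
    """Scale every voiced frame by the given shift; unvoiced frames (0.0) stay 0.0."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


# Hypothetical values, NOT from the paper: an assumed male mean F0 and an
# assumed male-female boundary F0.
MALE_MEAN_F0_HZ = 110.0
BOUNDARY_F0_HZ = 155.0

shift = semitone_shift(MALE_MEAN_F0_HZ, BOUNDARY_F0_HZ)

# Toy F0 contour in Hz; 0.0 marks unvoiced frames.
contour = [0.0, 100.0, 110.0, 120.0, 0.0]
shifted = shift_f0_contour(contour, shift)

print(f"shift: {shift:.2f} semitones")
print("shifted contour:", [round(f, 1) for f in shifted])
```

A uniform shift like this moves the mean F0 but preserves the relative shape of the contour, which is why pitch shifting alone leaves the speaking style of the original speaker unchanged.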
PDF: Gender-ambiguous voice generation through feminine speaking style transfer in male voices.pdf
