APSIPA ASC 2015 tutorial

Spoofing and Anti-Spoofing: A Shared View of Speaker Verification, Speech Synthesis and Voice Conversion

Introduction

Automatic speaker verification (ASV) offers a low-cost and flexible biometric solution to person authentication. While the reliability of ASV systems is now considered sufficient to support mass-market adoption, there are concerns that the technology is vulnerable to spoofing, also referred to as presentation attacks. Spoofing refers to an attack whereby a fraudster attempts to manipulate a biometric system by masquerading as another, enrolled person. Speaker adaptation in speech synthesis and voice conversion both aim to mimic a target speaker's voice automatically, and hence present a genuine threat to ASV systems.

The research community has responded to speech synthesis and voice conversion spoofing attacks with dedicated countermeasures which aim to detect and deflect such attacks. Although the literature shows that they can be effective, the problem is far from solved; ASV systems remain vulnerable to spoofing, and a deeper understanding of speaker verification, speech synthesis and voice conversion will be fundamental to the pursuit of spoofing-robust speaker verification.

While the level of interest is growing, the effort devoted to developing spoofing countermeasures for ASV lags behind that for other biometric modalities, even though the vulnerabilities of ASV to spoofing are now well acknowledged. A tutorial on spoofing and anti-spoofing from the combined perspective of speaker verification, speech synthesis and voice conversion is therefore much needed. The tutorial will cover not only spoofing and countermeasures, but also the fundamentals of speaker verification, speech synthesis and voice conversion.

The speakers have led the research community in anti-spoofing for ASV since 2013 and have jointly authored a growing number of conference papers and book chapters, as well as the latest survey paper, published in Speech Communication in 2015. Between them they have organised two special sessions and one evaluation/challenge (http://www.spoofingchallenge.org/) on the same topic.

Slides

Please download the latest slides via this link: [PDF]

Recommended papers

Zhizheng Wu, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Cemal Hanilci, Md Sahidullah, Aleksandr Sizov, "ASVspoof 2015: the First Automatic Speaker Verification Spoofing and Countermeasures Challenge", Interspeech 2015 [PDF]

Zhizheng Wu, Nicholas Evans, Tomi Kinnunen, Junichi Yamagishi, Federico Alegre, Haizhou Li, "Spoofing and countermeasures for speaker verification: a survey", Speech Communication, Volume 66, Pages 130–153, 2015 [PDF]

Tomi Kinnunen, Haizhou Li, "An Overview of Text-Independent Speaker Recognition: from Features to Supervectors", Speech Communication, Volume 52, Pages 12–40, 2010

Keiichi Tokuda, Yoshihiko Nankaku, Tomoki Toda, Heiga Zen, Junichi Yamagishi, Keiichiro Oura, "Speech synthesis based on hidden Markov models", Proceedings of the IEEE, Volume 101, Issue 5, Pages 1234–1252, 2013

Speakers

Zhizheng Wu, University of Edinburgh, UK
Tomi Kinnunen, University of Eastern Finland, Finland
Nicholas Evans, EURECOM, France
Junichi Yamagishi, University of Edinburgh, UK and National Institute of Informatics, Japan