Giving voice to people living with MND: voice banking and the generation of personalised synthetic voices

Today, on Global ALS/MND Awareness Day, we are proud to highlight one of our ongoing research projects: Voice banking and the generation of personalised synthetic voices.

This blog post was prepared for the MND Association’s “A Blog a Day” and is also available at http://wp.me/pNdD8-L3

 

Ring ring….ring ring….

“Hello?”

“Hi there, it’s me.”

“Oh hello dear, how nice to hear from you!”

 

Sound familiar? How many of your friends or family could you recognise from a few words of their voice? Two, five, ten or more?

It may have never previously occurred to you, but our voices are as unique as our face shape, our walk and even our eyes. A person’s voice is an essential component of his or her identity.

This is part of the reason why it can be so distressing for some people living with MND—or indeed any neurodegenerative condition—if they start to lose their voice. For some, the early symptoms of altered speech or dysarthria (dis-arth-ree-ah) can lead rapidly to an inability to move the muscles in the mouth and tongue that are required for speech.

It is only when speech becomes difficult that it becomes clear just how valuable it is as a communication tool, and how much we take it for granted. Even asking for a cup of tea can become a time-consuming effort, let alone holding a conversation.

Alternative communication

It is in these situations that people often rely on “Augmentative and Alternative Communication” (AAC) to get their message across. AAC can range from simple strategies such as gestures or writing things with pen and paper, through to voice output communication aids (VOCAs) that generate synthetic speech from text entered on a keyboard or via an eye-tracking system.

However, many people complain that the synthetic voices pre-installed in these devices don’t represent the user’s identity. The British physicist Stephen Hawking, probably the most famous AAC user and person with MND alive today, is now so closely associated with his American voice that it has become part of his identity, and he has declined offers to update it. But this is not the case for everyone. While the quality of synthetic voices available on many AAC devices has improved greatly in the last few years, users are often limited to a choice of only a few voices, of which only a couple might be British, let alone representative of their own accent.

Voicebanking: speaking with your own voice

But what if it were possible to speak in your own voice through an AAC device? This idea sparked a research project at the Euan MacDonald Centre for MND Research in Scotland. Clinical researchers teamed up with speech and language therapists and speech scientists at the University of Edinburgh’s Centre for Speech Technology Research to try to deliver personalised synthetic voices for people living with MND to use in AAC devices. The voicebanking project was born, and its development has been part-funded by the MND Association.

Ideally, we record a person’s voice soon after diagnosis, before speech has become affected. People are asked to read aloud around 400 sentences (which takes about an hour) whilst being recorded in our purpose-built sound-proof room in the Anne Rowling Regenerative Neurology Clinic in Edinburgh. The sentences have been chosen to capture all the speech sounds of English in all the different possible combinations. While 400 sentences is an ideal number, we can create a synthetic voice from as few as 100 sentences if people aren’t able to manage the 400 mark. This voice recording is then “banked” and stored ready to create a synthetic voice for a communication aid if, and when, that person needs one. Using software developed by speech scientists, all the parameters of that unique voice can be automatically analysed and synthetically reproduced in a process called “voice cloning”.

Voice-mixing

This is where “donor” voices come into play. During the voice cloning process the synthetically reproduced parameters of a patient’s voice are combined with those of healthy donor voices. Features of donor voices with the same age, sex and regional accent as the patient are pooled together to form an “average voice model” (AVM), which acts as a base on which to generate the synthetic voice. It’s a bit like taking a 5-litre tin of light gloss base paint to the paint-mixing counter in a DIY shop and mixing in your personal colour of choice (Sumptuous Plum for example…). It is the use of these donor voices that means we can use just a short recording from the patient, as the bulk of the speech data has been collected in the donor AVM or “base paint”.

What about people who are already beginning to lose their voice?

The really clever bit happens if a person comes in to record his or her voice once there is already mild to moderate dysarthria. It is possible for us to “repair” the voice in the synthesis process using more of the donor AVM to patch the damaged elements of the voice (adding more of the “base paint” to the personal colour). To date, we have recorded the voices of around 600 healthy individuals – old and young, male and female, and with a glorious range of regional accents. We are always trying to expand our bank of voices, as the bigger the pool of donors, the closer we can get to replicating the original voice of the person living with MND.

Developing a personalised synthetic voice app

Recently, we asked a small number of people living with MND to pilot a communication aid app we are developing for the iPad, into which we can install a personalised synthetic voice. Feedback about the intelligibility and similarity of the synthesised voice to the patient’s own was good, although there is still room for improvement. When we have refined and tested the app further, our long-term goal is to make it widely available for use as a communication aid, helping people living with MND retain their self-identity and dignity by keeping their own voice.

 

Note: At present we are unable to take requests for personalised synthetic voices for people living with MND. We are working hard, but we are still in the research phase and not able to provide a service yet. The Euan MacDonald Centre and MND Association will provide you with updates from the project at a later date. Thank you for your understanding.

However, if you live in or near Edinburgh, do not have any voice disorders and would like to record your voice as a healthy donor, please visit www.annerowlingclinic.com/voicebank-research