Each year, about 5,000 people are diagnosed with amyotrophic lateral sclerosis (ALS) in the United States. ALS, also known as Lou Gehrig’s disease, is a neurodegenerative disease that deteriorates individuals’ ability to walk, speak and, ultimately, breathe.
In combination with rising numbers of other diagnoses that impact the voice, including throat cancer, cerebral palsy and Parkinson’s disease, individuals worldwide are losing pieces of their vocal autonomy every day.
Researchers around the world are working to help these individuals regain function and slow the disease’s progression. ALS is characterized by rapid progression, and options for helping those affected are limited. However, companies have been able to find a way to help give individuals their voice back.
Technique of Vocal Banking
Voice banking is a technique that uses recordings of a person’s voice to capture its nuances and complexities and create a personalized synthetic voice with text-to-speech software. A similar process, called message banking, allows individuals to record meaningful messages and expressions, preserving them in the person’s original voice.
Prior to AI, voice banking was a useful but impersonal tool.
When it began in the ’90s, voice banking was slow, requiring several hours and upwards of 3,000 recorded sentences. And that wasn’t counting the re-recordings that helped develop a synthetic voice over a couple of months. As recently as 2018, ALS News Today reported that the process took an average of three months to complete.
The parameters for voice banking systems have thankfully changed over time, as some systems now require as few as 50 recorded sentences. Projects such as I Will Always Be Me, a partnership between Rolls-Royce and the U.K.-based Motor Neurone Disease Association, take a new approach to voice banking: Individuals can record their voice in just 20-25 minutes.
This project is available internationally and allows individuals to record themselves reading the short illustrated book I Will Always Be Me. This story speaks to people losing their independence and helps them convey more emotions, which makes the recordings more personalized and accurate to their voice.
The mechanics of text-to-speech systems, however, have not always been the best. While these systems were able to identify language nuances and unique vocal qualities, they used unit selection to cut and paste single units, or phonemes, to create words and sentences. Cutting and pasting phonemes in such a way results in choppy, jarring sentences wrapped up in a robotic, generic-sounding voice.
AI software smooths the text-to-speech choppiness, providing a custom synthesized voice based on the voice and message bank samples.
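For readers curious about what "cut and paste" means in practice, the toy Python sketch below illustrates the idea behind unit selection. It is only an illustration under stated assumptions: the bank, the numbers standing in for audio samples, and the function names are all hypothetical, and real systems weigh many more factors than the single join cost used here.

```python
# Toy sketch of unit selection. The data is hypothetical, not a real voice bank.
# Each phoneme maps to several recorded candidates; each candidate is a short
# list of numbers standing in for audio samples.
BANK = {
    "h": [[0.0, 0.2, 0.4], [0.0, 0.1, 0.9]],
    "ay": [[0.5, 0.6, 0.3], [-0.8, -0.2, 0.1]],
}


def join_cost(prev_unit, candidate):
    """Size of the jump where two recordings are pasted together."""
    return abs(candidate[0] - prev_unit[-1])


def select_units(phonemes, bank):
    """Greedily pick, for each phoneme, the candidate with the smallest join cost."""
    chosen = [bank[phonemes[0]][0]]  # assumption: start from the first candidate
    for phoneme in phonemes[1:]:
        best = min(bank[phoneme], key=lambda c: join_cost(chosen[-1], c))
        chosen.append(best)
    return chosen


def synthesize(phonemes, bank):
    """Paste the selected units end to end into one sequence of samples."""
    return [sample for unit in select_units(phonemes, bank) for sample in unit]


if __name__ == "__main__":
    print(synthesize(["h", "ay"], BANK))
```

Even when the best-matching recordings are chosen, a small jump usually remains at each join, which is one way to picture the choppiness described above.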
Accessibility
Voice banking, as well as message banking, works best when it takes place soon after diagnosis to preserve the strongest version of a person’s voice.
For individuals who have already lost their voice, there are alternate options for voice banking. Unfortunately, a system that can create a voice from past recordings has yet to be developed; however, some services allow individuals to have a family member or friend record their voice and create a proxy synthetic voice for them. In some cases, the loved one’s voice can be altered to sound similar to the individual’s voice.
For years, the process of voice banking was expensive and time-consuming, making it inaccessible to many individuals who needed the service. Organizations with voice banking capabilities, such as Team Gleason Foundation, a nonprofit specifically dedicated to voice banking for people with ALS, had a low number of service requests.
However, with the help of AI software, accessibility to voice banking services has drastically increased as the process of developing a synthetic voice has become simpler. Team Gleason Foundation alone saw its service requests increase from 172 requests in 2017 to more than 1,200 in 2022, roughly a seven-fold increase.
AI has also helped reduce the price of services through its acceleration of the unit selection process. Acapela Group, which formerly charged $3,000 for its voice banking service, now charges $999 and offers the shortest time frame with 13 language options.
Recording the audio clips has also become easier over time with a variety of voice banking platforms, including Acapela and Model Talker, that bring the recording space much closer to home. These platforms help individuals record their voice from their own computer or by connecting them with universities and other organizations that offer recording services.
There are also free apps, such as the Message Banking App created by speech therapist and augmentative and alternative communication specialist Amy Roman, which guides people with ALS through the process of message banking.
These systems of voice banking are continuing to expand language capacities, growing portfolios in numerous languages and text-to-speech dialects, regional voices and accents.
Now, individuals who are losing their voice can protect an important part of themselves. The unpredictability of ALS’s progression can be a source of fear, as it can lead to rapid degeneration or a prolonged fight, but having the ability to express one’s individuality can be a gift.
Jane Dimel is an editorial assistant at CityScene Media Group. Feedback is welcome at feedback@cityscenemediagroup.com.