Sunday, October 6, 2024

As good as it gets

THE WASHINGTON POST – Ron Brady was 52 years old when he was diagnosed with ALS, which stands for amyotrophic lateral sclerosis, a neurodegenerative disease that eventually causes most people to lose their ability to speak, walk or breathe.

Now, at 55, he can’t swallow food, and it’s getting harder to brush his teeth and put on clothes. He likes to crack jokes, but his speech is slurred to the point where few understand him. But he has not lost his voice.

That’s because he preserved his voice with a company called Voice Keeper, one of several companies using artificial intelligence (AI) to “bank” people’s voices while they are still able to speak and re-create those voices for text-to-speech software.

Voice banking used to be expensive and time-consuming, but AI has made it more accessible to people with conditions that could impact their ability to speak, such as ALS, throat cancer, cerebral palsy and Parkinson’s disease.

Patients said having a computer-generated voice that sounds like their real voice has given them a greater sense of confidence and connection to the world around them. Brady’s synthetic voice isn’t a perfect match – his speech was already impaired when he recorded himself. But it has the same relaxed, deep tone, which he jokingly calls “suave”.

“My favourite thing to say is any corny dad comment that will make my wife or adult children laugh,” he said. To Brady, getting his voice back felt like getting parts of himself back: the school administrator who commanded a room with confidence, the gregarious, talkative father, and the first college graduate in his family, whose neutral American accent was very different from the Caribbean accent of his immigrant parents.

FROM LEFT: Anna Paula Pereira Hülle Mateus with daughters Isadora and Heloisa in their home in Lafayette, California, United States; and Ron Brady’s nurse helps clean parts of his feeding tube. PHOTOS: THE WASHINGTON POST

Brian Wallach with his wife, Sandra Abrevaya, and their dog, Moon

The use of AI has driven a surge in voice banking, particularly among ALS patients. In 2017, Team Gleason Foundation, a non-profit that funds voice banking for people with ALS, got 172 requests for the service. In 2022, it received more than 1,200 requests. In the United States (US), an average of 5,000 people are diagnosed with ALS each year.

Capturing human speech is incredibly complex. Previously, a person might have to record 1,000 to 6,000 sentences to capture every possible sound in a language. The process typically took eight to 30 hours. Those recorded sounds then went into a database, and the software would rearrange the sounds to form words and phrases.

The method is known as unit selection, and the results were “choppy”, said Tim Bunnell, Director of the Nemours Center for Pediatric Auditory and Speech Sciences.

“It’s intelligible, but it’s very jarring,” Bunnell said. “Our unit selection voices don’t sound as good as a human voice.”

His research laboratory has transitioned from older methods of speech synthesis to newer methods, such as those using AI.

To create a digital voice, AI software analyses a person’s speech sample and then quickly scours a large database to find people speaking in similar ways. It finds patterns in how voices sound and creates a digital voice to match an individual speaker. Most companies now only need a few hundred sentences to get enough data. But some, like Acapela Group, which partners with Team Gleason Foundation, have algorithms that can build a voice from just 50 sentences.

The use of AI has also made voice banking more affordable. Acapela Group charged USD3,000 when the company relied on unit selection, but with AI, the cost is now USD999.

Other companies offer the service for as low as USD300. Voice banking is not covered by insurance, but most companies don’t charge people unless they start using their synthesised voices.

John M Costello, who has worked with thousands of patients as the Director of the Augmentative Communication Programme at Boston Children’s Hospital, recommends patients work with a speech language pathologist to figure out which product best matches their capabilities and needs.

He has noticed patients with realistic voices have more meaningful connections with loved ones. “Personal voice is so important to our relationships,” he said. “There is a psychological response.”

Research shows that hearing one’s mother’s voice releases levels of oxytocin similar to those released by getting a hug from her. Oxytocin is a social bonding hormone linked to lower levels of stress and anxiety. Another study found that self-agency is enhanced by hearing one’s own voice.

The ease of the newer technology convinced Anna Paula Pereira Hülle Mateus, 51, of Lafayette, California, US, to try it. When she was diagnosed with ALS in July 2022, she was hesitant to spend her energy focussing on what she might lose.

She changed her mind once a doctor told her that voice banking would take about an hour and could ensure she would always have her voice. “Now I’m very glad that I did it, because I feel that my speech is getting worse and worse,” she said.

But the company she used, Acapela Group, doesn’t offer the service in Portuguese, which is her native language. To offer voice banking in different languages, companies need to develop separate algorithms for each language.

The fact that Pereira Hülle Mateus couldn’t preserve her voice in Portuguese saddens her because her mother and many of her close family and friends only speak Portuguese.

In English, she sometimes struggles with finding words to express herself. But when she speaks Portuguese, her voice rises and falls like musical notes.

Pereira Hülle Mateus’s synthetic voice in English is a little flatter but still captures her distinct Brazilian accent. However, she doesn’t plan to listen to her synthetic voice until the moment she needs to use it – which she hopes will never come.

Each company has its own method for capturing speech, using different sentences and algorithms. So if someone banks their voice with one company, then loses the ability to speak, they could be stuck with that company, said Blair Casey, Executive Director of Team Gleason Foundation, the nonprofit that helps ALS patients.

He has been pushing for companies to create a standardised set of phrases that can be used with any of their algorithms, so that customers can comparison shop. He’s also pushing companies to give customers their original recordings so they can use them with other companies in the future.

“If something better came out, wouldn’t you want to be able to try it?” he asked. “And if you don’t have access to those phrase sets, you can’t.”

Brian Wallach, 42, a prominent ALS activist and former federal prosecutor who lives in Illinois, was diagnosed with ALS when he was 37, the same day his youngest daughter came home from the hospital.

Over the years, his voice has transformed from forceful and clear to mumbled murmurs.

When he played his synthetic voice for the first time to his family, it was so accurate his wife burst into tears, he said. Meanwhile, his youngest daughter, who had never heard his voice pre-ALS, asked, “Is that you, Daddy?”

“I said back to her, ‘It is. My voice has changed a lot, but this is what I used to sound like,’” he said.

Although he likes his synthetic voice, it doesn’t pronounce his wife’s name, Sandra, correctly. The synthetic voice also can’t express the emotions he wants to convey when he talks to his two young daughters. Typing what he wants to say with his synthetic voice is a slow process because the muscles in his hands have weakened.

Because of the technology’s limitations, Wallach tends to only use his synthetic voice when he is out in public, at a larger gathering with friends, or too tired to speak. His family is still mostly able to understand him.

Once ALS patients lose their ability to use their hands, they must use their eyes to type, slowing down conversation even further. This was the case for Ruth Brunton, of Rogers, Arkansas. She was diagnosed with ALS in March 2021 and, by Christmas that year, had lost her ability to speak. She banked her voice immediately after her diagnosis, but the company she worked with used unit selection, the older technology that can sound choppy or robotic. Though she spent about a month recording 3,000 sentences, she wasn’t happy with the final result.

So, she was stuck using a generic voice with an American accent from Microsoft called “Heather”. But the voice failed to capture her soft-spoken British accent from Ormskirk, England, which her husband jokingly called “posh” compared to his thick “scouse” accent from Liverpool.

In the voice of “Heather”, Ruth, a pragmatic, strong-willed person who was once the Chief Executive Officer of a nonprofit that helped struggling families, started to retreat into a shell, said her husband David Brunton. Their flirtatious banter stopped completely, and Ruth participated less in group conversations. Even saying “I love you” seemed to mean less in a voice that wasn’t her own.

“She was talking because she had to, not because she wanted to,” he said.

After six months, they tried again – Ruth was able to get her original recordings back and gave them to a different company that used AI technology. Upon hearing the new voice, both Ruth and David got emotional – it felt like a part of Ruth had come back.

“I was taken aback how much it meant to me to have a voice that actually sounded like me,” Ruth said in an interview in December.

“It may sound silly, but having my own voice increased my self-confidence,” she added.

Suddenly, small things they took for granted before, like having quiet chats together before bed, or reading books to their five grandchildren, took on a new significance.

Shortly after the holiday, Ruth got COVID. Her already limited breathing got even more difficult, and on the morning of February 10, nine days before their 40th wedding anniversary, she passed away. David had held her hand all night.

“We’ve had two years of saying goodbye,” David said. “We agreed we were going to leave nothing unsaid.”
