Devices To Help Paralyzed People Speak Achieve Record Speeds

"Being able to have the ability to speak aloud is very important," said one of the patients involved.

Francesca Benson
Copy Editor and Staff Writer

Francesca Benson is a Copy Editor and Staff Writer with an MSci in Biochemistry from the University of Birmingham.

Image: illustration of a brain with a chip connected to a laptop displaying the word "hello"

The advances are showcased in two different studies.

Image credit: Piscine26/Shutterstock.com

Two new studies have showcased progress in brain-computer interfaces (BCIs), helping two patients who are unable to speak form sentences faster than previous technology allowed.

In one study, a patient with amyotrophic lateral sclerosis (ALS) had microelectrode arrays implanted in her brain in areas related to the production of speech. The device was trained using neural activity as the patient tried to speak, with the dataset containing 10,850 sentences in total. 


Speech was decoded at 62 words per minute, 3.4 times as fast as the previous record. The authors write in the study that the patient “achieved a 9.1 percent word error rate on a 50-word vocabulary (2.7 times fewer errors than the previous state-of-the-art speech BCI) and a 23.8 percent word error rate on a 125,000-word vocabulary (the first successful demonstration, to our knowledge, of large-vocabulary decoding).”

"This system is trained to know what words should come before other ones, and which phonemes make what words," Dr Frank Willett, first author of the study, told the BBC. "If some were wrongly interpreted, it can still take a good guess."

The patient told the BBC that the advances could help patients “perhaps continue to work, maintain friends and family relationships.”

However, the authors of the study stress that “it does not yet constitute a complete, clinically viable system”. They note that work needs to be done to reduce the amount of time needed to train the device, which was 140 minutes per day on average for eight days in this study. 


They also note that “a 24 percent word error rate is probably not yet sufficiently low for everyday use (for example, compared with a 4–5 percent word error rate for state-of-the-art speech-to-text systems)”, although “with further language model improvements and, when mitigating the effect of within-day non-stationarities, we were able to reduce word error rate to 11.8 percent in offline analyses.”

The other study involved a patient who had a brainstem stroke years prior. The authors write that they “trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences,” and say that the decoders “reached high performance” with under two weeks of training. They write that “For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25 percent.”

According to an accompanying News and Views article, when translated to synthesized speech, the model had an error rate of 54.4 percent for a 1,024-word vocabulary and 8.2 percent for a 119-word vocabulary. The authors personalized the patient’s synthesized speech to resemble her own, “conditioning the model on a short clip of our participant’s voice extracted from a pre-injury video of her.”

The researchers also created a digital avatar to reproduce facial expressions, using “an avatar-animation system designed to transform speech signals into accompanying facial-movement animations for applications in games and film (Speech Graphics).”


The authors of the study note that the results are only from one participant, and would need to be validated in other patients with varying degrees of paralysis.

In the supplementary material of the study, the patient writes that “the simple fact of hearing a voice similar to your own is emotional. Being able to have the ability to speak aloud is very important. The first 7 years after my stroke, all I used was a letterboard.”

“My husband was so sick of having to get up and translate the letterboard for me. We didn’t argue because he didn’t give me a chance to argue back. As you can imagine, this frustrated me greatly! When I had the ability to talk for myself was huge! Again, the ideal situation would be for it to be wireless. Being able to speak free-style would be ideal also.”

The patient says that her “moonshot” is to “become a counselor and use the system to talk to my clients. I think the avatar would make them more at ease.”


Both the first and second studies are published in Nature.

