Designing Voices: Rupal Patel, MHSc ’95, PhD ’00

Rupal Patel, MHSc ’95, PhD ’00, is an internationally renowned expert in speech motor control and assistive communication technology. She is CEO and founder of VocaliD Inc., a company that uses artificial intelligence to create unique, personalized digital voices.
Renowned speech scientist Rupal Patel took a leave from her tenured professorship with Northeastern University to launch VocaliD. She spoke with the Faculty of Medicine’s Karen Lee about her entrepreneurial journey.
What inspired you to launch VocaliD?
I direct an interdisciplinary lab at Northeastern University in Boston. Our focus is on understanding speech motor control in individuals with speech impairments and finding ways to leverage their abilities through novel assistive technologies. My doctoral work at the University of Toronto laid the groundwork for this research. My research revealed that even those individuals with severe impairments could control the melodic aspects of their voice. These individuals would communicate using a synthetic voice through text-to-speech devices. I found it puzzling that so many individuals were using the same or similar computerized voice — it’s odd to have a young girl and an old man communicating with the same voice. I thought there had to be a way to craft a more personalized voice.
How has VocaliD changed the process of building a digital voice?
Before VocaliD came along, a voice actor needed to read thousands of sentences in a recording studio. A team of linguists and scientists would then process the audio to build a synthetic voice. It was time-consuming and expensive, taking anywhere between 40 and 60 hours per voice, which could cost tens to hundreds of thousands of dollars.
At VocaliD, we build unique voices for those who cannot speak by recording samples of whatever sounds a person can still make. Once we have a voice sample, we search our voice database to find a surrogate talker. This person is similar to the recipient in characteristics such as age, gender and geographic location. Our algorithms then take several hours of recordings from the surrogate speaker and blend them with the original sample of the recipient to create a unique blended voice. The output is a digital voice created specifically for our client for use on his or her device. VocaliD currently offers personalized voices for those with a speech impairment at approximately $1,500, a price that is heavily subsidized by grants and represents a significant cost savings for the end user.
How do you find voice surrogates?
With the widespread adoption of consumer technology and ease of access to recording devices, we decided to build an online platform to crowdsource voices. Everyday citizens record scripts of short phrases and stories on the voicebank platform. This allows people of all ages from around the world to contribute and helps minimize the costs of the technology to end users. Our engineers and algorithms learn from the recordings and transform the speaking patterns into a text-to-speech engine. Today the Human Voicebank is a community of 26,000 voice donors across the world, ranging in age from six to 91 years old.
What drew you to this type of work?
During my undergraduate studies, my brother-in-law suggested I shadow a speech pathologist. That experience inspired me to pursue a Master’s in Speech-Language Pathology. After working as a speech and language therapist for a couple of years, I returned to U of T to pursue my doctorate with a focus on speech technology. Since then, my research has always been at the intersection of basic and applied science technology. I am a scientist with the heart of an engineer. I’ve always loved building things, and I think this is what motivated me to think about how we could build voices that suited someone’s vocality – their vocal personality.
What is the impact of giving someone a voice?
One of our first recipients in 2015 was a gentleman with ALS who had lost his ability to speak. We had built several versions of his voice because back then we were still learning what it took to create a unique voice that also fit the user. When we played him the first sample of the voice we had built, he politely nodded and indicated it was okay. When he heard the last of the three voices, his entire body went into shakes, his wife started crying and they both started talking about the fact that the voice was him. Seeing his reaction and what it meant to him and his family was extremely gratifying. With VocaliD we are not just creating voices, we are giving our clients a means for communication and self-expression.
What advice would you give to students?
Follow your passion; don’t let your degree define what you do. We need more people to break through their silos and find ways to bridge gaps between our knowledge bases.
By applying speech science to this venture, I am discovering new problems and new perspectives, which feed back into research. Innovation rooted in academia has the potential to have an incredible impact, but the path to actualizing it isn’t always clear. Much like any experiment we run in the lab, you start with a hypothesis, you test it and you learn. The only difference is that the iteration cycle is much faster in ventures, but when backed by science there is nothing that can stop us. My time at U of T gave me the confidence that I could apply my knowledge in many different ways, and I think that’s very powerful.
