Nov 2, 2022

Using Voice and Machine Learning to Diagnose Disease

Professors Jordan Lerner-Ellis and Frank Rudzicz.

Professors Jordan Lerner-Ellis and Frank Rudzicz, both of the department of laboratory medicine and pathobiology, are part of a group of researchers that received new funding to use AI to tackle complex biomedical challenges. The $14-million grant was awarded to the group, which spans ten universities, by the National Institutes of Health’s Bridge2AI program.  

The funded project will be led by researchers at the University of South Florida and Weill Cornell, with the goal of building an ethically sourced database of diverse human voices. Using this data, machine learning models will be trained to detect diseases by identifying changes in the human voice, which could give doctors a low-cost diagnostic tool to use alongside other clinical methods. Changes in a person’s voice can indicate conditions such as Alzheimer’s disease or autism.

The group of medical, voice, AI, engineering, and ethics experts will study disease categories such as: 

  • Voice disorders (laryngeal cancers, vocal fold paralysis, benign laryngeal lesions)

  • Neurological and neurodegenerative disorders (Alzheimer’s, Parkinson’s, stroke, ALS) 

  • Mood and psychiatric disorders (depression, schizophrenia, bipolar disorders) 

  • Respiratory disorders (pneumonia, COPD) 

  • Pediatric voice and speech disorders (speech and language delays, autism) 

Lerner-Ellis is an associate professor of laboratory medicine and pathobiology and the co-head and co-director of the Advanced Molecular Diagnostics Laboratory at Mount Sinai Hospital. The lab offers clinical genetic testing services for the province of Ontario. Lerner-Ellis' research team has a long-standing interest in understanding genomic data and the human genome. 

As the genomic cohort lead, Lerner-Ellis will work with clinics in Toronto to enroll patients and collect samples for genome sequencing.

“Our aim is to build a publicly available database that researchers from around the world can use to study the relationship between genetics and disease and voice disorders; this approach could eventually lead to the development of valuable diagnostic tools. We will look at the relationship between genetic variation and the voice. Where there is a known causal molecular biomarker, such as for Alzheimer's, we can return this information to clinicians and research participants,” explains Lerner-Ellis. 

His team will play a critical role in interacting with participants. “We do a lot of genome sequencing and returning these results to participants can provide important information such as facilitating a diagnosis and helping to inform prognosis and patient management. Our team will be directly engaged in patient enrollment and consent, sample collection, genome sequencing and pre- and post-test genetic counselling,” he says. 

Rudzicz is an associate professor of laboratory medicine and pathobiology and computer science, and a scientist at the Li Ka Shing Knowledge Institute at Unity Health Toronto. An expert in machine learning in healthcare, especially in natural language processing, speech recognition and surgical safety, he has been working on speech analysis in aphasia and dementia.

Leading the neurological cohort, Rudzicz is part of the data acquisition team and will focus on behavioral, speech and language data.

“When datasets are too small, the generalizability of models trained on those datasets becomes a bit of a problem. With this study, we’ll be able to look at a broad range of diseases and with the data we'll be able to get a much deeper sense of the phenomena,” says Rudzicz. 

“This project is ambitious and very large. We cover everything from natural data collection and analysis to ethical considerations. We also overcome one of the key challenges in this kind of science, which is siloed, private data sets. If data are private, it’s not really possible to reproduce the results so, in the interests of science, having clean, large, public data sets which can be used as a baseline for many different methods is so important. I'm really thrilled about that aspect of this project.”  

Read more about the project on the University of South Florida website.