Next-Generation Diagnosis of Voice Pathology Using Ensemble Learning Models


Voice disorders are common among people whose occupations demand heavy voice use. Because the vocal cords sit deep in the throat, an accurate diagnosis and a follow-up treatment plan require a specialized laryngoscope and an experienced specialist.


World first 1: Deep learning to recognize pathological voices

Our deep learning algorithm outperforms the classifiers used in previous studies. Under the same experimental conditions and on the same public database from the Massachusetts Eye and Ear Infirmary (MEEI), the detection rate rises from 98% to 99.14%, the best voice disease detection performance in the literature.


World first 2: Automatic classification of voice diseases

Our team published the first academic paper on the automatic classification of voice diseases, going beyond earlier work that only distinguished normal voices from abnormal ones. Using the patient's demographic characteristics (such as age and gender), personal medical history (such as smoking and drinking), and voice-related symptoms, the proposed algorithm classifies three common voice diseases (tumor, voice misuse, and vocal cord paralysis) with an accuracy of 83.0%.


World first 3: Multi-modal learning (medical history + voice)

Our team was the first, domestically or internationally, to combine dynamic acoustic signals with static medical history records, proposing a multi-modal fusion model that captures the important information in both. The classification accuracy reaches 87.26% ± 2.23%.
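
As a rough illustration of what fusing a dynamic acoustic signal with a static medical history record can look like, the sketch below pools frame-level features over time, concatenates them with a history vector, and applies a softmax over three classes. It is only a minimal sketch and not the team's model: the function name, the feature dimensions, the pooling choice, and the random weights are all assumptions made for the example.

```python
import numpy as np

def fuse_and_classify(acoustic_frames, history, weights, bias):
    """Hypothetical fusion step (not the published model).

    acoustic_frames: (T, D_a) array of frame-level acoustic features.
    history:         (D_h,) array of static medical-history features.
    weights, bias:   parameters of a linear layer over the fused vector.
    Returns class probabilities that sum to 1.
    """
    # Summarize the variable-length dynamic signal with per-feature mean and std.
    acoustic_summary = np.concatenate(
        [acoustic_frames.mean(axis=0), acoustic_frames.std(axis=0)]
    )
    # Fuse the two modalities by simple concatenation.
    fused = np.concatenate([acoustic_summary, history])
    logits = fused @ weights + bias
    # Numerically stable softmax.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
frames = rng.normal(size=(120, 13))        # assumed: 120 frames x 13 coefficients
history = np.array([0.52, 1.0, 0.0, 1.0])  # assumed: scaled age, sex, smoking, drinking
weights = rng.normal(scale=0.1, size=(2 * 13 + 4, 3))  # untrained, illustrative only
bias = np.zeros(3)

probs = fuse_and_classify(frames, history, weights, bias)
print(probs)  # one probability per class (e.g. the three diseases named above)
```

In practice the weights would be learned from labeled recordings, and the pooling and concatenation steps could be replaced by a learned sequence encoder; the sketch only shows how the two input types meet in one vector.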


 Booth No. L826 


 Looking for: chip designers, IoT system integrators, computer-aided diagnosis system developers, health examination agencies