Overview

Approximately 50 million people worldwide currently live with mild cognitive impairment (MCI) or dementia, a number projected to triple within the next 20 years¹. This growth calls for accessible, cost-effective tools for early detection.

Research has shown that changes in language function and brain structure can be early indicators of cognitive impairment²,³. Thus, we are developing multi-modal models that enhance prediction and detection of dementia using speech and imaging markers extracted via novel transformer-based approaches.

Aims

Based on our preliminary work, we can feasibly use our methods to pursue the following specific aims:

Evaluate Individual Predictive Capability: Develop and validate three deep learning models using linguistic, acoustic, and imaging markers to predict dementia or MCI with over 85% accuracy.
Identify AD Markers: Derive and analyze linguistic, acoustic, and imaging markers to find associations with Alzheimer’s disease biomarkers.
Determine Prediction Power: Construct and test a multi-modal model using the associated AD markers extracted with CaLM and DemImNet.

Figure 1. Custom-tailored transformers and their architecture

Figure 2. Proposed model framework

Extraction of Acoustic, Linguistic, and Neuroimaging Features

Acoustic Deep Feature Extraction via Librosa

We extract acoustic features from segmented, preprocessed audio samples using Librosa, a Python audio analysis library. The extracted features are: Root Mean Square Energy (RMSE), Speech Rate, Harmonics-to-Noise Ratio (HNR), Formants, Chromagram, Mel-Frequency Cepstral Coefficients (MFCCs), Mel Spectrogram, and Spectral Centroid.

Dementia Image Network (DemImNet)

Our transformer-based Dementia Imaging Network (DemImNet) extracts neuroimaging features from raw imaging data. The extracted features are derived from postmortem MRI and tissue cell detection.

Cognition Assessment Language Model (CaLM)

Our transformer-based Cognition Assessment Language Model (CaLM) will extract linguistic features from our segmented audio samples. The samples will be taken from recorded interviews, then segmented and transcribed. CaLM will then assign numerical values to the following linguistic features: grammatical errors, word length, word embeddings, and sentiment.
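To make "assigning numerical values" concrete for the simplest of these features, the sketch below turns a transcribed segment into word-length statistics. This is not CaLM and does not reflect its interface, which is not described here: the function `word_length_features`, the tokenization rule, and the example sentence are all hypothetical. Grammatical errors, word embeddings, and sentiment would each require a trained model and are left out.

```python
import re
from statistics import mean


def word_length_features(transcript: str) -> dict:
    """Compute simple word-length statistics for one transcribed segment.

    Hypothetical helper: words are runs of letters and apostrophes,
    an assumed tokenization chosen only for this illustration.
    """
    words = re.findall(r"[A-Za-z']+", transcript)
    lengths = [len(w) for w in words]
    return {
        "n_words": len(words),
        "mean_word_length": mean(lengths) if lengths else 0.0,
        "max_word_length": max(lengths, default=0),
    }


# Made-up example sentence standing in for a transcribed segment.
example = "I went to the store and bought some bread"
features = word_length_features(example)
```

Each transcript segment then maps to a small numeric record that can sit alongside the acoustic and imaging features.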

Acknowledgements

We thank the participants and staff of the Aging and Disability Resource Connection (ADRC) and The 90+ Study for providing us with spontaneous speech data. We would also like to thank the University of California, Irvine for their support in enabling this project!

References

1. ADI – Dementia statistics. (n.d.). Retrieved May 31, 2024, from https://www.alzint.org/about/dementia-facts-figures/dementia-statistics/
2. McCullough KC, Bayles KA, Bouldin ED. Language Performance of Individuals at Risk for Mild Cognitive Impairment. Journal of Speech, Language, and Hearing Research. 2019 Mar 25;62(3):706–22. https://doi.org/10.1044/2018_JSLHR-L-18-0232
3. Mueller KD, Koscik RL, Hermann BP, Johnson SC, Turkstra LS. Declines in Connected Language Are Associated with Very Early Mild Cognitive Impairment: Results from the Wisconsin Registry for Alzheimer’s Prevention. Front Aging Neurosci. 2018 Jan 9;9. https://doi.org/10.3389/fnagi.2017.00437

Sajjadi Lab

Deep Learning-Based Dementia Detection using Digital Markers