Improving deep neural networks to identify depression using neural architecture search

Trejo Rodríguez, Luis Ángel
Hernández Silva, Erick
2025-07-01
2025-06
Hernández Silva, E. (2025). Improving deep neural networks to identify depression using neural architecture search [Master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/703796

A Neural Architecture Search (NAS) framework utilizing Evolutionary Algorithms (EAs) and a regressor model is proposed to improve the classification performance of Deep Neural Networks (DNNs) for the early detection of Major Depressive Disorder (MDD) from speech data represented by Mel-Spectrograms. The framework automates the design of neural network architectures by systematically exploring a well-defined search space that integrates convolutional layers, batch normalization, dropout, max pooling, and self-attention mechanisms, aiming to capture both spatial and temporal features inherent in vocal signals. By optimizing for the F1-score, the framework addresses challenges related to data imbalance, ensuring robust generalization across both depressed and non-depressed samples.

The proposed approach employs an integer-based encoding scheme to represent candidate architectures, coupled with repair and validation processes that ensure all architectures meet specific design constraints. A self-adaptive mechanism dynamically adjusts the mutation factor based on evolutionary feedback, improving the balance between exploration and exploitation during the search process. Furthermore, a surrogate model, built using Principal Component Analysis (PCA) and an XGBoost regressor, predicts architecture performance, significantly reducing computational costs by avoiding full model training for all candidates.
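The integer encoding, repair step, and self-adaptive mutation described above can be illustrated with a minimal sketch. The abstract does not specify the actual encoding, constraints, or adaptation rule, so everything here is an assumption made for illustration: the layer numbering in LAYERS, the two constraints (start with a convolution, at most MAX_POOLS pooling layers), the reading of the mutation factor as a per-gene resampling probability, and the 0.9/1.1 update in adapt_rate. Genomes are assumed to be non-empty.

```python
import random

# Hypothetical layer vocabulary: the abstract names these layer types,
# but the integer assigned to each one is an assumption.
LAYERS = {0: "conv", 1: "batch_norm", 2: "dropout", 3: "max_pool", 4: "self_attention"}
CONV, MAX_POOL = 0, 3
MAX_POOLS = 3  # assumed constraint: cap on the number of pooling layers


def repair(genome):
    """Return a copy of a non-empty genome that meets the assumed constraints."""
    # Map out-of-range genes back into the layer vocabulary.
    fixed = [g % len(LAYERS) for g in genome]
    # Assumed constraint: the network starts with a convolution.
    fixed[0] = CONV
    # Replace pooling layers beyond the cap with convolutions.
    pools = 0
    for i, g in enumerate(fixed):
        if g == MAX_POOL:
            pools += 1
            if pools > MAX_POOLS:
                fixed[i] = CONV
    return fixed


def is_valid(genome):
    """Validation step: check the same assumed constraints without changing the genome."""
    return (
        len(genome) > 0
        and all(g in LAYERS for g in genome)
        and genome[0] == CONV
        and genome.count(MAX_POOL) <= MAX_POOLS
    )


def mutate(genome, rate, rng):
    """Resample each gene with probability `rate`, then repair the child."""
    child = [rng.randrange(len(LAYERS)) if rng.random() < rate else g for g in genome]
    return repair(child)


def adapt_rate(rate, improved, lo=0.05, hi=0.5):
    """Assumed feedback rule: shrink the rate after an improvement, grow it otherwise."""
    new_rate = rate * 0.9 if improved else rate * 1.1
    return min(hi, max(lo, new_rate))
```

In a search loop of this kind, `improved` would come from comparing the best fitness (for example, validation F1-score or a surrogate prediction) of the current generation with that of the previous one, so the rate drops while the search is making progress and rises when it stalls.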

Experimental validation, conducted on publicly available speech datasets, demonstrates that NAS-generated architectures may outperform manually designed state-of-the-art models in terms of F1-score, accuracy, precision, recall, and specificity. The results confirm the effectiveness of integrating self-attention mechanisms with convolutional operations for extracting relevant depression-related vocal biomarkers. This research underlines the potential of NAS in advancing non-invasive, scalable, and interpretable AI-driven tools for mental health assessment, contributing to early intervention strategies and improving clinical outcomes in depression diagnosis.

Text
eng
openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0
ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::MEDICAL TECHNOLOGY::OTHER
MEDICINE AND HEALTH SCIENCES::MEDICAL SCIENCES::PSYCHIATRY::PSYCHOTHERAPY
MEDICINE AND HEALTH SCIENCES::MEDICAL SCIENCES::PUBLIC HEALTH::OTHER
Medicine; Technology; Science
Master's thesis
Neural Architecture Search (NAS); Depression Detection; Speech Analysis; Convolutional Neural Networks (CNNs); Self-Attention Mechanism; Evolutionary Algorithms; Spectrograms; Surrogate Models; Deep Learning; Mental Health
1317819