D3TEC Dataset: a data collection for deep learning research in depression classification featuring voice recordings of Spanish speakers using professional and cellphone microphones

dc.audience.educationlevel: Investigadores/Researchers [es_MX]
dc.contributor.advisor: Trejo Rodríguez, Luis Ángel
dc.contributor.author: Brenes García, Luis Felipe
dc.contributor.cataloger: emimmayorquin
dc.contributor.committeemember: Villaseñor Pineda, Luis
dc.contributor.committeemember: Sosa Hernández, Víctor Adrián
dc.contributor.department: School of Engineering and Sciences [es_MX]
dc.contributor.institution: Campus Monterrey [es_MX]
dc.contributor.mentor: Cantoral Ceballos, José Antonio
dc.date.accepted: 2024-05-23
dc.date.accessioned: 2025-05-10T00:34:26Z
dc.date.issued: 2024-05
dc.description: https://orcid.org/0000-0001-9741-4581
dc.description.abstract: Depression is a mental health condition that affects millions of people worldwide. Although common, it remains difficult to diagnose because of its heterogeneous symptomatology. Mental health questionnaires are currently the most widely used method for depression screening; however, they are inherently subjective because they depend on patients' self-assessments. Researchers have therefore sought an objective biomarker that identifies depression accurately. Recent developments in neural networks and deep learning have made it possible to classify depression through the computational analysis of voice recordings. This approach, however, depends heavily on the availability of datasets for training and testing deep learning models, and such datasets are scarce and cover very few languages. This study proposes a protocol for collecting a new dataset for deep learning research on voice-based depression classification. The dataset features Spanish speakers recorded with a professional microphone under strict audio quality standards. The data is also captured with a smartphone microphone to support further research on smartphone applications for depression identification. The methodology involves the strategic collection of depressed and non-depressed voice recordings. Three types of data are collected: voice recordings, depression labels (obtained with the PHQ-9 questionnaire), and additional data that could potentially influence speech. Recordings are captured with professional-grade and smartphone microphones simultaneously to ensure versatility and practical applicability. Several considerations and guidelines are described to ensure high audio quality and avoid potential bias in deep learning research.
This data collection effort immediately enables new research topics on depression classification. Potential uses include deep learning research on Spanish speakers, an evaluation of how audio quality affects the development of audio classification models, and an evaluation of whether voice-based depression classification is applicable in smartphone applications. A preliminary experimentation section showcases the research areas that this dataset enables. This research marks a significant step towards the objective and automated classification of depression in voice recordings. By focusing on the underrepresented demographic of Spanish speakers, including smartphone recordings, and addressing current data limitations in audio quality, this study lays the groundwork for future advances in deep learning-driven mental health diagnosis. [es_MX]
dc.description.degree: Master of Science in Computer Science [es_MX]
dc.format.medium: Text [es_MX]
dc.identificator: 120318
dc.identifier.citation: Brenes, L. F. (2024). D3TEC Dataset: A data collection for deep learning research in depression classification featuring voice recordings of Spanish speakers using professional and cellphone microphones [Master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/703640
dc.identifier.cvu: 1239618 [es_MX]
dc.identifier.orcid: https://orcid.org/0009-0003-9901-6140
dc.identifier.uri: https://hdl.handle.net/11285/703640
dc.language.iso: eng [es_MX]
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey [es_MX]
dc.relation: Instituto Tecnológico y de Estudios Superiores de Monterrey
dc.relation: CONAHCYT
dc.relation: Organización de los Estados Americanos (OEA)
dc.relation.isFormatOf: acceptedVersion [es_MX]
dc.rights: openAccess [es_MX]
dc.rights.uri: http://creativecommons.org/licenses/by-nd/4.0 [es_MX]
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::INFORMATION SYSTEMS, DESIGN AND COMPONENTS
dc.subject.keyword: Automated depression diagnosis
dc.subject.keyword: Depression
dc.subject.keyword: Spanish language dataset
dc.subject.keyword: Depression binary classification
dc.subject.keyword: High-Quality Audio Data
dc.subject.keyword: Deep Learning for Audio Analysis
dc.subject.keyword: High-Quality and Smartphone Audio Recordings
dc.subject.lcsh: Technology [es_MX]
dc.title: D3TEC Dataset: a data collection for deep learning research in depression classification featuring voice recordings of Spanish speakers using professional and cellphone microphones [es_MX]
dc.type: Master's Thesis [es_MX]
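
The abstract states that depression labels are obtained with the PHQ-9 questionnaire, and the keywords mention binary classification, but this record does not say how questionnaire scores are turned into labels. The following is a minimal Python sketch of one plausible labeling step. It assumes standard PHQ-9 scoring (nine items, each scored 0 to 3, for a total of 0 to 27) and a hypothetical cutoff of 10, a commonly used screening threshold that this record does not confirm for the thesis. All function and constant names are illustrative.

```python
from typing import Sequence

PHQ9_ITEM_COUNT = 9        # the PHQ-9 has nine items
PHQ9_MAX_ITEM_SCORE = 3    # each item is scored 0, 1, 2, or 3
ASSUMED_CUTOFF = 10        # assumption: common screening cutoff, not taken from the thesis


def phq9_total(item_scores: Sequence[int]) -> int:
    """Validate nine PHQ-9 item scores and return their sum (0 to 27)."""
    if len(item_scores) != PHQ9_ITEM_COUNT:
        raise ValueError(f"expected {PHQ9_ITEM_COUNT} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not 0 <= score <= PHQ9_MAX_ITEM_SCORE:
            raise ValueError(f"item score out of range 0..{PHQ9_MAX_ITEM_SCORE}: {score}")
    return sum(item_scores)


def binary_label(item_scores: Sequence[int], cutoff: int = ASSUMED_CUTOFF) -> int:
    """Return 1 (depressed) if the total reaches the cutoff, else 0 (non-depressed)."""
    return int(phq9_total(item_scores) >= cutoff)
```

Keeping the cutoff as a parameter lets the same recordings be relabeled if a different threshold is preferred.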

Files

Original bundle

Name: BrenesGarcıa_TesisMaestria.pdf
Size: 18.19 MB
Format: Adobe Portable Document Format
Description: Master's thesis

Name: BrenesGarcia_CartaAutorizacion.pdf
Size: 77.58 KB
Format: Adobe Portable Document Format
Description: Authorization letter

Name: BrenesGarcıa_FirmaActadeGrado.pdf
Size: 320.2 KB
Format: Adobe Portable Document Format
Description: Signed degree certificate

License bundle

Name: license.txt
Size: 1.3 KB
Format: Item-specific license agreed upon to submission
