Exact Sciences and Health Sciences

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains theses and degree projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.

  • Master's thesis
    The role of capitalization and character repetition in identifying depression on social Media: a bilingual approach
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-11-24) Burgueño Paz, Luis Humberto; Zareei, Mahdi; emipsanchez; Roshan Biswal, Rajesh; School of Engineering and Sciences; Campus Monterrey; García Ceja, Enrique Alejandro
    Depression is a mental disorder that affects millions of people worldwide, but a significant portion of those affected do not receive adequate treatment. Researchers have shown increasing interest in detecting this condition through social media posts in order to prompt early treatment. However, most of this research has focused on the Caucasian, Western, English-speaking population, which limits the applicability of its findings across diverse cultural contexts. While research has shown that nonverbal cues are used to convey sentiment, their role in depression detection remains under-explored. This thesis aims to assess the effect of nonverbal cues, specifically capitalization and character repetition, on depression detection using datasets in both English and Spanish. This effect was explored through three existing datasets. The first dataset, a collection of English-language Reddit posts and comments, was selected to assess the effect on data from one of the most reputable mental health competitions in Natural Language Processing. The second dataset consisted of Spanish-language messages from Telegram and was used to verify whether the findings for English would hold for Spanish. The third dataset, also built from Reddit posts, was used to analyze the impact of these features when classifying by depression severity level rather than by binary label. Four classifiers were used throughout this research: Logistic Regression, Random Forest, Support Vector Machine, and Neural Network. Overall, the impact of capitalization and character repetition on depression detection was found to be minimal. These features had the greatest effect on English Reddit data with binary labels, while showing limited impact on Spanish data or when classifying by severity level. Additionally, models using only character repetition outperformed those relying on capitalization features.
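    To make the two nonverbal cues named in the abstract concrete, the following is a minimal sketch of how capitalization and character-repetition features might be computed from a single post. It is an illustration only, not the thesis's actual feature set: the function name `nonverbal_features`, the four feature names, and the threshold of three or more repeated characters are all assumptions made here.

```python
import re

# Hypothetical helper: the feature definitions below are illustrative
# assumptions, not the features used in the thesis.
def nonverbal_features(text: str) -> dict:
    letters = [c for c in text if c.isalpha()]
    # Words made of Unicode letters only, so Spanish accents are handled.
    words = re.findall(r"[^\W\d_]+", text)

    # Capitalization: share of uppercase letters and count of all-caps words.
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    all_caps_words = sum(1 for w in words if len(w) > 1 and w.isupper())

    # Character repetition: runs of the same non-space character, length >= 3
    # (the threshold of 3 is an assumption).
    runs = [m.group(0) for m in re.finditer(r"(\S)\1{2,}", text)]

    return {
        "upper_ratio": upper_ratio,
        "all_caps_words": all_caps_words,
        "repeat_runs": len(runs),
        "longest_run": max((len(r) for r in runs), default=0),
    }


if __name__ == "__main__":
    print(nonverbal_features("I am SOOOO tired!!!"))
    print(nonverbal_features("ESTOY muuuy triste"))
```

    Features of this kind could be appended to a text representation and passed to any of the four classifiers mentioned above; how the thesis actually combined them is not stated in this record.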
Unless otherwise specified, these materials are shared under the following terms: Attribution-NonCommercial-NoDerivatives CC BY-NC-ND http://www.creativecommons.mx/#licencias
Users are obliged to use the services and content provided by the University, in particular printed and electronic resources, in accordance with current legislation, the principles of good faith, and generally accepted practice, without contravening public order, especially where, for the proper performance of their activity, they need to reproduce, distribute, communicate, and/or make available fragments of works in print or capable of being in analog or digital format, whether on paper or in electronic form. Law 23/2006 of 7 July, amending the revised text of the Intellectual Property Law, approved

DSpace software copyright © 2002-2025