Ciencias Exactas y Ciencias de la Salud
Permanent URI for this collection: https://hdl.handle.net/11285/551014
This collection contains the theses and degree projects of the doctoral programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.
Search Results
- Automatic detection of mental health disorders in social media (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-06-12) Villa Pérez, Miryam Elizabeth; Trejo Rodríguez, Luis Ángel; emipsanchez; González Mendoza, Miguel; Brena Pinero, Ramón Felipe; Moctezuma Ochoa, Daniela; Villaseñor Pineda, Luis; School of Engineering and Sciences; Campus Estado de México
With the rise of social media, these platforms have emerged as a crucial source of information for studying people's thoughts and behaviors. By using natural language processing and machine learning techniques, prior studies have explored the language of users living with different mental health conditions. However, these efforts have focused on analyzing conditions in isolation, particularly depression, and have relied on English-language data. The goal of this study is to examine the communications of English- and Spanish-speaking Twitter users through traditional and deep learning algorithms to automatically recognize whether they live with one of nine mental health conditions. To that end, we created two datasets in English and Spanish. The “diagnosed” set comprises the timelines of 1,500 users who explicitly reported in one or more of their posts having been diagnosed with one of the following: ADHD, Anxiety, Autism, Bipolar, Depression, Eating disorders, OCD, PTSD, and Schizophrenia. The “control” set comprises the timelines of 1,700 randomly selected users who had not disclosed a diagnosis. We extracted a variety of text features from the collected data, such as n-grams, q-grams, Part-of-speech (POS) tags, topic modeling, Linguistic Inquiry and Word Count (LIWC), and word embeddings, and trained traditional machine learning and deep learning classifiers for two tasks: binary classification, to distinguish between diagnosed and non-diagnosed users, and multiclass classification, to identify the specific diagnosis.
The performance of the models was analyzed using 5-fold cross-validation, four different classification metrics (AUC, F1-score, Precision, and Recall), and the Friedman non-parametric test with the Finner post-hoc procedure. Overall, XGBoost and CNN performed the best in the two classification tasks. Employing our collected datasets, in binary classification, we achieved an AUC of 0.835 on the Spanish Twitter dataset using word n-grams of length one to three (UBT) and 0.846 on the English Twitter dataset with a character 5-gram (C5) model. In multiclass classification, we obtained an AUC of 0.747 and 0.697 on the Spanish and English Twitter datasets, respectively. In the second phase of our research, we introduced a model named BiLEMD for the multiclass classification of mental disorders. Our approach adopts a hierarchical detection strategy, where each base model within our framework leverages diverse textual features. We aim to emulate, to some extent, the step-by-step approach employed in human clinical diagnostics. In clinical practice, professionals first determine the presence or absence of a condition before proceeding to specify its type. Although BiLEMD achieved the highest ranking on both the Spanish and English Twitter datasets, statistically significant differences were not observed. Nevertheless, additional analysis revealed that ensembles, including BiLEMD and Stacking, reduce misclassification within the control class. Moreover, BiLEMD exhibits slightly superior performance in terms of AUC and Recall compared to other classifiers. The development of computer-based methods for recognizing and classifying social media user profiles related to different mental health conditions could enhance the performance of applications aimed at early diagnosis and timely treatment.
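The word and character n-gram features and the two-stage detection strategy described in the abstract above can be sketched as follows. This is a minimal illustration and not the thesis implementation: the function names (`word_ngrams`, `char_ngrams`, `hierarchical_predict`), the lowercasing and whitespace tokenization, and the way the two stages are passed in as callables are all assumptions made for the example. The abstract does not describe BiLEMD's internals beyond the presence-then-type ordering.

```python
from collections import Counter
from typing import Callable


def word_ngrams(text: str, n_min: int = 1, n_max: int = 3) -> Counter:
    """Count word n-grams for every n from n_min to n_max (assumed whitespace tokens)."""
    tokens = text.lower().split()
    counts: Counter = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def char_ngrams(text: str, n: int = 5) -> Counter:
    """Count overlapping character n-grams, spaces included."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def hierarchical_predict(
    text: str,
    is_diagnosed: Callable[[str], bool],
    which_condition: Callable[[str], str],
) -> str:
    """Two-stage routing: decide presence first, then the specific type.

    `is_diagnosed` and `which_condition` are hypothetical stand-ins for
    trained binary and multiclass models.
    """
    if not is_diagnosed(text):
        return "control"
    return which_condition(text)
```

In this sketch the second-stage model is consulted only when the first stage reports a condition, which mirrors the presence-then-type ordering the abstract attributes to clinical practice. Any trained classifiers could be wrapped as the two callables.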
- Expansion of a novel bio-inspired supervised machine learning class applied to financial forecasting (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-02) González Núñez, Enrique; Trejo Rodríguez, Luis Ángel; emipsanchez; Hervert Escobar, Laura; Capote Sanchez, Alfredo Alberto Ramon; Ponce Espinosa, Hiram Eredín; Kampouridis, Michael; Escuela de Ingeniería y Ciencias; Campus Estado de México
This research applies Artificial Organic Networks (AON), a nature-inspired, supervised, metaheuristic machine learning framework, to define a new algorithm based on this machine learning class for computational finance, specifically for modeling and predicting a stock market based on the Index Tracking Problem (ITP). The relevance of computational finance is discussed, noting that it is an area that has developed significantly in recent decades, with applications such as rich portfolio optimization, index tracking, credit risk, and stock investment, among others. Specifically, the Index Tracking Problem (ITP) concerns the prediction of stock market prices, a complex NP-hard problem. In this regard, this work discusses an innovative approach that applies the AON method to the ITP; to that end, the concept of Artificial Halocarbon Compounds, or AHC-algorithm, is introduced as a supervised machine learning algorithm and as a new topology based on the AON framework. The discussion reviews some of the disadvantages of the original Artificial Hydrocarbon Networks (AHN) topology, which are taken into account in defining the new AHC-algorithm.
The capabilities of the AHC model are measured by modeling the IPC Mexico stock market index with the aid of other economic indicators. The results are very promising, with a computed R-square of 0.9919 and a mean relative error of 8e-4 for the forecast. As a main contribution, the new model is an adaptable, dynamic, and reconfigurable topology that can be applied to different approaches or systems that require simulation analysis using time series.

