Ciencias Exactas y Ciencias de la Salud

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains master's theses and degree projects from the Schools of Engineering and Sciences and of Medicine and Health Sciences.

Now showing 1 - 10 of 22
  • Master's thesis
    Enhancing BGP security with MAD anomaly detection system and machine learning techniques
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-12) Romo Chavero, María Andrea; Cantoral Ceballos, José Antonio; emipsanchez; Botero Vega, Juan Felipe; Navarro Barrón, Francisco Javier; School of Engineering and Sciences; Campus Monterrey; Pérez Díaz, Jesús Arturo
    Anomalies in the Border Gateway Protocol (BGP) represent a significant vulnerability in the Internet’s infrastructure, as they can cause widespread disruptions, traffic misdirection, and even security breaches. Proactive detection of these anomalies is vital to preserving network stability and preventing potential cyberattacks. In response to this challenge, we present the Median Absolute Deviation (MAD) anomaly detection system, which combines traditional statistical methods with advanced machine learning (ML) techniques for more precise and dynamic detection. Our approach introduces a novel adaptive threshold mechanism, allowing the system to adjust based on the changing conditions of network traffic. This dynamic thresholding significantly improves the accuracy, precision, and F1-score of anomaly detection compared to the previous fixed-threshold version. Additionally, we integrate the MAD system with a diverse set of ML classifiers, including Random Forest, XGBoost, LightGBM, CatBoost, and ExtraTrees, to enhance the system’s ability to identify complex patterns that indicate unusual BGP behavior. We evaluate our detection system on well-documented BGP anomaly events, such as the Slammer worm, Nimda, Code Red 1 v2, the Moscow blackout, and the Telekom Malaysia misconfiguration. The results show that our system, when combined with ML models, achieves an overall accuracy and F1-score of 0.99, demonstrating its effectiveness across various anomaly types. By using both statistical and ML models, the system is able to capture irregularities that could signal security threats, offering a more comprehensive detection solution. This research highlights the importance of combining statistical anomaly detection with ML to obtain a balance between accuracy and computational efficiency. The system’s low resource requirements and minimal pre-processing make it highly scalable, allowing it to be potentially deployed in real-time on large-scale networks.
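    As a rough illustration of the statistical idea named in this abstract, the sketch below flags a value when it deviates from the median of a trailing window by more than k scaled MADs. Because the median and MAD are recomputed from recent history, the threshold moves with the data. The function name `mad_anomalies`, the `window` and `k` parameters, and the trailing-window scheme are assumptions made for illustration; the abstract does not describe the thesis's actual adaptive mechanism or its BGP features.

```python
from statistics import median


def mad_anomalies(values, window=5, k=3.0):
    """Flag values that deviate from the trailing-window median by more
    than k scaled MADs. `window` and `k` are illustrative, not from the thesis."""
    flags = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history to judge this point
            continue
        med = median(history)
        mad = median([abs(h - med) for h in history])
        # 1.4826 makes the MAD comparable to a standard deviation
        # when the data are normally distributed.
        scale = 1.4826 * mad
        if scale == 0:
            flags.append(x != med)  # flat history: any change is flagged
        else:
            flags.append(abs(x - med) / scale > k)
    return flags


# Hypothetical per-interval counts of BGP update messages.
updates = [10, 11, 10, 12, 11, 10, 95, 11, 10]
flags = mad_anomalies(updates)
```

    With the hypothetical series above, only the spike of 95 is flagged; the median and MAD are barely moved by that single outlier, so the following points are still judged against a sensible baseline.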
  • Master's thesis
    Multimodal data fusion algorithm for image classification
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-11) Beder Sabag, Taleb; Vargas Rosales, César; emipsanchez; Pérez García, Benjamín de Jesús; School of Engineering and Sciences; Campus Monterrey
    Image classification algorithms are a tool that can be applied in a variety of research sectors, but some of this research needs an extensive amount of data for the model to obtain appropriate results. A workaround for this problem is to implement a multimodal data fusion algorithm, a model that uses data from different acquisition frameworks to compensate for the missing data. In this paper, we discuss the generation of a CNN model for image classification using transfer learning from three types of architectures in order to compare their results and use the best model; we also implement a Spatial Pyramid Pooling layer to be able to use images with varying dimensions. The model is then tested on three uni-modal datasets to analyze its performance and tune its hyperparameters according to the results. We then use the optimized architecture and hyperparameters to train a model on a multimodal dataset. The aim of this thesis is to generate a multimodal image classification model that can be used by researchers and others who need to analyze images for their own purposes, avoiding the need to implement a model for each specific study.
  • Master's thesis
    Machine translation for suicide detection: validating Spanish datasets using machine and deep learning models
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-11) Arenas Enciso, Francisco Ariel; Zareel, Mahdi; emipsanchez; García Ceja, Enrique Alejandro; Roshan Biswal, Rajesh; School of Engineering and Sciences; Sede EGADE Monterrey
    Suicide is a complex health concern that affects not only individuals but society as a whole. The application of traditional strategies to prevent, assess, and treat this condition has proven inefficient in a modern world in which interactions are mainly made online. Thus, in recent years, multidisciplinary efforts have explored how computational techniques could be applied to automatically detect, from textual input, individuals who desire to end their lives. Such methodologies rely on two main technical approaches: text-based classification and deep learning. Further, these methods rely on datasets labeled with relevant information, often sourced from clinically-curated social media posts or healthcare records, and more recently, public social media data has proven especially valuable for this purpose. Nonetheless, research focused on the application of computational algorithms for detecting suicide or its ideation is still an emerging field of study. In particular, investigations on this topic have recently considered specific factors, like language or socio-cultural contexts, that affect the causality, rationality, and intentionality of an individual’s manifestation, to improve the assessment made on textual data. Consequently, problems like the lack of data in non-Anglo-Saxon contexts that computational techniques could exploit for detecting suicidal ideation remain unresolved. Thus, this thesis addresses the limited availability of suicide ideation datasets in non-Anglo-Saxon contexts, particularly for Spanish, despite its global significance as a widely spoken language. The research hypothesizes that machine-translated Spanish datasets can yield comparable results (within a ±5% performance range) to English datasets when training machine learning and deep learning models for suicide ideation detection.
    To test this, multiple machine translation models were evaluated, and the two best-performing models were selected to translate an English dataset of social media posts into Spanish. The English and translated Spanish datasets were then processed through a binary classification task using SVM, Logistic Regression, CNN, and LSTM models. Results demonstrated that the translated Spanish datasets achieved performance scores close to those of the original English set across all classifiers, with variations in accuracy, precision, recall, F1-score, ROC AUC, and MCC remaining within the hypothesized ±5% range. For example, the SVM classifier on the translated Spanish sets achieved an accuracy of 90%, closely matching the 91% achieved on the original English set. These findings confirm that machine-translated datasets can serve as effective resources for training ML and DL models for suicide ideation detection in Spanish, thereby supporting the viability of extending suicide detection models to non-English-speaking populations. This contribution provides a methodological foundation for expanding suicide prevention tools to diverse linguistic and cultural contexts, potentially benefiting health organizations and academic institutions interested in psychological computation.
  • Master's thesis
    Emotion recognition based on physiological signals for Virtual Reality applications
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2022-06-13) Oceguera Cuevas, Daniela; FUENTES AGUILAR, RITA QUETZIQUEL; 229297; Fuentes Aguilar, Rita Quetziquel; puemcuervo; Antelis Ortíz, Javier Mauricio; Fernández Cervantes, Victor; School of Engineering and Sciences; Campus Monterrey; Hernández Melgarejo, Gustavo
    Virtual Reality (VR) systems have been used with increasing frequency in recent years because they can be implemented for multiple applications in various fields, including aerospace, military, psychology, education, and entertainment. A way to increase the sense of presence is to induce emotions through the virtual environment (VE), and since one of the main purposes of VR systems is to evoke the same emotions as a real experience would, the induction of emotions and emotion recognition could be used to enhance the experience. The emotion of a user can be recognized through the analysis and processing of physiological signals such as Electrocardiogram (ECG) and Electrodermal Activity (EDA) signals. However, very few systems have been developed that present online feedback regarding the subject’s emotional state and offer the possibility of adapting the VE during the user experience. This thesis proposes the development of a Virtual Reality video game that can be dynamically modified according to the physiological signals of a user to regulate their emotional state. The first experiment served for the creation of a database. Previous studies have shown that specific features from these signals can be used to develop algorithms capable of classifying the emotional states of the subjects into multiple classes or along the two emotional dimensions: valence and arousal. Thus, this experiment helped to develop an appropriate Virtual Reality video game for stress induction, a signal acquisition and conditioning system, and a signal processing model, and to extract time-domain signal features offline. A statistical analysis was performed to find significant differences between game stages, and machine learning algorithms were trained and tested to perform classification offline. A second experiment was performed for the Proof of Concept Validation. For this, a model was created to extract features online, and the classification algorithms were re-fitted with the online extracted features.
    Additionally, to facilitate a completely online process, the signal processing and feature extraction models were embedded on an STM32F446 Nucleo board, a strategy was implemented to dynamically modify the VE of the Virtual Reality video game according to the detected class, and the complete system was tested.
  • Master's thesis
    Evolutionary clustering using classifiers: definition, implementation, scalability, and applications
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2022-04-22) Sainz Tinajero, Benjamin Mario; GUTIERREZ RODRIGUEZ, ANDRES EDUARDO; 436765; Guiterrez Rodriguez, Andres Eduardo; puemcuervo; Ortiz Bayliss, Jose Carlos; Amaya Contreras, Ivan Mauricio; Medina Perez, Miguel Angel; School of Engineering and Sciences; Campus Estado de México
    Clustering is a Machine Learning tool for partitioning multi-dimensional data automatically into mutually exclusive groups, aiming to reflect the patterns of the phenomena it represents. Clustering algorithms perform this task conditioned by the clustering criterion modeled in its objective function. However, selecting the optimal criterion is a domain-dependent task that requires information on the cluster structure that a user often does not count on due to the unsupervised nature of the technique. Available approaches accentuate this problem as they perform clustering according to a similarity notion often limited to the concepts of compactness and connectedness, inducing bias and favoring clusters with certain shape, size, or density properties from using conventional distance functions. However, we cannot consider this a complete notion of a cluster because not every dataset will comply with both notions in the same proportion. Hence, research on this topic has not converged to a standard definition of a cluster, which raises the need for algorithms that produce adaptive solutions that mirror the underlying structures and relations within the data. This thesis is focused on the design of single-objective Evolutionary Clustering Algorithms that generate solutions that are not biased towards any cluster structure by optimizing a novel generalization clustering criterion. To achieve that, we designed objective functions modeled as a supervised learning problem, considering that a good partition should induce a well-trained classifier. That is how we decided to assess the quality of a clustering solution, according to its capability to train an ensemble of classifiers. The main contribution of this thesis is our series of Evolutionary Clustering Algorithms using Classifiers (the ECAC series), which introduces the aforementioned clustering criterion along with evolutionary computation. 
    This meta-heuristic allows us to model distinct criteria to optimize while creating and evaluating multiple solutions along the process. The experimental results in the design of our family of methods (ECAC, F1-ECAC, and ECAC-S) show an increase in similarity between the partitions created by our algorithms and the ground truth labels (obtained from the publicly available repositories where we retrieved the data), with a maximum Adjusted Rand Index of 0.96. Our second algorithm, F1-ECAC, proved the competitiveness of our contributions against traditional, single-objective, and multi-objective Evolutionary Clustering algorithms, showing no statistically significant difference against k-means, HG-means, and MOCLE. Our latest contribution, ECAC-S, was tested on a satellite image segmentation task, and it produced segmentations with a higher average Adjusted Rand Index than k-means, Spectral clustering, Birch, and DBSCAN in 4 out of 10 images.
  • Master's thesis
    Antimicrobial resistance prediction by bacterial genome-wide-association-study in non-fermenting bacilli with critical priority (Pseudomonas aeruginosa and Acinetobacter baumannii).
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-07) Barlandas Quintana, Erick Alan; MARTINEZ LEDESMA, JUAN EMMANUEL; 200096; Martinez Ledesma, Juan Emmanuel; puemcuervo; González Mendoza, Miguel; Garza González, Elvira; School of Engineering and Sciences; Campus Estado de México; Cuevas Díaz Durán, Raquel
    Antimicrobial resistance (AMR) (or drug resistance) is a natural phenomenon where microorganisms change their molecular, physical, or chemical structures to resist the drugs created to treat infections. The World Health Organization (WHO) released for the first time a list of Multidrug-Resistant Bacteria (MRB) that pose the greatest threat to human health and for which new antibiotics are desperately needed. Acinetobacter baumannii and Pseudomonas aeruginosa resistant to carbapenems are part of the Gram-negative non-fermenting bacilli group with critical priority according to the WHO. For this reason, the final research purpose was to develop a bioinformatic study capable of finding critical k-mers that could differentiate those strains of P. aeruginosa and A. baumannii resistant to carbapenems. Four k-mer sizes were tested for each bacterium (12, 14, 16, and 18), and two training and testing schemas (70:30 and 80:20) were used over seven different machine learning algorithms: Random Forest, AdaBoost, XGBoost, Decision Trees, Bagging Classifier, Support Vector Machine, and KNN. For both bacteria, the best models were obtained when using a k-mer length of 12. In the case of Acinetobacter baumannii, the best models obtained an accuracy of 0.99 for testing. For Pseudomonas aeruginosa, the best accuracy obtained was 0.93 when using Bagging Classifier. To investigate the sequences of the k-mers obtained, the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) was used. Ten to twenty sequences built with the k-mers were investigated for each model. When using a k-mer length of 12 for A. baumannii, 18 out of 20 sequences represented a crucial sequence in carbapenem (meropenem and imipenem) resistance. In the case of P. aeruginosa, 16 out of 20 sequences represented a key sequence.
    To complement this research, a Dynamic Programming algorithm was used to find changes over the reference genome that could explain the carbapenem resistance within the resistant genomes. Not all the resistant k-mer sequences were found over the reference genome, as some of them could have been acquired by horizontal transfer (conjugation, transformation, or transduction). Further investigation of these sequences can be applied in creating new targeted antibiotics or in easily detecting strains of Pseudomonas aeruginosa or Acinetobacter baumannii resistant to carbapenems.
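    To make the k-mer idea in this abstract concrete, the sketch below counts every overlapping substring of length k in a DNA sequence; counts like these are one common way to turn genomes into features for a classifier. The function name `kmer_counts` and the toy sequence are hypothetical, and the abstract does not state which tool or encoding the thesis actually used.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count all overlapping k-mers (substrings of length k) in a sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy example with k = 4; the thesis reports k-mer lengths of 12, 14, 16, and 18.
counts = kmer_counts("ACGTACGT", 4)
```

    In the toy example the 4-mer ACGT occurs twice and CGTA, GTAC, and TACG once each. A sequence shorter than k yields no k-mers.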
  • Master's thesis
    Sweet pepper recognition and peduncle pose estimation
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-03) Montoya Cavero, Luis Enrique; Escobedo Cabello, Jesús Arturo; puemcuervo; Gómez Espinosa, Alfonso; School of Engineering and Sciences; Campus Monterrey; Díaz de León Torres, Rocío
    As a result of the ongoing workforce decrease in the agricultural industry, there is an increased interest in harvesting robots for specialty crops. While a lot of research is available for produce detection, current vision systems still struggle to detect ripe produce under challenging conditions such as varying lighting and highly occluded scenarios. Because of this, research has mostly focused on improving the detection, localization, and orientation estimation by using state-of-the-art algorithms and sensors. In this document, a deep learning sweet pepper detection and pose estimation framework is proposed. The framework uses high-resolution colored images from an active RGB-D-based sensor to detect and segment individual green, red, orange, and yellow pepper, and their peduncle (produce stem) by pixel using a mask and region-based convolutional neural network. Then, using depth information from the sensor it estimates the pepper’s 3D location and the z-axis orientation of the camera reference frame if the peduncle is visible. Peduncle detection and localization are crucial as sweet pepper must be harvested individually by cutting the peduncle. Otherwise, if harvested by grasping (a grabbing and pulling motion) there is a high risk that the produce will be damaged. To validate the precision of the sweet pepper detection and pose estimation framework, a small-scale robotic arm was used. This process involved moving the end-effector towards either the peduncle, if it was detected, or the produce if no peduncle was detected. 
    The deep learning framework presented here achieved competitive results compared to state-of-the-art sweet pepper vision subsystems: the whole recognition process (detection and localization) took 0.86 seconds on average even though higher-resolution images were used; the detection process obtained an mAP@50 of 0.602 (mean average precision with an intersection over union of 50 percent); and, regarding the localization process, the vision subsystem obtained a pose estimation error of x: ±28.75 mm, y: ±21.25 mm, and ±15 mm, and an error of ±9.6° in the z-axis orientation of the camera reference frame.
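    The mAP@50 figure above depends on intersection over union (IoU): a detection counts as correct only when its overlap with the ground truth reaches 50 percent. The sketch below computes IoU for axis-aligned bounding boxes given as (x1, y1, x2, y2). The box representation and the names `iou` and `is_true_positive` are assumptions for illustration; the thesis segments peppers by pixel, where the same ratio would be computed over mask pixels instead.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A detection matches the ground truth at the @50 level when IoU >= 0.5."""
    return iou(pred_box, truth_box) >= threshold
```

    Mean average precision then averages, over classes, the area under the precision-recall curve built from such matches; that step is omitted here.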
  • Master's thesis
    Digital violence against women: a phenomenon exploration to understand and counteract from a Data Science perspective
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-02) Reyes González, Gregorio Arturo; Cantú Ortiz, Francisco Javier; puemcuervo/tolmquevedo; Galeano Sánchez, Nathalíe María; Ceballos Cancino, Héctor Gibrán; Serrano Estrada, Leticia; School of Engineering and Sciences; Campus Monterrey; Gabarrot Arenas, Mariana
    Investigations have shown that Violence Against Women is a pervasive problem that has been increasing over the last years. Until a few years ago, it took place both in public and private spaces, but it has now broken into Digital Space, adopting more symbolic expressions. Mexican cyberfeminists have fought to put this social problem on the public agenda, succeeding this past June in having Violence Against Women in Digital Space legally typified at the federal level. There has been some important related work from Data Science approaches, but mainly on cyberbullying and on the detection of language patterns through supervised algorithms, social network features, and profile information, with only a few works on unsupervised learning and on violence against women. However, it is important to tackle Digital Violence Against Women as a phenomenon with its own particularities, separate from cyberbullying. Moreover, it is necessary to study this phenomenon from a gender perspective, since all crimes against women are contained within a symbolic gender structure. The hypothesis of this Thesis Project is that Data Science approaches such as Text Mining, Supervised Learning, Time Series Analysis, Natural Language Processing, and Network Analysis can find associations between proposed variables in Spanish-language text datasets from the microblogging social network Twitter. The goal of this thesis is to implement Data Science techniques to analyze the Digital Violence Against Women phenomenon in order to identify major associations that will let us understand and counteract violent social discourses and structural violence in digital space. The proposed model is composed of several techniques, such as Time Series Analysis, Natural Language Processing, and Network Analysis, that are fed by the outcomes of an ensemble of Supervised Classifiers and an Ontological Matcher.
    Results indicate a higher presence of Digital Violence Against Women in the tweets predicted by the ensemble algorithm than in those predicted by the Supervised Learning Algorithms alone or the Ontological Matcher alone. Time Series Analysis shows peaks in Digital Violence Against Women on dates that correspond to days on which the fight for Women’s Rights was prominent. Natural Language Processing confirms the existence of a violent semantic discourse under this phenomenon. Network Analysis, in turn, exhibits generalized individual attacks connected to a structural and systemic problem. Finally, four strategies were proposed to counteract Digital Violence Against Women, based on detection, prevention, and specificity of the phenomenon.
  • Master's thesis
    Development of a type two diabetes predictive model for Mexicans applying to electronic health records dataset retrieved from National Public Data (ENSANUT 2018)
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-02) Fregoso Aparicio, Luis Martín; Noguez Monroy, Juana Julieta; puemcuervo; Cantú Ortiz, Francisco Javier; González Mendoza, Miguel; García García, José Antonio; School of Engineering and Sciences; Campus Estado de México; Montesinos Silva, Luis Arturo
    Diabetes mellitus is a chronic and severe disease that occurs when blood glucose levels rise above normal limits because the patient's body cannot produce the hormone insulin, produces an insufficient amount, or cannot use the hormone it produces efficiently. The American Diabetes Association establishes a diagnosis of diabetes when the HbA1c test result is higher than or equal to 6.5%, when basal fasting blood glucose (GB) is higher than 126 mg/dL, or when blood glucose 2 hours after an oral glucose tolerance test with 75 g of glucose (SOG) is greater than or equal to 200 mg/dL. Type 2 diabetes (T2D), formerly known as adult-onset diabetes, is a form of diabetes characterized by high blood sugar, insulin resistance, and a relative lack of insulin. In Mexico, 10.4 percent of the population had diabetes in 2016, compared with 7% of the population in 2006. In recent years, Machine Learning has been used to create predictive models for the onset of type 2 diabetes, making it feasible to develop one for the Mexican population. The model should have the capacity to detect undiagnosed diabetics using a national public dataset on type 2 diabetes mellitus in Mexico (ENSANUT 2018). The objective is to develop a predictive model of type 2 diabetes for Mexicans as a support tool that helps primary care physicians make a timely diagnosis, preventing the onset of diabetes or its complications and detecting diabetes early with higher accuracy than the few existing Mexican models. A systematic review of 91 studies was performed to identify promising machine learning techniques and features for creating novel type 2 diabetes predictive models, based on the PRISMA methodology combined with the methodology of Keele University and Durham University. The related work section found that tree-type clusters of machine learning algorithms produced the best predictive models.
    There are five candidate models for diabetes classification: Decision Tree, Random Forest, Gradient Boosting Tree, K-Nearest Neighbors, and Logistic Regression. The database selected for the model is the National Health and Nutrition Survey (ENSANUT 2018), a tool that shows the general health and nutrition conditions of a representative sample of the population of Mexico. It is divided into several datasets joined by a unique ID created with values of their variables. The target (HEMGLICLASS) is a binary categorical variable in which zero corresponds to a healthy person and one to a diabetic; the complete database has 11639 samples and 55 attributes. After cleaning it and balancing the samples for diabetic and healthy individuals, the final database has 21696 observations and 26 variables, composed of a categorization of the respondents' eating habits and their corresponding blood chemistry test values. After model selection and optimization on the ENSANUT database using the techniques described in the systematic review, the Random Forest Classifier has the best metrics for prediction and can be interpreted by physicians. The proposed model is a Random Forest with default values and fifteen attributes from the original ENSANUT database. The attributes are related to blood test measurements, as in the classical models, and add new features such as the intake of vegetables and fruits during the whole week as a protective factor, or an excessive intake of meat, dairy products, or candies as an aggravating factor. Once the model was built, it was validated with the actual data to ensure that accuracy and AUC(ROC) remain higher than 90 percent; three other metrics were also estimated. The results are accuracy: (0.90 ± 0.154), F1-Score: (0.86 ± 0.286), Precision: (0.94 ± 0.069), Sensitivity: (0.87 ± 0.294), and AUC(ROC): (0.92 ± 0.191).
    To demonstrate the superior predictive capacity of the new model versus the Olimpia Arrellano-Campos model, a test of equality of means with unknown variances was performed, using Student's t as the test statistic and the p-value as the rejection criterion. The result is a p-value of 0.00572, demonstrating the improvement in the predictive capacity of the model. Finally, the relevance of this model is the possibility of anticipating a diagnosis before the onset of symptoms and, in the long term, anticipating the development of chronic complications. The model reflects this importance by showing the complexity inherent in the detection of diabetes while providing a tool that is as simple as possible to support physicians in making a diagnosis. The ideal is to predict onset before a pre-diabetic stage can even be identified, but this model offers the possibility of generating a diagnosis near this stage.
  • Master's thesis
    Machine learning to predict rework time for CNC router
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-11-30) González Giacoman, Daniel Alejandro; URBINA CORONADO, PEDRO DANIEL; 298324; Urbina Coronado, Pedro Daniel; puemcuervo; Orta Castañón, Pedro Antonio; Ahuett Garza, Horacio; School of Engineering and Sciences; Campus Monterrey
    Industry is in constant change, with companies looking for ways to gain an advantage over their competitors. The fourth industrial revolution has brought massive change to the way things are done in industry. It brought Big Data, the Internet of Things, and Artificial Intelligence, which give us new ways to gather a lot of information from different sources and use it for our benefit. The present work develops a methodology to create a new machine learning algorithm, using Python, to predict rework time for pieces that come out of a CNC router, and to prove that for this case the created algorithm is better than a statistical model. To validate the methodology and prove the hypothesis of the thesis, an experiment will be carried out to obtain two results: the best set of cutting parameters for the selected material and the best machine learning algorithm for this problem. To carry out the experiment, the parameters must be set, a database needs to be created to train and test the ML algorithms, and the code and libraries to be used should be tailored to the problem to be solved. This will be done by giving a background on databases and artificial intelligence, and on how to determine from the given results which type of artificial intelligence method is best for the proposed problem.
Unless otherwise specified, these materials are shared under the following terms: Attribution-NonCommercial-NoDerivatives CC BY-NC-ND http://www.creativecommons.mx/#licencias