Ciencias Exactas y Ciencias de la Salud

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains theses and degree projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.

Enhanced pipeline for pathogenic SNPs correlation: improving diagnostic precision
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-06) Gordillo Mata, Larissa Eugenia; Hernández Gress, Neil; emimmayorquin; Hervert Escobar, Laura; Hinojosa Alvarez, Silvia Alejandra; Reyes Figueroa, Angel David; School of Engineering and Sciences; Campus Monterrey; Chávez Santoscoy, Rocío Alejandra
Numerous genome-wide studies across diverse populations aim to trace key biomarkers associated with specific clinical outcomes. These biomarkers are an integral part of our research, as they harbor potentially critical insights that correlate with various diseases. Our primary focus is identifying Single Nucleotide Polymorphisms (SNPs) that are linked to illnesses and establishing a comprehensive database that catalogs pathogenic diseases. Such a repository facilitates the identification of SNPs that are essential for developing diagnostic tools tailored to these disorders. By leveraging datasets from established repositories, applying sophisticated data analytics techniques, and carrying out meticulous analysis, we devised an algorithm capable of discerning relevant SNPs from a Variant Call Format (VCF) file. This tool streamlines the diagnostic process, giving geneticists a means to assess a genome and measure its susceptibility to diseases associated with certain variants. In essence, our research provides genomic analysis using robust database methodologies that offer a sophisticated diagnostic framework for identifying genetic predispositions.
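The abstract above describes selecting relevant SNPs from a VCF file against a catalog of pathogenic variants. The thesis's actual algorithm is not given here, so the following is only a minimal sketch of that general idea. It assumes the catalog is keyed by rsID and that matching is done on the VCF ID column. The catalog contents, the function name `find_catalogued_snps`, and the sample records are all hypothetical.

```python
import io

# Hypothetical catalog: rsID -> associated condition (placeholder values only).
PATHOGENIC_CATALOG = {
    "rs0000001": "example condition A",
    "rs0000002": "example condition B",
}


def find_catalogued_snps(vcf_lines, catalog):
    """Return (chrom, pos, rsid, condition) for SNP records whose ID is in the catalog."""
    hits = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information lines (##...) and the header (#CHROM...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # malformed record
        chrom, pos, ids, ref, alt = fields[:5]
        # Keep single-nucleotide substitutions only (no indels or symbolic alleles).
        alts = alt.split(",")
        if len(ref) != 1 or not all(len(a) == 1 and a in "ACGT" for a in alts):
            continue
        # The ID column may hold several identifiers separated by semicolons.
        for rsid in ids.split(";"):
            if rsid in catalog:
                hits.append((chrom, int(pos), rsid, catalog[rsid]))
    return hits


# Made-up sample: one catalogued SNP, one uncatalogued SNP, one catalogued deletion.
SAMPLE_VCF = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t1000\trs0000001\tA\tG\t50\tPASS\t.
1\t2000\trs9999999\tC\tT\t50\tPASS\t.
2\t3000\trs0000002\tAT\tA\t50\tPASS\t.
"""

hits = find_catalogued_snps(io.StringIO(SAMPLE_VCF), PATHOGENIC_CATALOG)
print(hits)  # [('1', 1000, 'rs0000001', 'example condition A')]
```

The third record is skipped because its reference allele spans two bases, so it is a deletion rather than a SNP, even though its ID appears in the catalog.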
Unsupervised learning to profile emerging researchers in LATAM with Elsevier’s data
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-05) Figueroa Castillo, Jesús Manuel; Hernández Gress, Neil; mtyahinojosa, emimmayorquin; Ceballos Cancino, Héctor Gibrán; Estévez Bretón, Carlos Manuel; School of Engineering and Sciences; Campus Monterrey; Hervert Escobar, Laura
This proposal is being presented in Computer Science. High-impact researchers possess several key features based on their expertise; nevertheless, it takes time for them to establish themselves as leaders in their area. The objective of this research is to develop a model that can identify outstanding researchers by discipline, using indicators from the last five years of research and acknowledged databases such as Scopus and Web of Science. Additionally, it will compare similarities across disciplines to determine whether the same model can predict researchers from one or more disciplines. The main objective is to discover the characteristics that define a "rising star", taking the concept of an early-career researcher as the initial time window. Current metrics measure researchers' performance through indicators such as the H-index and its variants. However, these metrics often do not consider the characteristics that differentiate one group from another. Through this unsupervised approach, we aim to find the different groups that exist in LATAM, to measure their characteristics more precisely and fairly, and to identify high-impact researchers who may not be immediately apparent through indicators like the H-index. This thesis will demonstrate the process from data mining to the statistical analysis of the different groups.
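For readers unfamiliar with the H-index mentioned in the abstract above, the sketch below computes it from a list of per-paper citation counts. It uses the standard definition: the largest h such that h papers have at least h citations each. The function name and the citation lists are illustrative and are not taken from the thesis.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


print(h_index([10, 8, 5, 4, 3]))   # 4
print(h_index([100, 4, 4, 4, 4]))  # 4
print(h_index([4, 4, 4, 4]))       # 4
```

The last two profiles differ sharply (one includes a paper with 100 citations) yet receive the same score, which illustrates the kind of difference a single indicator can hide.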
Risk factor classification for drivers in Mexico using data science
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023-05-31) Cadena Rodríguez, Rodrigo; Hernández Gress, Neil; puemcuervo,emipsanchez; Hernández Gress, Eva Selene; Ortiz Bayliss, José Carlos; Lozano Medina, Luis Angel; Escuela de Ingeniería y Ciencias; Campus Monterrey; Hervert Escobar, Laura
The aim of this dissertation is to find an optimal way to profile drivers in Mexico by analysing different databases of car accidents and auto insurance claims within the country using gradient boosting algorithms. According to the National Public Health Institute, Mexico ranks seventh globally and third in Latin America in deaths caused by car accidents. Moreover, even though car insurance is mandatory for car owners, only 30% of people take out a policy. This is mainly because of the prices that insurance companies offer: most of them use old methods that do not consider all the crucial variables and treat all customers as if they had the same risk of making a claim, even though companies in other countries use machine learning models that have proven efficient and allow low-cost premiums based on each user's profile.
Comparing Databases for the Prediction of Student’s Academic Performance using Data Science on the Novel Educational Model Tec21 at Tecnológico de Monterrey
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-06) Lara Castor, Miguel Andrés; HERNANDEZ GRESS, NEIL; 21847; Hernández Gress, Neil; tolmquevedo, emipsanchez; Batres Prieto, Rafael; Garza Villareal, Sara Elena; Escuela de Ingeniería y Ciencias; Campus Monterrey; Ceballos Cancino, Héctor Gibrán
Many studies have addressed the prediction of students' academic performance using Data Science. Students with poor academic performance, as well as students who drop out, have a large impact on the graduation rates, reputation, and finances of an educational institution. These studies take advantage of the digitization of students' admission and academic data and of increasing computational power. Since August 2019, however, Tecnológico de Monterrey has been making such predictions using entrance tests called Initial Evaluations. Unfortunately, the Initial Evaluations did not provide useful predictions for the students of the fall semester of 2019. Therefore, this study aimed to compare the Initial Evaluations and the admission data using Data Science models to predict students' academic performance. The admission data comprised five databases: Initial Evaluations, Emotions, Curriculum, Admission Exam, and Grades of the first semester. A methodology similar to the Cross Industry Standard Process for Data Mining was used to compare the models based on admission data with the models based only on Initial Evaluations. A large number of experiments were carried out combining different admission data, feature reduction techniques, and classification models. The experiments showed that the models based on admission data predict students' academic performance with higher accuracy than the models based only on Initial Evaluations. Nevertheless, some variables of the Initial Evaluations were relevant to the models based on admission data. Moreover, the accuracy of the experiments was within the range of results reported in related studies. The results of this study indicate that the Initial Evaluations provide useful information for the prediction of students' academic performance in the domain of Data Science.
Intent discovery from conversational logs to prepare a student admission chatbot for Tecnológico de Monterrey
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-05) Treviño Lozano, Rolando; Hernández Gress, Neil; tolmquevedo; Alvarado Uribe, Joanna; Castro Sánchez, Noé Alejandro; Escuela de Ingeniería y Ciencias; Campus Monterrey; Ceballos Cancino, Héctor Gibrán
Online chat services allow companies to attend to their customers and resolve problems or doubts about a specific topic. Lately, conversational bots have been adapted to this domain, allowing a broader attention capacity while easing interactions between users and the company and reducing the workload of agents, which increases productivity and service quality. Designing a chatbot is a time-consuming task, as the designer has to provide the core concepts, known as intents, that the conversational bot will respond to, along with example sentences and their respective answers. We propose a framework that receives as input conversational transcripts between prospects and agents and transforms them, through the use of regular expressions, into a tabular dataset of the conversations in log format, which eases their analysis and representation. The dataset is then converted into a TF-IDF word representation, which serves as input for unsupervised machine learning algorithms, namely Non-negative Matrix Factorization for topic modeling and K-Means for utterance clustering, to discover possible intents. These intents can then be passed on to the design of a knowledge base. This last step of intent discovery allows an iterative process in which new conversations are processed to identify changes in the intents or the addition of new ones. Results demonstrate that it is possible to cluster the utterances and find clusters that align with a possible intent out of a list of possible intents, and that this list is subject to change over time to continuously improve intent discovery. A cosine similarity threshold was set at 0.47 to differentiate correctly aligned clusters from those not aligned; 18 intents out of 55 were correctly aligned with an initial intents list, and a total of 35 different intents were captured by the clustering process.
No exactly similar research was found in the literature, as other works in the domain assume an already curated and labeled dataset in order to begin classifying the intents rather than discovering them during the knowledge base design. They also do not take into account the whole process of transforming the raw conversations into a tabular, processed dataset.
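The abstract above reports aligning clusters with an initial list of intents using a cosine similarity threshold of 0.47 on TF-IDF representations. It does not specify exactly which vectors were compared, so the sketch below is only one plausible reading. It assumes each cluster and each intent is summarized as a bag of terms, and it uses a smoothed IDF variant chosen for illustration. The cluster names, intent names, and terms are hypothetical. Only the 0.47 threshold comes from the abstract.

```python
import math
from collections import Counter

SIMILARITY_THRESHOLD = 0.47  # value reported in the abstract


def tfidf_vectors(documents):
    """Sparse TF-IDF vectors (dicts) for tokenized documents, using a smoothed IDF."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align_clusters(cluster_terms, intent_terms, threshold=SIMILARITY_THRESHOLD):
    """Map each cluster to its most similar intent, or None if below the threshold."""
    cluster_names, intent_names = list(cluster_terms), list(intent_terms)
    vectors = tfidf_vectors(
        [cluster_terms[c] for c in cluster_names]
        + [intent_terms[i] for i in intent_names]
    )
    cluster_vecs = vectors[:len(cluster_names)]
    intent_vecs = vectors[len(cluster_names):]
    alignment = {}
    for c_name, c_vec in zip(cluster_names, cluster_vecs):
        best_intent, best_score = None, 0.0
        for i_name, i_vec in zip(intent_names, intent_vecs):
            score = cosine_similarity(c_vec, i_vec)
            if score > best_score:
                best_intent, best_score = i_name, score
        alignment[c_name] = best_intent if best_score >= threshold else None
    return alignment


# Hypothetical clusters (top terms of grouped utterances) and candidate intents.
clusters = {
    "cluster_0": ["scholarship", "apply", "scholarship", "requirements"],
    "cluster_1": ["hello", "thanks", "bye"],
}
intents = {
    "scholarships": ["scholarship", "requirements", "apply"],
    "campus_tour": ["campus", "visit", "tour"],
}

alignment = align_clusters(clusters, intents)
print(alignment)  # {'cluster_0': 'scholarships', 'cluster_1': None}
```

Clusters that stay unaligned, such as `cluster_1` here, are the ones a designer would review by hand as either noise or candidates for new intents.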
Analyzing factors that impact alumni income with a machine learning approach
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-05) Gómez Cravioto, Daniela Alejandra; Hernández Gress, Neil; puelquio; Ceballos Cancino, Héctor; López Guajardo, Rafael; Ceballos Cancino, Héctor Gibrán; School of Engineering and Sciences; Campus Monterrey; Preciado Arreola, José Luis
This thesis presents an exploration of different machine-learning algorithms and approaches for predicting alumni income. The aim is to obtain insights regarding the strongest predictors of income and of a "high" earners class. The study examines alumni sample data obtained from a survey by Tec de Monterrey, a multi-campus Mexican private university. Survey results encompass 17,898 observations before cleaning and preprocessing and 12,275 observations afterwards. The dataset includes values for income and a large set of independent variables, including demographic and occupational attributes of the former students and academic attributes from the institution's history. There have been several attempts at income prediction in both social science and econometric studies. However, this study investigates whether the accuracy of the conventional algorithms used in econometric research to predict income can be improved with a data science approach. Furthermore, we present insights obtained with explainable AI techniques. The results show that the Gradient Boosting Model outperformed the parametric models, Linear Regression and Logistic Regression, in predicting the current income of alumni with statistically significant results (p<0.05) in three different approaches: OLS regression, Multi-class Classification, and Binary Classification. The study also found that, for predicting alumni's first income after graduation, the Linear and Logistic Regression models were the most accurate methods, as the non-parametric models did not show a significant improvement. In summary, we identified that age, gender, working hours per week, first income after graduation, and factors related to job position and firm contributed to explaining income. By contrast, post-graduation education and family background made an insignificant contribution to the model.
In addition, the results, which showed a gender wage gap, indicate that further work is required to enable equality in Mexico.
Financial Habits of Mexican Women using Machine Learning Algorithms
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-04) Lozano Medina, Jessica Ivonne; Hernández Gress, Neil; RR/tolmquevedo; Ceballos Cancino, Héctor Gibrán; Flores Segovia, Miguel Alejandro; School of Engineering and Sciences; Campus Monterrey; Hervert Escobar, Laura
This research was conducted under the Master in Computational Science program at Tecnológico de Monterrey. The proposal is a model to assess a risk profile for Mexican women who require the service of a financial portfolio offered by a financial institution. Typically, women are scored with a lower financial risk than men. However, the variables and indicators that lead to such results are not fully understood. Furthermore, the stochastic nature of the data makes it difficult to generate a suitable profile for offering an adequate financial portfolio to the women segment. Therefore, there is great interest in developing methods that correctly model this behavior and aid the decision-making process in financial services. Several state-of-the-art models for this type of analysis rely on linear programming and statistical techniques. This study instead uses a benchmark of Machine Learning algorithms, including Unsupervised and Supervised Learning algorithms, to extract information from four different datasets relevant to the population of interest. The first phase involves applying state-of-the-art techniques to public datasets of the Mexican population, whereas the second phase involves future research with a financial institution to create the model for the women segment. It was found that the financial habits of the population are heavily dependent on the region. There is also an important group in the population characterized by not possessing an account in a financial institution and not having emergency funds. In the case of the profiles of women, the most important attributes were their civil status and their participation in the workforce. The largest group of women are housewives, though the second largest group consists of married women who also participate in the workforce.
