Ciencias Exactas y Ciencias de la Salud

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains theses and graduate projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.

Search Results

Enhanced pipeline for pathogenic SNPs correlation: improving diagnostic precision
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-06) Gordillo Mata, Larissa Eugenia; Hernández Gress, Neil; emimmayorquin; Hervert Escobar, Laura; Hinojosa Alvarez, Silvia Alejandra; Reyes Figueroa, Angel David; School of Engineering and Sciences; Campus Monterrey; Chávez Santoscoy, Rocío Alejandra
Numerous genome-wide studies across diverse populations aim to trace key biomarkers associated with specific clinical outcomes. These biomarkers are an integral part of our research, as they may hold critical insights that correlate with various diseases. Our primary focus is identifying Single Nucleotide Polymorphisms (SNPs) linked to illnesses and establishing a comprehensive database that catalogs pathogenic diseases. Such a repository facilitates the identification of SNPs that are essential for developing diagnostic tools tailored to these disorders. By leveraging datasets from established repositories, employing sophisticated data analytics techniques, and through meticulous analysis, we devised an algorithm capable of discerning relevant SNPs from a Variant Call Format (VCF) file. This tool streamlines the diagnostic process, giving geneticists a means to assess a genome and measure susceptibility to diseases associated with certain variants. In essence, our research provides genomic analysis built on robust database methodologies, yielding a sophisticated diagnostic framework for identifying genetic predispositions.
Unsupervised learning to profile emerging researchers in LATAM with Elsevier’s data
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-05) Figueroa Castillo, Jesús Manuel; Hernández Gress, Neil; mtyahinojosa, emimmayorquin; Ceballos Cancino, Héctor Gibrán; Estévez Bretón, Carlos Manuel; School of Engineering and Sciences; Campus Monterrey; Hervert Escobar, Laura
This proposal is presented in Computer Science. High-impact researchers possess several key features based on their expertise; nevertheless, it takes time for them to establish themselves as leaders in their area. The objective of this research is to develop a model that can identify outstanding researchers by discipline using indicators from the last five years of research and acknowledged databases such as Scopus and Web of Science. Additionally, it will compare similarities across disciplines to determine whether the same model can predict researchers from one or more disciplines. The main objective is to discover the characteristics that define a “rising star”, using the concept of an early career researcher as an initial time window. Current metrics measure researchers’ performance through indicators such as the H-index and its variants. However, these metrics often do not consider characteristics that differentiate one group from another. Through this unsupervised approach, we aim to find the different groups that exist in LATAM, to measure their characteristics more precisely and fairly, and to identify high-impact researchers who may not be immediately apparent through indicators like the H-index. This thesis will demonstrate the process from data mining to the statistical analysis of the different groups.
Risk factor classification for drivers in Mexico using data science
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023-05-31) Cadena Rodríguez, Rodrigo; Hernández Gress, Neil; puemcuervo,emipsanchez; Hernández Gress, Eva Selene; Ortiz Bayliss, José Carlos; Lozano Medina, Luis Angel; Escuela de Ingeniería y Ciencias; Campus Monterrey; Hervert Escobar, Laura
The aim of this dissertation is to find an optimal way to profile drivers in Mexico by analysing different databases of car accidents and auto insurance claims in the country using gradient boosting algorithms. According to the National Public Health Institute, Mexico ranks seventh globally and third in Latin America in deaths caused by car accidents. Moreover, even though car insurance is mandatory for car owners, only 30% of people purchase it. This is mainly because of the prices that insurance companies offer: most of them use old methods that do not consider all the crucial variables and treat all customers as if they had the same risk of making a claim, even though companies in other countries use machine learning models that have proved efficient and have enabled low-cost premiums based on each user's profile.
Comparing Databases for the Prediction of Student’s Academic Performance using Data Science on the Novel Educational Model Tec21 at Tecnológico de Monterrey
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-06) Lara Castor, Miguel Andrés; HERNANDEZ GRESS, NEIL; 21847; Hernández Gress, Neil; tolmquevedo, emipsanchez; Batres Prieto, Rafael; Garza Villareal, Sara Elena; Escuela de Ingeniería y Ciencias; Campus Monterrey; Ceballos Cancino, Héctor Gibrán
Many studies have addressed the prediction of students' academic performance using Data Science. Students with poor academic performance, as well as dropout students, have a large impact on the graduation rates, reputation, and finances of an educational institution. These studies take advantage of the digitization of students' admission and academic data and of increasing computational power. Since August 2019, Tecnológico de Monterrey has been making such predictions using entrance tests called Initial Evaluations. Unfortunately, the Initial Evaluations did not provide useful predictions for the students of the fall semester of 2019. Therefore, this study aimed to compare the Initial Evaluations and the admission data using Data Science models to predict students' academic performance. The admission data was composed of five databases: Initial Evaluations, Emotions, Curriculum, Admission Exam, and Grades of the first semester. A methodology similar to the Cross Industry Standard Process for Data Mining was used to compare the models based on admission data and the models based only on Initial Evaluations. A large number of experiments were carried out combining different admission data, feature reduction techniques, and classification models. The experiments showed that the models based on admission data predict students' academic performance with higher accuracy than the models based only on Initial Evaluations. Nevertheless, some variables of the Initial Evaluations were relevant to the models based on admission data. Moreover, the accuracy of the experiments was in the range of the results from the related studies. The results of this study indicate that the Initial Evaluations provide useful information for the prediction of students' academic performance in the domain of Data Science.
Intent discovery from conversational logs to prepare a student admission chatbot for Tecnológico de Monterrey
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-05) Treviño Lozano, Rolando; Hernández Gress, Neil; tolmquevedo; Alvarado Uribe, Joanna; Castro Sánchez, Noé Alejandro; Escuela de Ingeniería y Ciencias; Campus Monterrey; Ceballos Cancino, Héctor Gibrán
Online chat services allow companies to serve their customers and resolve problems or doubts about a specific topic. Lately, conversational bots have been adopted in this domain, broadening attention capacity while easing interactions between users and the company and reducing the workload of agents, which increases productivity and service quality. Designing a chatbot is a time-consuming task, as the designer has to provide the core concepts, known as intents, that the conversational bot will respond to, along with example sentences and their respective answers. We propose a framework that receives as input conversational transcripts between prospects and agents and transforms them, through regular expressions, into a tabular dataset of the conversations in log format, easing their analysis and representation. The dataset is then converted into a TF-IDF word representation, which serves as input for unsupervised machine learning algorithms, namely Non-negative Matrix Factorization for topic modeling and K-Means for utterance clustering, to discover possible intents. These intents can then be passed on to the design of a knowledge base. This intent discovery step allows an iterative process in which new conversations are processed to identify changes in the intents or the addition of new ones. Results demonstrate that it is possible to cluster the utterances and find clusters that align with a possible intent out of a list of possible intents; this list is subject to change over time to continuously improve intent discovery. A cosine similarity threshold of 0.47 was set to differentiate correctly aligned clusters from those not aligned; 18 intents out of 55 were correctly aligned with an initial intents list, and a total of 35 different intents were captured by the clustering process.
No exactly similar research was found in the literature, as other works in the domain assume an already curated and labeled dataset to begin classifying the intents rather than discovering them during knowledge base design; they also do not take into account the whole process of transforming the raw conversations into a tabular, processed dataset.
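The cluster-to-intent alignment step described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the clusters have already been produced upstream (for example by K-Means), represents each cluster and each intent by a short text, and uses whitespace tokenization and a smoothed IDF formula chosen only for this example. The function names `tfidf_vectors`, `cosine`, and `align_clusters` are hypothetical; only the 0.47 threshold comes from the abstract.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # assumption: whitespace tokens
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption; other variants would also work).
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_clusters(cluster_texts, intent_texts, threshold=0.47):
    """Map each cluster to its most similar intent, or None if below threshold."""
    cluster_names = list(cluster_texts)
    intent_names = list(intent_texts)
    docs = [cluster_texts[c] for c in cluster_names]
    docs += [intent_texts[i] for i in intent_names]
    vectors = tfidf_vectors(docs)
    cluster_vecs = vectors[:len(cluster_names)]
    intent_vecs = vectors[len(cluster_names):]
    alignment = {}
    for name, c_vec in zip(cluster_names, cluster_vecs):
        best_intent, best_sim = max(
            ((intent, cosine(c_vec, i_vec))
             for intent, i_vec in zip(intent_names, intent_vecs)),
            key=lambda pair: pair[1],
        )
        alignment[name] = best_intent if best_sim >= threshold else None
    return alignment


# Hypothetical toy data, invented for this example only.
clusters = {
    "c0": "scholarship application deadline scholarship",
    "c1": "hello good morning",
}
intents = {
    "scholarships": "scholarship application requirements",
    "campus_tour": "campus tour visit schedule",
}
alignment = align_clusters(clusters, intents)
```

With this toy data, `c0` shares enough vocabulary with the `scholarships` intent to pass the threshold, while `c1` shares none with either intent and stays unaligned, which is the kind of cluster the abstract counts as a candidate new intent.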
Analyzing factors that impact alumni income with a machine learning approach
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-05) Gómez Cravioto, Daniela Alejandra; Hernández Gress, Neil; puelquio; Ceballos Cancino, Héctor; López Guajardo, Rafael; Ceballos Cancino, Héctor Gibrán; School of Engineering and Sciences; Campus Monterrey; Preciado Arreola, José Luis
This thesis presents an exploration of different machine-learning algorithms and different approaches for predicting alumni income. The aim is to obtain insights regarding the strongest predictors for income and a “high” earners class. The study examines the alumni sample data obtained from a survey from Tec de Monterrey, a multi-campus Mexican private university. Survey results encompass 17,898 observations before cleaning and preprocessing and 12,275 observations after this. The dataset includes values for income and a large set of independent variables, including demographic and occupational attributes of the former students and academic attributes from the institution's history. For the problem of income prediction, there have been several attempts in both social science and econometric studies. However, this study investigates whether the accuracy of conventional algorithms in econometric research to predict income can be improved with a data science approach. Furthermore, we present insights obtained with explainable AI techniques. The results show that the Gradient Boosting Model outperformed the parametric models, Linear Regression and Logistic Regression, in predicting the current income of alumni with statistically significant results (p<0.05) in three different approaches: OLS regression, Multi-class Classification, and Binary Classification. The study also identified that for predicting the alum's first income after graduation, the Linear and Logistic Regression models were the most accurate methods, as the non-parametric models did not show a significant improvement. Succinctly, we identified that age, gender, working hours per week, their first income after graduation, and those factors related to their job position and their firm contributed to explaining their income. Simultaneously, post-graduation education and family background had an insignificant contribution to the model.
In addition, the results, which showed a gender wage gap, indicate that further work is required to enable equality in Mexico.
Design and implementation of a quantum multilayer neural network framework
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-12) Gamboa Vázquez, Ariel Arturo Goubiah; HERNANDEZ GRESS, NEIL; 21847; Hernández Gress, Neil; puemcuervo; Aspuru Guzik, Alan; González Mendoza, Miguel; School of Engineering and Sciences; Campus Monterrey
Artificial neurons are biologically inspired algorithms that form the building blocks of Artificial Neural Networks (ANNs) and Multilayer Neural Networks (MNNs), which have recently been studied and implemented to solve important problems. Advances in learning theory and the availability of powerful computational systems have resulted in the creation of many real-world applications. Practically every industry has already adopted multilayer-learning-powered technologies in some part of its processes, as state-of-the-art MNN-powered algorithms can outperform other algorithms and even human accuracy on a wide range of tasks. However, their performance relies heavily on the amount of data available as well as its format, as the most popular applications require a copious number of training examples. Another limitation in building large-scale MNN applications is the vast computational resources needed. The use of MNN-based algorithms is widespread and growing more complex, which creates an ever-growing demand for computational power that may no longer be satisfied at some point in the near future, owing to the deceleration in the performance of state-of-the-art monolithic processors. Quantum information theory is a field that has had success in the last couple of decades, thanks to the creation of algorithms that are, in theory, able to outperform classical computers. The ability of quantum computers to work with physical systems inherently different from those used by classical computers opens an exciting opportunity for scientists and companies to explore the performance of quantum computers on machine learning tasks, with multilayer learning as a focal point given its importance in classical computing.
Although a considerable amount of resources has been allocated to the development of MNN-powered algorithms on quantum computers, there are still challenges to overcome before Quantum Multilayer Neural Networks become a technology that can compete with state-of-the-art MNN-powered algorithms. This research explores the properties of multilayer neural network algorithms running on quantum computers. The first contribution of the research work reported in this document is the analysis and implementation of a perceptron algorithm running on a quantum computer. The second contribution is the proposal, implementation, and analysis of two different information encoding methods for quantum computers. The final and most important contribution of this work is the development of a framework that allows training multilayer neural networks for Supervised Learning.
Financial Habits of Mexican Women using Machine Learning Algorithms
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-04) Lozano Medina, Jessica Ivonne; Hernández Gress, Neil; RR/tolmquevedo; Ceballos Cancino, Héctor Gibrán; Flores Segovia, Miguel Alejandro; School of Engineering and Sciences; Campus Monterrey; Hervert Escobar, Laura
This research was conducted under the Master in Computational Science program at Tecnológico de Monterrey. The proposal is a model to assess the risk profile of Mexican women who require a financial portfolio offered by a financial institution. Typically, women are scored with a lower financial risk than men. However, the variables and indicators that lead to such results are not fully understood. Furthermore, the stochastic nature of the data makes it difficult to generate a suitable profile for offering an adequate financial portfolio to the women segment. Therefore, there is great interest in developing methods that correctly model this behavior and aid the decision-making process in financial services. Several state-of-the-art models for this type of analysis rely on linear programming and statistical techniques. In contrast, this study uses a benchmark of Machine Learning algorithms, both unsupervised and supervised, to extract information from four different datasets relevant to the population of interest. The first phase involves applying state-of-the-art techniques to public datasets of the Mexican population, whereas the second phase involves future research with a financial institution to create the model for the women segment. It was found that the financial habits of the population are heavily dependent on the region. There is also an important group in the population characterized by not having an account in a financial institution and not having emergency funds. In the case of the profiles of women, the most important attributes were their civil status and their participation in the workforce. The largest group of women are housewives, though the second largest group consists of married women who also participate in the workforce.
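Portafolios de Inversión de Markovitz con SVMs Aplicadas a la Predicción y Análisis de Valores-Edición Única [Markowitz Investment Portfolios with SVMs Applied to the Prediction and Analysis of Securities, Single Edition]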
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2005-09-01) Calderón Aveitua, Alejandro; Hernández Gress, Neil; González Mendoza, Miguel; Mora Vargas, Jaime; ITESM-CEM
This thesis addresses investment portfolio management; its theoretical basis is Harry Markowitz's Portfolio Selection (1952). Broadly, Markowitz's theory describes a two-step algorithm: 1. Estimation of the future returns of the securities intended to make up the investment portfolio. This step was addressed with Support Vector Machines (SVMs), developed by Vladimir Vapnik and described in Theory of Pattern Recognition, 1979; this approach made it possible to run regressions to estimate the future return of securities, using as regression inputs indices, ratios, and prices that can easily be found in financial sources. This is the main contribution of the thesis: given the favorable results, SVMs could conceivably fill the gap in Markowitz's original theory, which lacks an adequate method for estimating returns, as reported by Graeme West in An Introduction to Modern Portfolio Theory: Markowitz, CAPM, APT and Black-Litterman. 2. Optimization of the securities that make up the investment portfolio. Here the classical approach described by Markowitz was used. The thesis is organized as follows: chapter 1 presents the justification, objectives, and scope of the project reported here and provides an introduction to investment portfolio management in order to standardize financial nomenclature and introduce the concepts used in this work.
Chapter 2 contains a functional description of the software developed for this project, with the corresponding theoretical justifications; chapter 3 shows the experimental results obtained and the analysis capabilities of the software; and chapter 4 discusses the conclusions. Briefly, the results suggest that there is a cause-effect relationship between the return estimates obtained with the SVMs and the indices, ratios, and prices used as regression inputs. This is important because the financial community gives considerable weight to an interpretation of the so-called Efficient Market Hypothesis, which implies that the only information needed to predict the prices of a security (since returns can be obtained from prices with simple arithmetic, this term can be replaced by returns) is the past prices of that same security.
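For reference, the classical optimization in step 2 is commonly written as the mean-variance problem below. The notation is an assumption made for illustration and is not taken from the thesis: w is the vector of portfolio weights, Σ is the covariance matrix of security returns, μ̂ is the vector of estimated future returns (in this thesis, the SVM regression estimates from step 1), and r* is a chosen target return.

```latex
\begin{aligned}
\min_{w}\quad & w^{\top} \Sigma\, w \\
\text{s.t.}\quad & \hat{\mu}^{\top} w = r^{*}, \\
                 & \mathbf{1}^{\top} w = 1 .
\end{aligned}
```

Solving this problem for a range of target returns r* traces out the efficient frontier; whether the thesis adds further constraints, such as non-negative weights, is not stated in the abstract.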
Métodos de Optimización para el Problema Cuadrático Generado por las Máquinas de Soporte Vectorial [Optimization Methods for the Quadratic Problem Generated by Support Vector Machines]
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 01/08/2004) Ibarra Orozco, Rodolfo E.; Hernández Gress, Neil; Frausto Solís, Juan; Mora Vargas, Jaime; Aguilar Martín, Joseph; ITESM
