Ciencias Exactas y Ciencias de la Salud

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains theses and degree projects from the master's programs of the School of Engineering and Sciences and the School of Medicine and Health Sciences.

Search Results

ECG-based heartbeat classification for arrhythmia detection: a step-by-step AI exploratory process
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2022-12) Silva Mendez, Adrian; TAMEZ PEÑA, JOSE GERARDO; 67337; Tamez Peña, José Gerardo; emipsanchez; Gutiérrez Ruiz, Dania; Santos Díaz, Alejandro; Martínez Ledesma, Juan Emmanuel; School of Engineering and Sciences; Campus Monterrey
This document presents the thesis “ECG-based heartbeat classification for arrhythmia detection: A step-by-step AI Exploratory Process” for the degree of Master in Computer Science at Tecnológico de Monterrey. Cardiovascular diseases are among the leading causes of death around the world, in third and first world countries alike. Arrhythmia is one such disease, in which the heart beats at an inconsistent and abnormal rhythm due to a malfunction in its electrical system. Detection, diagnosis, and classification are very challenging tasks for doctors because time is a crucial factor: if they are not done in time, the patient’s life can be at risk. This proposal explores different data pre-processing and feature generation techniques to create an efficient and accurate binary classification model capable of distinguishing normal from abnormal heartbeats, with accuracy and sensitivity in the 80-90% range, a 10% increase compared to a raw feature vector. One of the most important ideas discussed throughout this thesis is decomposing the ECG signal in the frequency and time domains using the Dual-Tree Complex Wavelet Transform to create a feature vector. Another important highlight is database manipulation, including the exclusion of subjects and their correct distribution across the training and testing sets. The approach tests the feature vectors by training different supervised learning models, including K-Nearest Neighbours, Random Forest, and X-Gradient Boosting. The MIT-BIH Arrhythmia Database is used for the experimentation process.
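The abstract above stresses distributing subjects correctly across the training and testing sets. The abstract does not describe the exact procedure, so the following is only a minimal sketch of one common way to do this: splitting by subject, so that no subject's heartbeats appear in both sets. The function name `split_by_subject`, the tuple layout, and the `test_fraction` and `seed` parameters are hypothetical and chosen for illustration.

```python
import random


def split_by_subject(beats, test_fraction=0.3, seed=0):
    """Split heartbeat records so that each subject lands in exactly one set.

    `beats` is a list of (subject_id, features, label) tuples. This layout is
    an assumption for illustration, not the thesis's actual data structure.
    """
    # Collect the distinct subjects and shuffle them reproducibly.
    subjects = sorted({subject_id for subject_id, _, _ in beats})
    rng = random.Random(seed)
    rng.shuffle(subjects)

    # Reserve a fraction of the subjects (not of the beats) for testing.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [b for b in beats if b[0] not in test_subjects]
    test = [b for b in beats if b[0] in test_subjects]
    return train, test


# Toy records: four hypothetical subjects with three beats each.
toy_beats = [(sid, [0.0, 1.0], sid % 2) for sid in (100, 101, 102, 103) for _ in range(3)]
train_set, test_set = split_by_subject(toy_beats)
```

Splitting at the subject level avoids evaluating a model on heartbeats from a person it has already seen during training, which would otherwise inflate the reported accuracy and sensitivity.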
Development of a type two diabetes predictive model for mexicans applying to electronic health records dataset retrieved from National Public Data (ENSANUT 2018)
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-02) Fregoso Aparicio, Luis Martín; Noguez Monroy, Juana Julieta; puemcuervo; Cantú Ortiz, Francisco Javier; González Mendoza, Miguel; García García, José Antonio; School of Engineering and Sciences; Campus Estado de México; Montesinos Silva, Luis Arturo
Diabetes mellitus is a chronic and severe disease that occurs when blood glucose levels rise above normal limits because the patient's body cannot produce the hormone insulin, produces an insufficient amount, or cannot use the insulin it produces efficiently. The American Diabetes Association establishes a diagnosis of diabetes when the HbA1c test result is greater than or equal to 6.5%, when basal fasting blood glucose (GB) is higher than 126 mg/dL, or when blood glucose 2 hours after an oral glucose tolerance test with 75 g of glucose (SOG) is greater than or equal to 200 mg/dL. Type 2 diabetes (T2D), formerly known as adult-onset diabetes, is a form of diabetes characterized by high blood sugar, insulin resistance, and a relative lack of insulin. In Mexico, 10.4% of the population had diabetes in 2016, compared with 7% in 2006. In recent years, machine learning has been used to create predictive models for the onset of type 2 diabetes, making it feasible to develop one for the Mexican population. The model should be able to detect undiagnosed diabetics using a national public dataset on type 2 diabetes mellitus in Mexico (ENSANUT 2018). The objective is to develop a predictive model of type 2 diabetes for Mexicans as a support tool that helps primary care physicians make a timely diagnosis, preventing the onset of diabetes or its complications and detecting diabetes early with higher accuracy than the few existing Mexican models. A systematic review of 91 studies, based on the PRISMA methodology combined with the methodology of Keele University and Durham University, is performed to identify the machine learning techniques and features best suited to creating novel type 2 diabetes predictive models. The related-work results show that algorithms in the tree-type cluster of machine learning algorithms produced the best predictive models.
Five candidate models are considered for classifying diabetes: Decision Tree, Random Forest, Gradient Boosting Tree, K-Nearest Neighbors, and Logistic Regression. The database selected for the model is the National Health and Nutrition Survey (ENSANUT 2018), a tool that shows the general health and nutrition conditions of a representative sample of the population of Mexico. It is divided into several datasets joined by a unique ID created from the values of their variables. The target (HEMGLICLASS) is a binary categorical variable in which zero corresponds to a healthy person and one to a diabetic; the complete database has 11639 samples and 55 attributes. After cleaning it and balancing the diabetic and healthy samples, the final database has 21696 observations and 26 variables, comprising the respondents' categorized eating habits and their corresponding blood chemistry test values. After model selection and optimization on the ENSANUT database with the techniques described in the systematic review, the Random Forest Classifier has the best metrics for prediction and can be interpreted by physicians. The proposed model is a Random Forest with default parameter values and fifteen attributes from the original ENSANUT database. The attributes include blood test measurements, as in classical models, plus new features such as the weekly intake of vegetables and fruits as a protective factor, or an excessive intake of meat, dairy products, or candies as an aggravating factor. Once built, the model was validated with the actual data to ensure that accuracy and AUC(ROC) stay higher than 90 percent; three other metrics are also estimated. The results are accuracy: (0.90 ± 0.154), F1-Score: (0.86 ± 0.286), Precision: (0.94 ± 0.069), Sensitivity: (0.87 ± 0.294), and AUC(ROC): (0.92 ± 0.191).
To demonstrate the superior predictive capacity of the new model over the Olimpia Arrellano-Campos model, a test of equality of means with unknown variances is performed, using Student's t as the test statistic and the p-value as the rejection criterion. The resulting p-value of 0.00572 demonstrates the improvement in predictive capacity. Finally, the relevance of this model lies in the possibility of anticipating a diagnosis before the onset of symptoms and, in the long term, even anticipating the development of chronic complications. The model reflects this importance by capturing the complexity inherent in detecting diabetes while remaining as simple a tool as possible to support physicians in making a diagnosis. Ideally, onset would be predicted before a pre-diabetic stage can even be identified; this model offers the possibility of generating a diagnosis near that stage.
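The diagnostic thresholds cited at the start of this abstract can be written as a small rule. The sketch below encodes the three cut-offs exactly as the abstract states them (HbA1c of at least 6.5%, fasting glucose above 126 mg/dL, and 2-hour oral glucose tolerance test of at least 200 mg/dL). The function name and parameter names are hypothetical, and the code is an illustration only, not a clinical tool and not the thesis's model.

```python
def meets_diabetes_criteria(hba1c_percent=None,
                            fasting_glucose_mg_dl=None,
                            ogtt_2h_mg_dl=None):
    """Return True if any available measurement crosses its threshold.

    Thresholds follow the wording of the abstract above; missing
    measurements (None) are simply skipped.
    """
    checks = []
    if hba1c_percent is not None:
        checks.append(hba1c_percent >= 6.5)         # HbA1c >= 6.5 %
    if fasting_glucose_mg_dl is not None:
        checks.append(fasting_glucose_mg_dl > 126)  # fasting glucose > 126 mg/dL
    if ogtt_2h_mg_dl is not None:
        checks.append(ogtt_2h_mg_dl >= 200)         # 2-hour OGTT >= 200 mg/dL
    return any(checks)


# Example: elevated HbA1c alone is enough to meet the criteria.
example_result = meets_diabetes_criteria(hba1c_percent=7.1, fasting_glucose_mg_dl=110)
```

A rule like this could also be used to derive a binary target from laboratory values, but whether HEMGLICLASS was constructed this way is not stated in the abstract.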
Surface defect detection with predictive models in the galvanizing process
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-12-04) Pérez Benítez, Baruc Emet; PEREZ BENITEZ, BARUC EMET; 792305; Preciado Arreola, José Luis; emipsanchez; Tercero Gómez, Víctor Gustavo; Chee González, Carlos Arnoldo; Escuela de Ingeniería y Ciencias; Campus Monterrey
Hot-dip galvanizing is a process widely used worldwide to provide metal products with a protective layer that enhances their corrosion resistance. The effectiveness of this layer relies on the uniformity of the coverage; thus, any alteration in the galvanizing layer may be considered a defect. These defects are catalogued as surface defects, of which two groups are identified: Bare Spots and Dross-Derived defects. Currently, these defects are detected at the end of the line, where no preventive actions can be performed. Consequently, surface defects are not avoided, which in turn increases the company's expenses. For that reason, a project oriented to predicting these defects is proposed. The project consists of a set of predictive models, which are tested on their ability to predict the occurrence of these defects at a stage early enough for the galvanizing line staff to design and launch preventive actions that could reduce surface defect incidents. Four models are studied: Stepwise Logistic Regression, Random Forest Classifier, Gradient Boosting Classifier, and the Low FNR Low FPR Random Forest Classifier (LFNR-LFPR RFC) ensemble. LFNR-LFPR RFC is a custom-made multi-objective ensemble designed in this project, whose base learners are two Random Forest Classifiers. To test the models’ performance, the False Negative Rate (FNR) and False Positive Rate (FPR) are employed, with the acceptance criterion being an FNR of at most 15% and an FPR of at most 25%. Of the models tested, LFNR-LFPR RFC outperformed the others while achieving FNR and FPR scores within the acceptance criteria for most of the studied cases (two out of three for Bare Spots and one out of two for Dross-Derived defects). Furthermore, the importance of the variables selected for the LFNR-LFPR RFC model was evaluated. As a result, variables from different sources, such as the galvanizing line itself, the chemistry of the coil, and upstream processes, were obtained.
In turn, these lists of variables can provide insights into how to design preventive actions that could decrease the occurrence of surface defects. Finally, the economic impact of the defects and the predictive models is assessed; according to the LFNR-LFPR RFC ensemble’s results, savings are possible.
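The acceptance criteria described in this abstract (FNR of at most 15% and FPR of at most 25%) can be checked with a few lines of code. The sketch below computes both rates from binary labels, assuming 1 marks a defective coil and 0 a defect-free one. The function names and the toy labels are hypothetical and do not come from the thesis.

```python
def fnr_fpr(y_true, y_pred):
    """Compute the False Negative Rate and False Positive Rate for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # FNR = missed defects / all real defects; FPR = false alarms / all defect-free cases.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr


def meets_acceptance(fnr, fpr, max_fnr=0.15, max_fpr=0.25):
    """Apply the thresholds quoted in the abstract: FNR <= 15% and FPR <= 25%."""
    return fnr <= max_fnr and fpr <= max_fpr


# Toy example: one of four defects is missed and one of four good coils is flagged.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy_fnr, toy_fpr = fnr_fpr(toy_true, toy_pred)
toy_accepted = meets_acceptance(toy_fnr, toy_fpr)
```

In this toy case the FPR sits exactly at the 25% limit, but the FNR of 25% exceeds the 15% limit, so the model would not be accepted.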
