Identifying models of DNA polymorphisms associated with Alzheimer’s disease using step-wise and genetic algorithms from GWAS data

dc.audience.educationlevel: Researchers
dc.contributor.advisor: Treviño Alvarado, Víctor Manuel
dc.contributor.author: Romero Rosales, Brissa Lizbeth
dc.contributor.committeemember: Vallejo Clemente, Edgar Emmanuel
dc.contributor.committeemember: Moreno Treviño, María Guadalupe
dc.contributor.institution: Escuela de Ingeniería y Ciencias
dc.contributor.institution: Campus Monterrey
dc.creator: ROMERO ROSALES, BRISSA LIZBETH; 861461
dc.creator: TREVIÑO ALVARADO, VICTOR MANUEL; 205076
dc.date.accessioned: 2019-08-30T14:49:38Z
dc.date.available: 2019-08-30T14:49:38Z
dc.date.created: 2019-05
dc.date.issued: 2019-05
dc.description.abstract: Alzheimer's disease is a neurodegenerative disorder marked by cognitive deterioration, memory loss, and impaired reasoning, which affect the patient's ability to carry out daily activities. The disorder is caused by genetic, environmental, and lifestyle factors. Determining the genetic factors is important because it could allow the disease to be predicted, and therefore treated, before symptoms appear. However, despite research efforts and many putative detections from univariate analyses, only the APOE gene has been extensively validated as a risk factor for late-onset Alzheimer's disease. This gives rise to the problem of missing heritability: a single gene does not account for the heritability of the disorder, and the combined effect of several genes could explain it better. Genome-Wide Association Studies (GWAS) traditionally use univariate techniques to test the association between markers and diseases. This research proposes applying machine learning techniques to GWAS data to identify sets of polymorphisms that maximize discrimination between cases and controls. This document explains the traditional strategies and the theoretical bases that support the research. It reviews previous works that apply multivariate methods to the prediction of different diseases and treatments, and takes their most representative characteristics as the basis for a new solution. The proposed methodology begins with obtaining the genetic data and a pre-processing stage. Several quality control procedures then filter samples and SNPs to reduce the number of false positives and false negatives. Next, a chi-squared association test with kinship correction is performed to pre-select markers. Predictive models are then built using wrapper and embedded computational methods.
The first wrapper method is BSWiMS, which relies on statistical tests and on forward and backward selection procedures to generate a logistic model; its best AUC was 0.689. The second wrapper method is based on stochastic search: an ensemble of Genetic Algorithms coupled to a Support Vector Machine classifier, followed by Forward Selection, which achieved a maximum AUC of 0.716. The third algorithm is LASSO, one of the best-known embedded methods, which uses L1 regularization and performs feature selection during model training. This classifier achieved an AUC of 0.8005. The study also analyzes the samples that the predictive models classify poorly, as a strategy for building models with higher predictive performance. The best result came from a mixed model that combines variants from the previous models, which outperformed the others with an AUC of 0.842. These results are promising, since the LASSO model showed the highest discrimination between classes among the three individual methods, based solely on genetic data. The biological relevance of the markers in the models is presented through their associated genes. The models replicated variants previously associated with Alzheimer's disease, especially on chromosome 19 near the APOE gene.
dc.description.degree: Master of Science in Computer Science
dc.format.medium: Text
dc.identificator: 7||33||3304||120304
dc.identifier.citation: Romero Rosales, B. L. (2019). Identifying models of DNA polymorphisms associated with Alzheimer’s Disease using Step-Wise and Genetic Algorithms from GWAS data. Instituto Tecnológico y de Estudios Superiores de Monterrey. Campus Monterrey.
dc.identifier.uri: http://hdl.handle.net/11285/633075
dc.language.iso: spa
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey
dc.relation.impreso: 2019-04-26
dc.relation.isFormatOf: published version
dc.relation.isreferencedby: REPOSITORIO NACIONAL CONACYT
dc.rights: openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::ARTIFICIAL INTELLIGENCE
dc.subject.keyword: gas
dc.subject.keyword: machine learning
dc.subject.keyword: step-wise methods
dc.subject.keyword: genetic algorithms
dc.subject.keyword: alzheimer's disease
dc.subject.lcsh: Engineering & Applied Sciences
dc.title: Identifying models of DNA polymorphisms associated with Alzheimer’s disease using step-wise and genetic algorithms from GWAS data
dc.type: Master's thesis
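
The abstract in this record mentions a chi-squared association test for pre-selecting markers and compares models by AUC. The sketch below illustrates both quantities on toy numbers. It is not the thesis code: it omits the kinship correction, works on a 2x2 table of allele counts, and the function names `chi2_allelic` and `auc` as well as all input values are hypothetical.

```python
from typing import Sequence, Tuple


def chi2_allelic(case_counts: Tuple[int, int],
                 control_counts: Tuple[int, int]) -> float:
    """Pearson chi-squared statistic (1 degree of freedom) for a 2x2 table.

    Each tuple holds (minor allele count, major allele count) for one group.
    No kinship or population-structure correction is applied here.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row_cases, row_controls = a + b, c + d
    col_minor, col_major = a + c, b + d
    # A table with an empty margin carries no association signal.
    if 0 in (row_cases, row_controls, col_minor, col_major):
        return 0.0
    return n * (a * d - b * c) ** 2 / (
        row_cases * row_controls * col_minor * col_major
    )


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case (label 1) scores
    higher than a randomly chosen control (label 0); ties count as 0.5.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Toy SNP: 30 minor / 70 major alleles in cases, 10 / 90 in controls.
    stat = chi2_allelic((30, 70), (10, 90))
    # 3.841 is the 5% critical value for 1 degree of freedom; a real GWAS
    # would use a far stricter, multiple-testing-aware threshold.
    print(f"chi2 = {stat:.2f}, passes toy filter: {stat > 3.841}")

    # Toy model scores for two cases and two controls.
    print(f"AUC = {auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]):.2f}")
```

The pairwise AUC is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable for cohorts of realistic size.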

Files

Original bundle

Name: Caratula firmas.pdf
Size: 213.42 KB
Format: Adobe Portable Document Format
Description: Signature cover page

Name: Carta de Autorización2.pdf
Size: 1.62 MB
Format: Adobe Portable Document Format
Description: Authorization letter

Name: Thesis_May19_BrissaLizbethRomeroRosales.pdf
Size: 9.88 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.3 KB
Format: Item-specific license agreed upon to submission
Description: logo
