Ciencias Exactas y Ciencias de la Salud
Permanent URI for this collection: https://hdl.handle.net/11285/551039
This collection contains theses and degree projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.
Search Results
- Estimation of ancestry in the Mexican population using informative genetic markers (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024) Valdez Alvarez, Héctor; Treviño Alvarado, Víctor Manuel; emipsanchez; Orozco Orozco, Lorena Sofía; García Ortiz, Humberto; Martínez Ledesma, Juan Emmanuel; Escuela de Ingeniería y Ciencias; Campus Monterrey; Garza Hernández, Debora
The study of genetic ancestry has become an essential component of modern genetics, offering insights into the origins and migrations of human populations. This thesis presents the development of a genetic ancestry panel specifically tailored for the Mexican population, a group characterized by its high genetic diversity and complex admixture. The primary objective of this research is to accurately estimate the proportions of ancestry in Mexicans using informative genetic markers, thereby addressing the underrepresentation of this population in Genome-Wide Association Studies (GWAS). In the initial phase, various genetic databases were considered, and three were selected for the development of the ancestry panel: the 1000 Genomes Project (1000G), the Human Genome Diversity Project (HGDP), and the Metabolic Analysis in an Indigenous Sample (MAIS). The integration of these datasets provided a comprehensive view of the genetic diversity that is crucial for the panel's accuracy. Principal Component Analysis (PCA) was employed to visualize the genetic structure and verify the separation of ancestral groups. The results confirmed the integrity of the selected datasets. Three methods for selecting Ancestry Informative Markers (AIMs) were developed and evaluated: Top K, Balanced K, and SumInfo K. Although Balanced K and SumInfo K performed better than Top K, integrating Mexican data (MAIS) posed significant challenges, particularly due to the influence of East Asian populations.
To address these issues, a revised strategy was implemented, focusing on optimizing AIM selection and improving the robustness of the panel. This involved a detailed workflow and validation process, ensuring the final panel's reliability. Despite the challenges, the new strategy demonstrated promising results, and the final panel is expected to be completed soon. The developed ancestry panel has significant implications for forensic science, personalized medicine, and anthropological research. By accurately estimating ancestry proportions in the Mexican population, this research contributes to a broader understanding of genetic diversity and supports more effective medical and forensic applications. Future work will focus on finalizing the panel and applying it to the oriGen project, which aims to analyze genetic data from a large cohort of Mexicans, further enhancing the understanding of this population's genetic landscape.
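As an illustration of what ranking Ancestry Informative Markers can look like, the following is a minimal sketch of a top-k selection. The abstract does not define Top K, Balanced K, or SumInfo K, so the scoring rule used here (the largest absolute allele-frequency difference between any two reference populations) is an assumption, and the marker IDs, population labels, and frequencies are hypothetical.

```python
from itertools import combinations


def max_pairwise_delta(freqs):
    """Largest absolute allele-frequency difference between any two populations."""
    return max(abs(a - b) for a, b in combinations(freqs.values(), 2))


def top_k_markers(marker_freqs, k):
    """Return the k marker IDs with the highest score, best first.

    Assumption: the score is the max pairwise delta; the thesis may use
    a different informativeness measure.
    """
    ranked = sorted(
        marker_freqs,
        key=lambda marker: max_pairwise_delta(marker_freqs[marker]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical allele frequencies per reference population.
marker_freqs = {
    "snp_a": {"AFR": 0.90, "EUR": 0.10, "NAT": 0.15},
    "snp_b": {"AFR": 0.50, "EUR": 0.45, "NAT": 0.55},
    "snp_c": {"AFR": 0.20, "EUR": 0.25, "NAT": 0.85},
}

selected = top_k_markers(marker_freqs, 2)
print(selected)
```

A purely score-ordered selection like this can concentrate on markers that separate only one pair of populations, which is one plausible reason a balanced variant might perform better.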
- Antimicrobial resistance prediction by bacterial genome-wide-association-study in non-fermenting bacilli with critical priority (Pseudomonas aeruginosa and Acinetobacter baumannii). (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-07) Barlandas Quintana, Erick Alan; MARTINEZ LEDESMA, JUAN EMMANUEL; 200096; Martinez Ledesma, Juan Emmanuel; puemcuervo; González Mendoza, Miguel; Garza González, Elvira; School of Engineering and Sciences; Campus Estado de México; Cuevas Díaz Durán, Raquel
Antimicrobial resistance (AMR), or drug resistance, is a natural phenomenon in which microorganisms change their molecular, physical, or chemical structures to resist the drugs used to treat the infections they cause. The World Health Organization (WHO) has released, for the first time, a list of Multidrug-Resistant Bacteria (MRB) that pose the greatest threat to human health and for which new antibiotics are desperately needed. Carbapenem-resistant Acinetobacter baumannii and Pseudomonas aeruginosa belong to the group of Gram-negative non-fermenting bacilli that the WHO classifies as critical priority. Accordingly, the aim of this research was to build and train a bioinformatic pipeline capable of finding critical k-mers that differentiate carbapenem-resistant strains of P. aeruginosa and A. baumannii. Four k-mer sizes (12, 14, 16, and 18) were evaluated for each bacterium, and two train/test splits (70:30 and 80:20) were used with seven machine learning algorithms: Random Forest, AdaBoost, XGBoost, Decision Trees, Bagging Classifier, Support Vector Machine, and KNN. For both bacteria, the best models were obtained with a k-mer length of 12. For Acinetobacter baumannii, the best models reached a testing accuracy of 0.99. For Pseudomonas aeruginosa, the best accuracy was 0.93, obtained with the Bagging Classifier.
To investigate the sequences of the k-mers obtained, the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) was used. Ten to twenty sequences built from the k-mers were investigated for each model. With a k-mer length of 12 for A. baumannii, 18 out of 20 sequences corresponded to a sequence that is crucial for resistance to carbapenems (meropenem and imipenem). For P. aeruginosa, 16 out of 20 sequences corresponded to a key sequence. To complement this research, a dynamic programming algorithm was used to find changes relative to the reference genome that could explain carbapenem resistance in the resistant genomes. Not all of the resistance-associated k-mer sequences were found in the reference genome, as some of them may have been acquired by horizontal gene transfer (conjugation, transformation, or transduction). Further investigation of these sequences could support the design of new targeted antibiotics or the easy detection of carbapenem-resistant strains of Pseudomonas aeruginosa or Acinetobacter baumannii.
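The k-mer features described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis pipeline: the function names, the toy sequences, and the choice of raw counts as features are assumptions, and a short k is used only to keep the example readable (the abstract reports k = 12 to 18).

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_feature_matrix(genomes, k):
    """Build one count vector per genome over a shared, sorted k-mer vocabulary."""
    counts = {name: count_kmers(seq, k) for name, seq in genomes.items()}
    vocabulary = sorted(set().union(*counts.values()))
    matrix = {
        name: [kmer_counts[kmer] for kmer in vocabulary]
        for name, kmer_counts in counts.items()
    }
    return vocabulary, matrix


# Hypothetical toy genomes; real inputs would be assembled bacterial genomes.
genomes = {
    "resistant_1": "ACGTACGT",
    "susceptible_1": "ACGTTTTT",
}

vocabulary, matrix = kmer_feature_matrix(genomes, k=4)
for name, row in matrix.items():
    print(name, dict(zip(vocabulary, row)))
```

Each row of the resulting matrix could then be passed, together with a resistant/susceptible label, to any of the classifiers named in the abstract.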
- Association of gene expression signatures with genomic alterations and clinical outcomes (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-11-01) Ramos García, Axel Alejandro; MARTINEZ LEDESMA, JUAN EMMANUEL; 200096; Martínez Ledesma, Juan Emmanuel; puelquio, emipsanchez; Treviño Alvarado, Víctor Manuel; Cuevas Díaz Durán, Raquel; Aguirre Gamboa, Raúl; Escuela de Ingeniería y Ciencias; Campus Monterrey
Technological advances applied to molecular biology have enabled more numerous and more complex experiments, whose outcomes have been collected in massive databases, prompting the emergence of new disciplines and innovative approaches for analyzing these data. One of these disciplines is bioinformatics, in which high-throughput data have been used to understand diseases such as cancer, with the aim of improving classification and diagnosis and identifying possible new treatments. Available data range from whole-genome sequencing to tissue images, proteomics, and metabolomics. For gene expression profiles, one of the most widely used approaches is single-gene analysis: expression is measured gene by gene, case and control samples are compared with a statistical test (t-test, Wilcoxon rank-sum) to assign a p-value to every gene, a threshold filter identifies the significant genes, and the results are then interpreted biologically. However, this approach has several shortcomings. The multiple-testing adjustment required by the number of tests performed can lead to information loss, turning some relevant genes into false negatives.
In addition, the use of arbitrary threshold values produces false positives when the threshold is set too high and false negatives when it is set too low. Modifications in biological processes are related to groups of genes; thus, measuring the variation in the expression level of these groups allows a better biological interpretation. These groups of genes have been identified and are now available in several public databases; such collections are known as gene sets, and they can be used to provide better insight when analyzing expression data. Thus, the purpose of this thesis was to determine whether the scores obtained through single-sample gene set enrichment analysis (ssGSEA) for the bibliography, Hallmark, Oncogenic, CMAP Up, and CMAP Down collections are relevant for cancer subtype classification by unsupervised learning techniques (hierarchical clustering) and for identifying the pathways involved in the presence or absence of gene mutations. Finally, by relating these scores to survival probability, we were able to determine the life expectancy of people and candidate treatment drugs, based on the expression level of a given gene set related to a specific biological process, chemical alteration, or aberration.
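The effect of multiple-testing adjustment on single-gene analysis, as discussed in this abstract, can be illustrated with a small sketch. The abstract does not say which adjustment was used; the Benjamini-Hochberg procedure below is an assumption chosen because it is widely used, and the gene names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so the adjusted
    # values stay monotone in the rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(gene_p_values, alpha=0.05):
    """Keep the genes whose adjusted p-value is at or below alpha."""
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q <= alpha]


# Hypothetical raw p-values from a per-gene case-versus-control test.
raw = {"gene_a": 0.001, "gene_b": 0.04, "gene_c": 0.03, "gene_d": 0.5}

print(significant_genes(raw))
```

In this toy input, gene_b and gene_c have raw p-values below 0.05 but no longer pass after adjustment, which is the kind of potential false negative the abstract attributes to single-gene analysis.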