Ciencias Exactas y Ciencias de la Salud
Permanent URI for this collection: https://hdl.handle.net/11285/551039
This collection contains theses and degree projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.
- Association of gene expression signatures with genomic alterations and clinical outcomes (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-11-01) Ramos García, Axel Alejandro; MARTINEZ LEDESMA, JUAN EMMANUEL; 200096; Martínez Ledesma, Juan Emmanuel; puelquio, emipsanchez; Treviño Alvarado, Víctor Manuel; Cuevas Díaz Durán, Raquel; Aguirre Gamboa, Raúl; Escuela de Ingeniería y Ciencias; Campus Monterrey
Technological advances in molecular biology have enabled more numerous and more complex experiments, whose results are collected in massive databases. This has prompted the emergence of new disciplines and of innovative approaches for analyzing such data. One of these disciplines is bioinformatics, in which high-throughput data are used to study diseases such as cancer in order to improve classification and diagnosis and to suggest possible new treatments. Available data range from whole-genome sequencing to tissue images, proteomics, and metabolomics. For gene expression profiles, one of the most widely used approaches is single-gene analysis: the expression level is measured gene by gene, case and control samples are compared with a statistical test (t-test, Wilcoxon rank-sum) to assign a p-value to every gene, significant genes are identified by applying a threshold, and the results are then interpreted biologically. However, this approach has some shortcomings. First, the adjustment required by the large number of tests performed can lead to information loss, so that some relevant genes become false negatives. Second, the use of arbitrary threshold values produces false positives when the threshold is set too high and false negatives when it is set too low.
Modifications in biological processes are related to groups of genes; measuring the variation in the expression level of these groups therefore allows a better biological interpretation. These groups of genes have been identified and are available in several public databases. Such collections are known as gene sets, and they can provide better insight when analyzing expression data. The purpose of this thesis was to determine whether the scores obtained through single-sample gene set enrichment analysis from the bibliography, Hallmark, Oncogenic, CMAP Up, and CMAP Down collections are relevant for cancer subtype classification by unsupervised learning techniques (hierarchical clustering) and for identifying the pathways involved in the presence or absence of gene mutations. Finally, by relating this score to survival probability, we were able to determine the life expectancy of people and candidate treatment drugs, based on the expression level of a given gene set related to a specific biological process, chemical alteration, or aberration.
- Predicting drug Responses in cancer cells using genomic features and machine learning (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-05) Evans Trejo, Cody Eduardo; Treviño Alvarado, Víctor Manuel; ilquio; Tamez Peña, José; Martínez Torteya, Antonio; Escuela de ingeniería y ciencias; Campus Estado de México; Martínez Ledesma, Juan Emmanuel
This document presents an analysis of the prediction of drug responses in cancer cells using cancer genomic features and machine learning, for the Master's Degree in Computational Sciences at Instituto Tecnológico y de Estudios Superiores de Monterrey. Cancer is a genetic disease characterized by the progressive accumulation of mutations. Several genomic features are involved in oncogenesis, such as gene mutation, copy number, expression, and epigenetic alterations. These features vary with the person and the type of cancer, making it difficult to determine whether a drug will produce a successful response in each specific case. Recently, two large-scale pharmacogenomic studies screened multiple anticancer drugs on over 1000 cell lines in an effort to elucidate the response mechanisms of anticancer drugs. Based on these data, we propose a drug-response prediction framework that uses gene expression, methylation, copy number, mutation, and protein expression features together with drug sensitivity data from the Cancer Cell Line Encyclopedia (CCLE) database. For this, we compare the performance of several algorithms: Random Forest, Support Vector Machine, Elastic-Net, and Extreme Gradient Boosting Tree (XGBoost). The robustness of our model was assessed by cross-validation. Among the individual datasets, the RNAseq dataset with XGBoost obtained the highest average accuracy.
Our unified model achieved good cross-validation performance for most drugs in the Cancer Cell Line Encyclopedia (≥85% accuracy). These results suggest that drug response can be effectively predicted from genomic features using a battery of machine learning algorithms. Our model could be applied to predict the response to certain drugs and could potentially play a complementary role in personalized medicine.
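The single-gene analysis workflow summarized in the first abstract above (a per-gene test, an adjustment for the number of tests, then a threshold filter) can be sketched roughly as follows. This is only an illustrative sketch, not the method used in either thesis. It assumes a Wilcoxon rank-sum test computed with a plain normal approximation (no tie or continuity correction) and a Benjamini-Hochberg adjustment, which the abstract does not name. The gene names, expression values, and the 0.05 threshold are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_pvalue(case, control):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    values = [(v, 0) for v in case] + [(v, 1) for v in control]
    values.sort(key=lambda pair: pair[0])
    # Assign 1-based ranks, averaging the ranks of tied values.
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[j + 1][0] == values[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(case), len(control)
    # Rank sum of the case group, compared with its mean and spread under the null.
    w = sum(r for r, (_, group) in zip(ranks, values) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(expression, alpha=0.05):
    """Return genes whose adjusted p-value is at most alpha (an arbitrary threshold)."""
    genes = list(expression)
    pvalues = [rank_sum_pvalue(*expression[gene]) for gene in genes]
    adjusted = benjamini_hochberg(pvalues)
    return [gene for gene, q in zip(genes, adjusted) if q <= alpha]


# Hypothetical toy data: gene -> (case expression values, control expression values).
toy_expression = {
    "GENE_A": ([8.1, 7.9, 8.4, 8.6, 8.2, 8.8], [5.0, 5.3, 4.9, 5.1, 5.4, 5.2]),
    "GENE_B": ([5.1, 4.8, 5.3, 5.0, 4.9, 5.2], [5.0, 5.2, 4.7, 5.1, 5.3, 4.95]),
}
selected = significant_genes(toy_expression)
```

With thousands of genes instead of two, the adjustment step raises every p-value, which is where the loss of relevant genes mentioned in the abstract can come from; changing `alpha` shows how much the selected list depends on the chosen threshold.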