Exact Sciences and Health Sciences

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains master's theses and degree works from the Schools of Engineering and Sciences and of Medicine and Health Sciences.

Search Results

Now showing 1 - 5 of 5
  • Master's thesis
    Enhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-06) Chacón Ramírez, Denisse; Rangel Escareño, Claudia; emipsanchez; Gómez Romero, Laura Lucila; Hernández Lemus, Enrique; Reséndis Antonio, Osbaldo; School of Engineering and Sciences; Campus Monterrey
    Single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics have revolutionized our understanding of cellular heterogeneity and tissue organization. However, extracting biological insights from these technologies remains challenging because the data are high-dimensional, sparse, and noisy. Two critical but understudied problems hinder robust analysis: (1) the impact of feature selection strategies on cell-type identification, and (2) the role of data imputation in integrating scRNA-seq with spatial transcriptomics. While clustering and integration methods are widely benchmarked, the influence of pre-processing decisions, such as using biologically informed marker genes or imputing missing values, remains poorly understood. This thesis addresses these knowledge gaps through systematic evaluations across diverse datasets and algorithms. First, we assess how different imputation algorithms (MAGIC, DCA, scPHENIX) affect the integration of scRNA-seq with spatial transcriptomics for two tasks: cell-type deconvolution and spatial transcript prediction. Using 13 paired datasets and 10 integration tools, we found that the benefits of imputation depend on the task and the algorithm rather than being universal. SpaGE consistently outperformed other methods for transcript prediction regardless of imputation status, while RCTD demonstrated superior performance for cell deconvolution tasks. Notably, we observed that imputation primarily enhances magnitude estimation rather than improving spatial pattern preservation. Second, we evaluate whether marker gene-based feature selection improves scRNA-seq clustering accuracy compared to standard approaches. By benchmarking seven algorithms (Seurat, SC3, CIDR, etc.) across five pancreatic datasets, we demonstrate that performance gains are algorithm- and dataset-dependent. SC3 and TSCAN benefited from marker gene selection across multiple datasets, while SIMLR showed dramatic dataset-dependent responses, yielding superior ARI scores (greater than 0.7) in some contexts but diminished performance in others. The Segerstolpe dataset showed consistent improvements across most algorithms when using marker genes, suggesting that dataset-specific characteristics strongly influence optimal feature selection strategies. Our analysis further revealed that algorithms often identify fewer clusters than reference annotations, indicating challenges in resolving fine-grained pancreatic cell type heterogeneity. The results of this thesis emphasize that pre-processing choices must align with both analytical goals and dataset characteristics to unlock the full potential of single-cell technologies. This work provides an evidence-based framework for optimizing spatial transcriptomics and scRNA-seq analysis workflows, with implications for understanding tissue architecture and cellular dynamics across diverse biological systems.
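The ARI scores quoted in this abstract compare a clustering result against reference annotations. As a minimal sketch of how the Adjusted Rand Index is computed (standard-library Python written for illustration, not the code used in the thesis; the function name and toy labels are assumptions):

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        raise ValueError("need at least two items")

    # Contingency counts: items falling in each (reference, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs that are grouped together in each view.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    maximum = (sum_true + sum_pred) / 2          # agreement of a perfect match
    if maximum == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (sum_pairs - expected) / (maximum - expected)


# Toy labels (hypothetical): the second reference group is split in two.
reference = ["alpha", "alpha", "beta", "beta"]
predicted = [0, 0, 1, 2]
print(adjusted_rand_index(reference, predicted))
```

A value of 1 means the two partitions agree up to label renaming, and values near 0 mean agreement no better than chance, which is why a score above 0.7 is read as strong agreement.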
  • Master's thesis
    Magnetic gripper design optimization for robotic bending cell using artificial intelligence clustering of sheet metal parts
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-11-25) Treviño Treviño, Ana Paula; Ahuett Garza, Horacio; emipsanchez; Urbina Coronado, Pedro Daniel; Orta Castañón, Pedro Antonio; School of Engineering and Sciences; Campus Monterrey
    The manufacturing sector is currently facing unprecedented challenges in adapting to the constantly evolving demands of diverse product lines and rapid market changes. Conventional manufacturing systems are struggling to adapt to the increasing variety of production components, leading to notable inefficiencies and higher costs. In this context, Reconfigurable Manufacturing Systems (RMS) have emerged as a prominent strategy to boost the adaptability and responsiveness of production processes. Therefore, the design and optimization of grippers for robotic arms are deemed essential to improve efficiency and productivity. The project aims to enhance gripper design by using AI clustering techniques and dimensional analysis to cluster production components and define design parameters for novel gripper configurations. This approach aligns with the tenets of lean manufacturing and data-driven decision-making, empowering manufacturing engineers and designers. The project also aims to optimize internal design and manufacturing, reducing reliance on external suppliers and improving long-term adaptability and competitiveness by leveraging the cost reduction that in-house processes represent. The case study examines 964 sheet metal production components, highlighting the inefficiencies of manual classification, part allocation challenges, and design specification retrieval. Furthermore, it explores different scenarios to obtain the best possible cluster quality with the supplied dataset, as well as the constraints that arise when translating the design parameters into actual design properties of the grippers and the gripper-part compatibility. The thesis introduces an innovative method for managing part variety in gripper design by leveraging advanced technologies and data-driven decision-making. This results in substantial improvements in time efficiency, cost reduction, and safety, and in the elimination of inefficient workflows within the manufacturing sector.
  • Master's degree work
    Analysis of data-driven diabetes subgroups using machine learning for prediction of glycemic control and complications
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023-06-14) Nasser Kadamani, Sharif; Montesinos Silva, Luis Arturo; puemcuervo, emimayorquin; Aguilar Salinas, Carlos Alberto; Tamez Peña, José Gerardo; Treviño Alvarado, Víctor Manuel; School of Engineering and Sciences; Campus Monterrey; Santos Díaz, Alejandro
    Diabetes is a long-term illness that affects the metabolism and causes high blood sugar levels. Over time, this can result in severe harm to vital organs and tissues such as blood vessels, the heart, kidneys, eyes, and nerves. Type 2 diabetes is the most prevalent form, typically found in adults, and it develops when the pancreas is unable to produce enough insulin or when the body is unable to effectively use the insulin it produces, resulting in insulin resistance. Machine learning principles have been applied to develop algorithms that support predictive models for the risk of developing certain diseases, including diabetes, which has become a global pandemic. Managing this disease with a uniform treatment algorithm is often linked to eventual treatment failure and the development of diabetic complications, so it has become necessary to create algorithms that consider the clinical characterization of the patient, which is now possible thanks to recent advancements in the knowledge of the genomic architecture of diabetes and its complications. This research work uses machine learning classification techniques to study the subgroups generated through clustering techniques proposed by previous studies. The goal is to find patterns within the subgroups and build a model able to predict a patient's complications and response to certain treatments from the patient's clinical characterization, that is, phenotypic data. This was done in collaboration with the Unidad de Investigación de Enfermedades Metabólicas (UIEM), which is part of the Instituto Nacional de Ciencias Medicas y Nutrición Salvador Zubirán (INCMNSZ), which provided data from cohorts of diabetic patients, as well as medical knowledge and feedback from health professionals. In this study, five clusters, or data-driven diabetes subgroups, were identified in a Mexican cohort based on six clinical characteristics. It is shown that it is possible to obtain subgroups similar to those found in other studies by using METS-IR and METS-VF instead of the HOMA-IR and HOMA2-β formulas. These subgroups can be associated with different risks of complications and progression rates from treatments, which might be relevant in advancing toward precision medicine. However, after including these subgroups in the training of classification models to predict glycemic control and assess the presence of complications, the performance of the models remained similar, that is, no noticeable improvement was observed. This study showed that it is possible to build models that predict glycemic control three months after initiating treatment with good performance, as well as models that classify the presence of neuropathy. On the other hand, the classification models for nephropathy and retinopathy did not yield acceptable performance.
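The abstract does not name the clustering algorithm behind the data-driven subgroups. As a rough illustration of how patients can be grouped on standardized clinical features, the sketch below assumes k-means (an assumption, not a statement about the thesis) and uses invented two-feature toy rows rather than the six clinical characteristics or any real cohort data:

```python
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # constant columns: avoid /0
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


# Hypothetical toy rows: (feature_1, feature_2) for six patients.
rows = [(1.0, 10.0), (1.2, 11.0), (0.9, 9.5), (5.0, 50.0), (5.2, 52.0), (4.9, 49.0)]
labels, _ = kmeans(standardize(rows), k=2)
print(labels)
```

Standardizing first matters because clinical variables are measured on very different scales; without it, the variable with the largest numeric range would dominate the Euclidean distance.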
  • Master's thesis
    Evolutionary clustering using classifiers: definition, implementation, scalability, and applications
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2022-04-22) Sainz Tinajero, Benjamin Mario; GUTIERREZ RODRIGUEZ, ANDRES EDUARDO; 436765; Guiterrez Rodriguez, Andres Eduardo; puemcuervo; Ortiz Bayliss, Jose Carlos; Amaya Contreras, Ivan Mauricio; Medina Perez, Miguel Angel; School of Engineering and Sciences; Campus Estado de México
    Clustering is a Machine Learning tool for automatically partitioning multi-dimensional data into mutually exclusive groups, aiming to reflect the patterns of the phenomena the data represent. Clustering algorithms perform this task conditioned by the clustering criterion modeled in their objective function. However, selecting the optimal criterion is a domain-dependent task that requires information on the cluster structure, which a user often does not have due to the unsupervised nature of the technique. Available approaches accentuate this problem because they perform clustering according to a similarity notion often limited to the concepts of compactness and connectedness, and their use of conventional distance functions induces bias and favors clusters with certain shape, size, or density properties. We cannot consider this a complete notion of a cluster, because not every dataset will comply with both notions in the same proportion. Hence, research on this topic has not converged to a standard definition of a cluster, which raises the need for algorithms that produce adaptive solutions that mirror the underlying structures and relations within the data. This thesis focuses on the design of single-objective Evolutionary Clustering Algorithms that generate solutions that are not biased towards any cluster structure by optimizing a novel generalization clustering criterion. To achieve that, we designed objective functions modeled as a supervised learning problem, considering that a good partition should induce a well-trained classifier. We therefore assess the quality of a clustering solution according to its capability to train an ensemble of classifiers. The main contribution of this thesis is our series of Evolutionary Clustering Algorithms using Classifiers (the ECAC series), which introduces the aforementioned clustering criterion along with evolutionary computation. This meta-heuristic allows us to model distinct criteria to optimize while creating and evaluating multiple solutions along the process. The experimental results in the design of our family of methods, ECAC, F1-ECAC, and ECAC-S, show an increase in similarity between the partitions created by our algorithms and the ground truth labels (obtained from the publicly available repositories where we retrieved the data), with a maximum Adjusted Rand Index of 0.96. Our second algorithm, F1-ECAC, proved the competitiveness of our contributions against traditional, single-objective, and multi-objective Evolutionary Clustering algorithms, showing no statistically significant difference against k-means, HG-means, and MOCLE. Our latest contribution, ECAC-S, was tested on a satellite image segmentation task, and it produced segmentations with a higher average Adjusted Rand Index than k-means, spectral clustering, BIRCH, and DBSCAN in 4 out of 10 images.
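The idea that a good partition should induce a well-trained classifier can be sketched as a fitness function that scores a candidate partition by how well a classifier recovers its labels. The example below is a simplified illustration only: it swaps in a single leave-one-out 1-nearest-neighbour classifier, whereas the abstract describes an ensemble of classifiers, and it omits the evolutionary search entirely. The function name and toy points are hypothetical.

```python
from math import dist


def loo_1nn_accuracy(points, labels):
    """Score a partition by leave-one-out 1-NN accuracy on its own labels.

    For every point, the nearest *other* point votes for a cluster label;
    the score is the fraction of points whose label is recovered.
    """
    hits = 0
    for i, p in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        hits += labels[nearest] == labels[i]
    return hits / len(points)


# Two well-separated toy blobs (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
coherent = [0, 0, 0, 1, 1, 1]   # partition that follows the blobs
scrambled = [0, 1, 0, 1, 0, 1]  # partition that ignores them

print(loo_1nn_accuracy(points, coherent))
print(loo_1nn_accuracy(points, scrambled))
```

In an evolutionary setting, a score of this kind would serve as the objective that candidate partitions compete on, so partitions a classifier can learn easily survive regardless of cluster shape.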
  • Master's thesis
    Pipeline evaluation of clustering algorithms aimed at clinical data
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2018-05-22) Duarte Dyck, David Absalón; Tamez Peña, José G.; Terashima Marín, Hugo; Treviño Alvarado, Víctor M.
    Disease understanding is key in designing effective treatments and diagnostic tools. A key aspect of this understanding is grouping patients according to their phenotypes. Phenotypes are patterns in the characteristics of certain members of a population that are correlated with a particular illness. This grouping may be useful in revealing associations between disease risk, treatment responses, and other key clinical outcomes. Once these associations are found, it is easier to design tailored diagnostic tools and effective personalized treatments. To achieve this grouping goal, data is key, and recent advancements in digital technology have made it possible to capture hundreds or even thousands of clinical data points that may be used to group patients into different disease phenotypes. To handle hundreds of patients with hundreds of features, clinical researchers use clustering algorithms that automatically find hidden associations between subjects. These algorithms are very useful once the researcher selects the correct clustering algorithm and configures it for the specific research task. Selecting the correct clustering algorithm is time-consuming, and setting up its parameters may take several trial-and-error sessions. On the other hand, computer scientists have developed several clustering metrics that can evaluate the fitness of clustering algorithms to the data, and computing power has increased, allowing the automated testing and evaluation of clustering algorithms on a specific dataset. The objective of this proposal was the development of an automated computer pipeline that evaluates several clustering algorithms, providing metrics regarding important features such as clustering stability (Jaccard index) and clustering relevance (ANOVA test). Furthermore, the pipeline returns the number of natural clusters that may be useful for the given dataset (Dunn index). The designed pipeline was set up to evaluate the classical clustering algorithms k-means, Fuzzy C-means, and hierarchical clustering, but it can also be used to test a user-provided clustering method. The evaluation consisted of bootstrapping the data and computing the Dunn and Jaccard clustering indexes. Furthermore, the clinical relevance of the final clusters was evaluated using an ANOVA test, which provided indications of disease phenotypes. All the test results are plotted so that the user can visually evaluate the performance of the different clustering methods on their data. The resulting tool was implemented in R (github.com/majordave/clustest). The utility of the pipeline was tested on synthetic datasets and on two radiomics datasets associated with the development of osteoarthritis (OA) and the presence of breast cancer in mammograms. Furthermore, we contrasted the clustering approach with supervised learning on a large dataset on the association of nutrition with OA symptoms. Hence, the present work established that the automated, robust evaluation of the utility of clustering algorithms on clinical data is feasible, and provided a publicly available software tool that any clinical researcher can use to select the best clustering algorithm for their data.
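The two indexes named in this abstract can be sketched briefly. The pipeline itself is described as an R tool, so the Python below is not its code: it is a minimal standard-library illustration of one common definition of the Dunn index (smallest between-cluster distance over largest within-cluster diameter) and of the Jaccard similarity between two clusters, for example a cluster found on the full data and its counterpart found on a bootstrap resample. Function names and toy data are assumptions.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Largest distance between two points that share a cluster.
    max_diameter = max(
        (dist(a, b) for members in clusters.values() for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points in different clusters.
    min_separation = min(
        dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    if max_diameter == 0:
        return float("inf")  # every cluster is a single point or duplicates
    return min_separation / max_diameter


def jaccard(cluster_a, cluster_b):
    """Jaccard similarity between two clusters given as collections of item ids."""
    a, b = set(cluster_a), set(cluster_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy example (hypothetical data): two tight, well-separated clusters.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
print(dunn_index(points, [0, 0, 1, 1]))

# Overlap between a cluster and its counterpart from a resampled run.
print(jaccard({1, 2, 3}, {2, 3, 4}))
```

Higher Dunn values indicate compact, well-separated clusters, which is why the index can be compared across candidate cluster counts, while Jaccard values close to 1 across resamples indicate that a cluster is stable.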
Unless otherwise specified, these materials are shared under the following terms: Attribution-NonCommercial-NoDerivatives CC BY-NC-ND http://www.creativecommons.mx/#licencias

The user is obliged to use the services and content provided by the University, in particular printed and electronic resources, in accordance with current legislation, the principles of good faith, and generally accepted practice, without contravening public order, especially where, for the proper performance of their activity, the user needs to reproduce, distribute, communicate, and/or make available fragments of printed works or works that may exist in analog or digital format, whether on paper or electronic media. Law 23/2006 of 7 July, amending the revised text of the Intellectual Property Law, approved

DSpace software copyright © 2002-2026

License