Exact Sciences and Health Sciences

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains theses and degree projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.

Now showing 1 - 10 of 17
  • Master's thesis
    Deep learning framework to predict and generate new fluorescent molecules from experimental data
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-07) Azizi, Mina; Aguirre Soto, Héctor Alán; emipsanchez; Ray, Mallar; Bernal Neira, David Esteban; Mendoza Cortés, José Luis; School of Engineering and Sciences; Campus Monterrey; Flores Tlacuahuac, Antonio
    Fluorescent molecules play important roles in biological imaging, diagnostics, and materials science. However, identifying efficient and effective fluorophores remains challenging, as traditional trial-and-error experimentation and in silico computations are both costly and time-consuming. To address this, this thesis presents a deep learning approach to streamline the discovery process by predicting optical properties and generating novel fluorescent molecules directly from experimental data. The study is based on FluoDB, a publicly available dataset collected from the literature, containing over 55,000 fluorophore–solvent pairs with experimentally measured optical properties. Graph Convolutional Network (GCN) models were trained to predict four key optical properties and effectively captured complex structure–property relationships, achieving R² values ranging from 0.49 to 0.87 across the different targets. A Conditional Variational Autoencoder (CVAE) was also implemented to generate novel fluorescent molecules based on solvent identity and target absorption range. In total, 2573 valid and structurally diverse molecules were generated, with a variety of predicted optical behaviors. Together, the predictive and generative models provide a useful, data-driven approach to accelerate the exploration and design of functional fluorescent materials.
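The R² values quoted above are coefficients of determination. As a reminder of what that figure measures, here is a minimal sketch in plain Python; the function name `r_squared` and the sample values are illustrative assumptions, not code or data from the thesis.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical absorption maxima (nm), used only to exercise the function.
measured = [450.0, 480.0, 510.0, 540.0]
predicted = [455.0, 478.0, 505.0, 545.0]
score = r_squared(measured, predicted)
```

A score of 1.0 means perfect predictions, while 0.0 means the model does no better than predicting the mean of the measured values.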
  • Master's thesis
    Transformer-based hand landmark prediction from superficial electromyography
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-06) Ramos García deAlba, Diego Armando; Chairez Oria, Jorge Isaac; emipsanchez; Sánchez Ante, Gildardo; School of Engineering and Sciences; Campus Monterrey; Fuentes Aguilar, Rita Quetziquel
    The development of human-robot systems has become increasingly prominent in recent years, particularly in domains that require seamless and intuitive interactions between humans and machines, such as healthcare, manufacturing, rehabilitation, and entertainment. Within these fields, upper-limb robotics and prosthetics have experienced significant growth, where control strategies play a central role in user experience, functionality, and long-term usability. Among the available strategies, myoelectric control, which utilizes electrical activity generated by muscle contractions to drive robotic actuators, stands out for its potential to provide direct and responsive user intent decoding. Despite its promise, current commercial myoelectric control systems suffer from notable limitations. Most commercially available upper-limb robotic devices rely on binary or proportional control paradigms. Binary control allows the user to initiate simple, on/off commands (e.g., open or close a prosthetic hand). In contrast, proportional control scales the degree of motion or force based on the magnitude of the input signal. While these methods are relatively straightforward to implement and train, they inherently limit the functionality of robotic systems by constraining them to a narrow range of discrete, non-adaptive actions. As a result, users often experience frustration due to unnatural movements, lack of fluidity, and the inability to perform complex, multi-joint, or continuous tasks. Pattern recognition (PR)-based control has emerged as a more advanced alternative to binary or proportional schemes. PR control systems employ machine learning algorithms to classify muscle activity into predefined gesture categories. This approach improves intuitiveness by enabling the recognition of multiple movements and gestures, offering a more versatile control interface. However, PR control is also restricted in significant ways.
    Its effectiveness is typically bound by the limited number of gestures used during the training phase, making the system inflexible to untrained motions or novel hand configurations. Additionally, the reliance on discrete classification does not accommodate continuous, dynamic control, which is crucial for achieving truly natural and precise robotic movement. To address these limitations, we propose a novel method for predicting continuous hand movement using surface electromyography (sEMG) signals through the application of a multimodal transformer architecture. Unlike traditional PR systems that output gesture classes, our approach is designed to estimate continuous hand landmark positions, thereby enabling fluid and unrestricted movement trajectories. This method represents a paradigm shift in myoelectric control, moving from classification-based strategies to continuous regression-based motion estimation. The proposed system employs a transformer model, a deep learning architecture originally designed for natural language processing, that excels at capturing complex temporal and contextual relationships in sequential data. In the context of sEMG, transformer models offer several key advantages over traditional convolutional or recurrent neural networks. First, transformers eliminate the need for handcrafted feature engineering, which has historically been a challenging and subjective component of EMG signal processing. Instead, the transformer architecture inherently learns relevant features from raw sEMG input data through self-attention mechanisms. Second, transformers can simultaneously model spatial and temporal dependencies within the input sequence. This is crucial for decoding sEMG signals, which exhibit both spatial complexity across different muscle groups and temporal dynamics associated with movement initiation and execution.
    Finally, transformer models are more parameter-efficient and scalable, making them adaptable to different limb configurations, electrode placements, and control environments. Our multimodal architecture takes sEMG signals as input and outputs the continuous positions of hand landmarks, the key spatial reference points on the hand that define its posture and motion. By focusing on hand landmark prediction rather than gesture classification, the system bypasses the inherent limitations associated with a finite gesture vocabulary. This allows users to perform an unlimited range of movements, including intermediate postures and transitions between gestures, without retraining or expanding the model’s gesture set. The landmark-based output also facilitates integration with existing computer vision and robotic control systems, many of which use landmark-based representations for motion planning and kinematic modeling. The development and validation of our approach involved collecting synchronized sEMG and hand motion data from a cohort of participants performing a variety of hand movements. Hand landmarks were extracted using vision-based tracking systems, serving as ground truth labels for model training and evaluation. The transformer model was trained to map multi-channel sEMG signals to the corresponding hand landmark coordinates over time. Extensive experimentation demonstrated that our model not only outperformed baseline architectures in terms of accuracy and generalization but also required less training data due to the efficiency of the self-attention mechanism. Qualitative evaluations further confirmed that the predicted hand trajectories were smooth, natural, and closely aligned with actual user intent, indicating the system’s potential for real-time application in robotic prosthetics and exoskeletons.
    In summary, this study introduces a transformative approach to myoelectric control by leveraging a multimodal transformer architecture for continuous hand movement prediction. By shifting the focus from discrete gesture classification to continuous motion estimation via landmark regression, our method addresses several long-standing challenges in the field, including limited gesture scalability, unnatural control behavior, and reliance on handcrafted features. The use of transformers for modeling spatiotemporal dependencies in sEMG data represents a significant advancement, opening the door to more intuitive, responsive, and user-centric human-robot interaction systems.
  • Master's thesis
    Beyond images: ConvNeXt vs. vision-language models for automated breast density classification in screening mammography
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-06) Molina Román, Yusdivia; Santos Díaz, Alejandro; emipsanchez; Menasalvas Ruiz, Ernestina; Tamez Pena, José; Montesinos Silva, Luis Arturo; School of Engineering and Sciences; Campus Estado de México
    This study evaluates and compares the effectiveness of different deep learning approaches for automated breast density classification according to the BI-RADS system. Specifically, the research examines two distinct architectures: ConvNeXt, a CNN-based model, and BioMedCLIP, a vision-language model that integrates textual information through token-based labels. Using mammographic images from TecSalud at Tecnológico de Monterrey, the study assesses these models across three distinct learning paradigms: zero-shot classification, linear probing with token-based descriptions, and fine-tuning with numerical class labels. The experimental results demonstrate that while vision-language models offer theoretical advantages in terms of interpretability and zero-shot capabilities, CNN-based architectures with end-to-end fine-tuning currently deliver superior performance for this specialized medical imaging task. ConvNeXt achieves an accuracy of up to 0.71 and F1 scores of 0.67, compared to BioMedCLIP’s best performance of 0.57 accuracy with linear probing. A comprehensive analysis of classification patterns revealed that all models encountered difficulties in distinguishing between adjacent breast density categories, particularly heterogeneously dense tissue. This challenge mirrors known difficulties in clinical practice, where even experienced radiologists exhibit inter-observer variability in density assessment. The performance discrepancy between models was further examined through detailed loss curve analysis and confusion matrices, revealing specific strengths and limitations of each approach. A key limitation in BioMedCLIP’s performance stemmed from insufficient semantic richness in the textual tokens representing each density class. When category distinctions relied on subtle linguistic differences, such as "extremely" versus "heterogeneously", the model struggled to form robust alignments between visual features and textual descriptions.
    The research contributes to the growing body of knowledge on AI applications in breast imaging by systematically comparing traditional and multimodal approaches under consistent experimental conditions. The findings highlight both the current limitations and future potential of vision-language models in mammographic analysis, suggesting that enhanced textual descriptions and domain-specific adaptations could bridge the performance gap while preserving the interpretability benefits of multimodal approaches for clinical applications.
  • Master's thesis
    PassID: A Modular System for Pass Detection with Integrated Player Identification in Football
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-12) Gutiérrez Padilla, Benjamín; Monroy Borja, Raúl; emimmayorquin; Gutiérrez Rodríguez, Andrés Eduardo; School of Engineering and Sciences; Campus Monterrey; Conant Pablos, Santiago Enrique
    The analysis of football passes plays a crucial role in understanding team tactics and improving performance. However, current methods for capturing and analyzing this data are often inaccessible due to high costs and reliance on proprietary datasets. This thesis presents the development of an automated system designed to detect passes in football matches using video as the source of information. The system integrates computer vision and machine learning techniques across multiple modules, including player and ball detection, object tracking, team identification, and pass detection. Using a hybrid approach with YOLOv9 for player detection, Faster R-CNN for the ball, and Norfair for tracking, the system assigns unique identifiers to players and determines passes based on proximity and ball possession changes. Team identification is achieved through color histogram analysis, allowing the system to distinguish valid passes between players of the same team. The modular design enables independent improvements in each component, providing a flexible framework that can be adapted to different match conditions. This work represents a step forward in automating football pass detection, contributing to the growing field of sports analysis with a scalable and efficient solution.
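The abstract describes pass detection as a rule over proximity and possession changes between teammates. The following is a minimal sketch of one way such a rule could look, assuming per-frame player positions, team labels, and ball coordinates are already available from upstream detection and tracking. The names `Player`, `closest_player`, `detect_passes`, and the `max_dist` threshold are hypothetical and are not taken from PassID.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Player:
    track_id: int   # identifier assigned by the tracker (assumed given)
    team: str       # team label, e.g. from color histogram analysis
    x: float
    y: float


Ball = Optional[Tuple[float, float]]
Frame = Tuple[List[Player], Ball]


def closest_player(players: List[Player], ball: Tuple[float, float],
                   max_dist: float) -> Optional[Player]:
    """Return the player nearest the ball, or None if nobody is within max_dist."""
    best, best_d = None, max_dist
    for p in players:
        d = hypot(p.x - ball[0], p.y - ball[1])
        if d <= best_d:
            best, best_d = p, d
    return best


def detect_passes(frames: List[Frame], max_dist: float = 30.0) -> List[Tuple[int, int]]:
    """Emit (passer_id, receiver_id) when possession moves between teammates."""
    passes: List[Tuple[int, int]] = []
    holder: Optional[Player] = None
    for players, ball in frames:
        if ball is None:            # ball not detected in this frame
            continue
        current = closest_player(players, ball, max_dist)
        if current is None:         # ball in flight, nobody in possession
            continue
        if (holder is not None and current.track_id != holder.track_id
                and current.team == holder.team):
            passes.append((holder.track_id, current.track_id))
        holder = current            # possession changes even on interceptions
    return passes
```

In this sketch a change of possession to an opponent updates the holder but is not counted as a pass, which mirrors the abstract's notion of valid passes between players of the same team.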
  • Master's thesis
    Evaluating Pre-trained Neural Networks in Deep Learning for Early Detection and Enhanced Screening of Cervical Pathology
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-12) González Ortiz, Orlando; Muñoz Ubando, Luis Alberto; emimmayorquin; Raymundo Avilés, Arturo; Cerón López Universidad, Arturo Eduardo; School of Engineering and Sciences; Campus Monterrey; Ochoa Ruiz, Gilberto
    This document presents a research thesis for the Master in Computer Science (MCC) degree at Tecnológico de Monterrey. Cervical cancer remains a leading cause of mortality among women, particularly in low-resource regions where screening tools such as the Pap smear often fall short in early detection. This research explores the application of deep learning and pre-trained neural networks for the binary classification of cervical pathology, focusing on detecting dysplasia, specifically CIN2 and CIN3, as a potential prevention tool. We implemented multiple neural network models, including DenseNet, EfficientNet, MobileNet, and ResNet. The models were evaluated on two distinct datasets: one from the International Agency for Research on Cancer (IARC) and another from the Zambrano Hospital. To assess the generalization capacity of these models, we employed a sequential training approach where the first batch was trained with IARC data and tested on a Zambrano Hospital batch, with subsequent tests progressively incorporating prior results. Each experiment was repeated over 10 iterations to calculate confidence intervals for the performance metrics. Our results demonstrate that DenseNet and EfficientNet outperformed other models, achieving superior sensitivity and accuracy compared to conventional Pap smear tests. These findings indicate that deep learning models hold promise as an affordable, effective cervical cancer screening tool in low-resource communities. Future work will focus on augmenting datasets through collaboration with healthcare institutions and exploring generative models such as GANs to improve model robustness and generalization.
  • Master's thesis
    Maturity recognition and fruit counting for sweet peppers in greenhouses using deep learning neural networks
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-01-05) Viveros Escamilla, Luis David; Gómez Espinosa, Alfonso; mtyahinojosa, emipsanchez; Cantoral Ceballos, José Antonio; Escuela de Ingenieria y Ciencias; Campus Querétaro; Escobedo Cabello, Jesús Arturo
    This study presents an approach to address the challenges involved in recognizing the maturity stage and counting sweet peppers of varying colors (green, yellow, orange, and red) within greenhouse environments. The methodology leverages the YOLOv5 model for real-time object detection, classification, and localization, coupled with the DeepSORT algorithm for efficient tracking. The system was successfully implemented to monitor sweet pepper production, and some challenges related to this environment, namely occlusions and the presence of leaves and branches, were effectively overcome. The algorithm was evaluated using real-world data collected in a sweet pepper greenhouse. A dataset comprising 1863 images was meticulously compiled to enhance the study, incorporating diverse sweet pepper varieties and maturity levels. Additionally, the study emphasized the role of confidence levels in object recognition, achieving a confidence level of 0.973. Furthermore, the DeepSORT algorithm was successfully applied for counting sweet peppers, demonstrating an accuracy level of 85.7% in two simulated environments under challenging conditions, such as varied lighting and inaccuracies in maturity level assessment.
  • Master's thesis
    Sign language recognition with tree structure skeleton images and densely connected convolutional neural networks
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023-05) Laines Vázquez, David Alberto; González Mendoza, Miguel; puemcuervo; Sánchez Ante, Gildardo; Cantoral Ceballos, José Antonio; Méndez Vázquez, Andrés; School of Engineering and Sciences; Campus Monterrey; Ochoa Ruiz, Gilberto
    This thesis presents a novel approach to Isolated Sign Language Recognition (ISLR) using skeleton modality data and deep learning. The study proposes a method that employs an image-based spatio-temporal skeleton representation for sign gestures and a convolutional neural network (CNN) for classification. The advantages of the skeleton modality over RGB, such as reduced noise and smaller parameter requirements for processing, are taken into account. The aim is to achieve competitive performance with a low number of parameters compared to the existing state-of-the-art in ISLR. Informed by the literature on skeleton-based human action recognition (HAR), this research adapts the Tree Structure Skeleton Image (TSSI) method to represent a sign gesture as an image. The process involves first extracting the skeleton sequences from sign videos using the MediaPipe framework, which offers fast inference performance across multiple devices. The TSSI representation is then processed using a DenseNet, chosen for its efficiency and fewer parameters. The proposed method, called SL-TSSI-DenseNet, is trained and evaluated on two challenging datasets: the Word-Level American Sign Language (WLASL) dataset and the Ankara University Turkish Sign Language (AUTSL) dataset. Specifically, the WLASL-100 subset of the WLASL dataset and the RGB Track of the AUTSL dataset are selected for the experiments. The results demonstrate that SL-TSSI-DenseNet outperforms other skeleton-based and RGB-based models benchmarked on the WLASL-100 dataset, achieving an accuracy of 81.47% through the use of data augmentation and pre-training. On the AUTSL dataset, it achieves competitive performance with an accuracy of 93.13% without pre-training and data augmentation. Additionally, an augmentation ablation study is conducted to identify the most effective data augmentation technique for the model’s performance on the WLASL-100 dataset.
    Furthermore, the study provides insights into the effectiveness of various data augmentation techniques.
  • Master's thesis
    Lights, camera, and domain shift: using superpixels for domain generalization in image segmentation for multimodal endoscopies
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023-05) Martínez García Peña, Rafael; Ochoa Ruiz, Gilberto; puemcuervo, emipsanchez; Falcón Morales, Luis Eduardo; Gónzales Mendoza, Miguel; School of Engineering and Sciences; Campus Monterrey; Ali, Sharib
    Deep learning models have made great advancements in image processing. Their ability to identify key parts of images and provide fast and accurate segmentation has been proven and used in many fields, such as city navigation and object recognition. However, there is one field that both needs the extra information that computers can provide and has proven elusive for the goals of robustness and accuracy: medicine. In the medical field, limited data and the variation introduced by factors such as differences in instrumentation pose a grave threat to model accuracy known as domain shift. Domain shift occurs when we train with data whose characteristics are not wholly representative of the entire set of data a task encompasses. When it is present, models that have no tools to deal with it can suffer a degradation in accuracy severe enough to turn them from usable to useless. To better explore this topic, we discuss two techniques: domain adaptation, where we find how to make a model better at predicting for a specific domain of data inside a task, and domain generalization, where we find how to make a model better at predicting data for any domain inside a task. In addition, we discuss several image segmentation models that have shown good results for medical tasks: U-Net, Attention U-Net, DeepLab, Efficient U-Net, and EndoUDA. Following this exploration, we propose a solution model based on a domain generalization technique: patch-based consistency. We use a superpixel generator known as SLIC (Simple Linear Iterative Clustering) to provide low-level, domain-agnostic information to different models in order to encourage our networks to learn more global features.
    This framework, which we refer to as SUPRA (SUPeRpixel Augmented), is used in tandem with U-Net, Attention U-Net, and Efficient U-Net in an effort to improve results in endoscopies where light modalities are switched, something commonly seen in lesion detection tasks (particularly in Barrett's Esophagus and polyp detection). We find that the best of these models, SUPRA-UNet, shows significant qualities that make it a better choice than unaugmented networks for lesion detection: it not only provides less noisy and smoother predictions, but also outperforms the best unaugmented result (U-Net) by over 20% IoU in a target domain that presents significant lighting differences from the training set.
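The IoU figure cited above is the intersection-over-union between a predicted and a reference segmentation mask. The sketch below shows the metric for binary masks stored as nested lists. The function name `iou`, the tiny example masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details from the thesis.

```python
from typing import Sequence


def iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two equally sized binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1   # pixel labeled lesion in both masks
            if p or t:
                union += 1          # pixel labeled lesion in at least one mask
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 2x3 masks, used only to exercise the function.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
reference_mask = [[1, 0, 0],
                  [0, 1, 1]]
overlap = iou(predicted_mask, reference_mask)
```

Here two pixels are shared and four are marked in at least one mask, so the overlap is 2 / 4.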
  • Master's thesis
    Advanced deep learning approaches for maritime trajectory prediction leveraging automatic identification system data
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023) Familsamavati, Sajad; Hajiaghaei Keshteli, Mostafa; emiggomez, emimmayorquin; Guadalupe Villarreal Marroquín, María; School of Engineering and Sciences; Campus Monterrey; Smith Cornejo, Neale Ricardo
    This study investigates the efficacy of advanced deep learning (DL) models, specifically Bi-GRU, LSTM, and Bi-LSTM, for predicting maritime vessel trajectories using Automatic Identification System (AIS) data. The study focuses on a comparative analysis of prediction accuracy in high-traffic maritime environments, particularly the Port of Manzanillo. Comprehensive AIS data preprocessing, feature engineering, and normalization were conducted to prepare the data for model training. The Bi-GRU model emerged as the most effective, demonstrating superior performance with the lowest test loss, MAE, and MSE, highlighting its capability in capturing sequential dependencies in vessel trajectories. The research contributes significantly to maritime traffic management by offering a predictive framework that enhances safety and efficiency in dynamic maritime operations. Future research directions include integrating additional data sources and extending model applications across various maritime regions.
  • Master's thesis
    Component Detection based on Mask R-CNN
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023) Charles Garza, Daniel; Morales, Rubén; emimmayorquin; Vallejo Guevara, Antonio; Guedea Elizalde, Federico; Escuela de Ingeniería y Ciencias; Campus Monterrey
    This thesis delves into the evolution and utilization of deep learning methodologies in the specific context of object detection and segmentation within the manufacturing industry. It thoroughly examines several state-of-the-art object detection techniques, including YOLO, R-CNN, and Fast R-CNN. These methods are explored in detail, assessing their effectiveness and applicability in complex object identification and classification tasks. The study then focuses on Mask R-CNN, a method chosen for its outstanding performance in object segmentation and identification, especially in the cluttered and unstructured environments common in manufacturing settings.
Unless otherwise specified, these materials are shared under the following terms: Attribution-NonCommercial-NoDerivatives CC BY-NC-ND http://www.creativecommons.mx/#licencias