Ciencias Exactas y Ciencias de la Salud
Permanent URI for this collection: https://hdl.handle.net/11285/551014
This collection contains theses and degree works from the doctoral programmes of the Schools of Engineering and Sciences and of Medicine and Health Sciences.
- Generalisable computer vision methods for endoscopic surveillance and surgical interventions (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-12-05) Ali, Mansoor; Ochoa Ruiz, Gilberto; emimmayorquin, emipsanchez; School of Engineering and Sciences; Campus Estado de México; Ali, Sharib
Gastrointestinal (GI) cancers, which mostly originate in the esophagus, stomach, and colon, are among the most prevalent cancers in humans. Endoscopy for the upper GI tract and colonoscopy for the lower GI tract are considered the gold-standard techniques for screening and for removing precancerous lesions and abnormal tissue growths such as polyps with high sensitivity. Even so, prior research has reported high polyp miss rates owing to the peculiar morphology of polyps and their variability in shape, size, and appearance. Endoscopic surgical interventions also offer a minimally invasive approach for removing lesions or treating other diseases of the abdominal and reproductive organs. Although minimally invasive surgery is patient-friendly, reducing trauma and hospitalisation time and speeding post-operative recovery, it can become complicated because of the increased cognitive burden on clinicians and their reduced field of view. Computer-assisted detection (CADe), diagnosis (CADx), and interventions (CAI) have shown promise in supporting clinicians in both disease diagnosis and treatment, with considerable potential for further improvement as endoscopes make data more readily available. Deep learning is increasingly being leveraged to develop methods that improve precancerous lesion detection and diagnosis, reduce miss rates, and provide intraoperative assistance to surgeons for better decision-making.
However, current methods suffer from the domain shift problem: they work well on data from the distribution they were trained on but perform poorly on out-of-distribution data, and therefore lack real-world deployment capability. This thesis explores the impact of domain shift in endoscopic data on current state-of-the-art methods, investigates the research gaps, and proposes methods for improved disease detection, surveillance, and surgical interventions with better generalisation capability. Specifically, we aim to use the feature space of the encoder networks of state-of-the-art segmentation methods to learn discriminant information for better domain-invariant learning and improved model generalisation on unseen out-of-distribution endoscopic datasets. We propose various methods for polyp segmentation in upper and lower GI tract data, full-scene segmentation in laparoscopic surgery, and depth estimation in abdominal surgery. We also introduce an annotated multicentre segmentation dataset for evaluating model generalisability and encouraging further research. Our results indicate improved out-of-distribution performance on multi-domain and cross-centre endoscopic data. In future work, we will extend the dataset to increase its size and variability and explore new methods to increase robustness and generalisation performance.
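As a rough illustration of the domain shift problem described in this abstract, the sketch below measures how much a segmentation model's score drops between in-distribution and out-of-distribution data. The abstract does not say which metric the thesis reports; the Dice coefficient is assumed here only because it is a common choice for segmentation, and the function names `dice_score` and `generalisation_gap` are hypothetical.

```python
import numpy as np


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks (close to 1.0 for perfect overlap)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def generalisation_gap(in_dist_scores, out_dist_scores):
    """Mean in-distribution score minus mean out-of-distribution score.

    A large positive gap suggests the model is sensitive to domain shift.
    """
    return float(np.mean(in_dist_scores) - np.mean(out_dist_scores))
```

A smaller gap on unseen centres or modalities would be one simple indication of better generalisation under these assumptions.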
- Enhanced medical image explainability through prototypical-parts learning (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-11-01) Flores Araiza, Daniel; Ochoa Ruiz, Gilberto; emimmayorquin, emipsanchez; Méndez Vázquez, Andrés; Miquel Mata, Christian; Hinojosa Cervantes, Salvador Miguel; Computer Sciences; Campus Estado de México; González Mendoza, Miguel
This work focuses mainly on the case study of kidney stone identification. The current standard identification process involves extracting the stone, visually inspecting its surface and cross-section, and performing a Fourier-transform infrared spectroscopy (FTIR) analysis. This process, known as Morpho-Constitutional Analysis (MCA), can currently take a couple of weeks or even a month. Because identification is essential for prescribing a treatment, and in some cases new kidney stones can form in only a couple of weeks, a faster and reliable method of identifying kidney stones is needed. We want to obtain recommended classifications of endoscopic images of kidney stones that are easy to analyze. These classifications should be accompanied by pertinent information about what caused them, so that specialists can confirm whether the kidney stones were classified correctly. The goal pursued is therefore a Computer Assisted Diagnosis (CADx) tool for kidney stone classification. Recently, the Deep Learning (DL) field has shown good results in many different areas, but at the expense of relying on models with millions or billions of parameters, which impedes human interpretation of the behavior of these models. In the literature, this limitation is frequently referred to as the black-box nature of DL models. Because mitigating this limitation is highly important for the adoption of DL models, this work explores self-explainable methods.
In this thesis, convolutional neural networks are leveraged as feature extractors for images. Prototypical parts learned by a DL model are compared against the extracted features, and the resulting level of similarity determines the presence or absence of the characteristic parts that compose certain image classes. This approach is considered self-explainable because the model identifies the parts of an image that it regards as evidence for the type of entity being classified and, based on these detections, generates visualizations of the cases it deems most similar before producing the final classification of the input image. This behavior allows us to visualize and corroborate whether the identified parts are indeed relevant to and indicative of the class the DL model assigns to the input image. Additionally, the levels of similarity and their example cases are evaluated to determine the main visual characteristics behind the models' activations and their possible causal relationship to the model's final output. The base code for the experiments explored in this dissertation is available at: https://github.com/DanielF29/PPs_ICNN_Loss
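The prototype-matching step described in this abstract can be sketched as follows. This is a minimal NumPy illustration, not the code from the linked repository: it assumes a ProtoPNet-style log-ratio similarity over squared L2 distances, and the function name `prototype_similarities`, the array shapes, and the `eps` value are all hypothetical choices made for the example.

```python
import numpy as np


def prototype_similarities(feature_map, prototypes, eps=1e-4):
    """Score how strongly each prototype is present in a CNN feature map.

    feature_map: array of shape (H, W, D), one D-dimensional vector per patch.
    prototypes:  array of shape (P, D), one learned prototypical part per row.
    Returns (best_sim, locations): the highest similarity per prototype and
    the (row, col) patch where it occurs, which is what gets visualised.
    """
    h, w, d = feature_map.shape
    patches = feature_map.reshape(h * w, d)
    # Squared L2 distance between every patch and every prototype: (H*W, P).
    diffs = patches[:, None, :] - prototypes[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Assumed similarity: large when the distance is small, near 0 when far.
    sims = np.log((sq_dists + 1.0) / (sq_dists + eps))
    # Max over spatial positions keeps the best-matching patch per prototype.
    best_idx = np.argmax(sims, axis=0)
    best_sim = sims[best_idx, np.arange(prototypes.shape[0])]
    locations = np.stack(np.unravel_index(best_idx, (h, w)), axis=1)
    return best_sim, locations
```

In a full model of this kind, the per-prototype scores would typically feed a final linear layer that produces the class prediction, while the returned locations indicate which image regions to highlight for the specialist.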

