Tamez Peña, Jose Gerardo (advisor)
Hussain, Sadam (author)
2025-07-22 (date available); 2025-06 (date issued)
Hussain, Sadam (2025). Deep learning and natural language processing for computer aided diagnosis [Doctoral thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/703887

Abstract:
Multimodal artificial intelligence (AI) is a cutting-edge technique that integrates diverse modalities, such as imaging and textual data, to enhance classification and regression tasks. This dissertation focuses on the integration, comparison, and evaluation of multimodal AI for breast cancer diagnosis and prognosis. To achieve these objectives, we curated a comprehensive multimodal dataset comprising digital mammograms and corresponding radiological reports. Leveraging this dataset, we introduced and assessed various state-of-the-art (SOTA) multimodal techniques for three key tasks: breast cancer classification, reduction of false-positive biopsies with explainable AI (XAI), and short-term (5-year) risk prediction of breast cancer.

In this work, we also introduced a benchmark dataset of radiological reports from breast cancer patients and provided baseline performance evaluations using SOTA machine learning (ML), deep learning (DL), and large language models (LLMs) for BI-RADS category classification. Our approach evaluated the performance of diverse SOTA multimodal architectures, including ResNet, VGG, EfficientNet, MobileNet, and Vision Transformers (ViT).
For textual data processing, we employed both general-purpose and domain-specific pretrained LLMs such as BERT, BioGPT, ClinicalBERT, and DeBERTa, which were also integrated into multimodal architectures for enhanced classification. Notably, our proposed multiview multimodal feature fusion (MMFF) architecture, combining SE-ResNet50 with an artificial neural network (ANN), achieved an AUC of 0.965 for breast cancer classification, significantly outperforming both single-modal and multimodal SOTA architectures. For reducing unnecessary breast biopsies, our multimodal approach achieved an AUC of 0.72, showcasing its clinical utility in minimizing patient burden. Moreover, our ViT- and BioGPT-based multimodal architecture achieved an AUC of 0.77 for short-term risk prediction, outperforming the SOTA MIRAI model, which achieved an AUC of 0.59 on our in-house dataset. This work highlights the potential of multimodal AI in advancing breast cancer diagnosis and prognosis, demonstrating its superiority over traditional and unimodal approaches across multiple critical tasks.

Type: Text
Language: eng
Access: openAccess
License: http://creativecommons.org/licenses/by-nc-sa/4.0
Subjects:
- ENGINEERING AND TECHNOLOGY :: TECHNOLOGICAL SCIENCES :: COMPUTER TECHNOLOGY :: ARTIFICIAL INTELLIGENCE
- ENGINEERING AND TECHNOLOGY :: TECHNOLOGICAL SCIENCES :: COMPUTER TECHNOLOGY :: MEDICAL CONTROL SYSTEMS
- PHYSICAL-MATHEMATICAL AND EARTH SCIENCES :: MATHEMATICS :: COMPUTER SCIENCE :: DATA TRANSMISSION DEVICES
- MEDICINE AND HEALTH SCIENCES :: MEDICAL SCIENCES :: CLINICAL SCIENCES :: RADIOLOGY
- ENGINEERING AND TECHNOLOGY :: TECHNOLOGICAL SCIENCES :: ELECTRONIC TECHNOLOGY :: X-RAYS
- Technology
- Science
Title: Deep learning and natural language processing for computer aided diagnosis
Document type: Doctoral thesis
Embargo note: An embargo is requested because some chapters have not yet been published and are still in process.
Keywords: Multimodal Learning; Breast Cancer Classification; Computer Aided Diagnosis; Medical Image Analysis; BI-RADS Classification; Radiology Report Analysis
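The feature-level fusion idea behind the MMFF architecture described in the abstract (concatenating learned image and text embeddings, then classifying with an ANN head) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the pretrained encoders are replaced by random vectors, and the layer sizes (2048-d image features, 768-d text features, 256-unit hidden layer) are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for pretrained encoders: in the thesis, image
# features would come from an SE-ResNet50 over mammogram views and text
# features from an LLM over the radiology report. Here we use random
# vectors of assumed, plausible dimensionality.
image_features = rng.standard_normal(2048)  # mammogram-view embedding
text_features = rng.standard_normal(768)    # radiology-report embedding

def relu(x):
    # Rectified linear activation for the hidden layer
    return np.maximum(x, 0.0)

def sigmoid(x):
    # Squashes the final score into a (0, 1) probability
    return 1.0 / (1.0 + np.exp(-x))

# Late (feature-level) fusion: concatenate modality embeddings,
# then classify with a small fully connected ANN head.
fused = np.concatenate([image_features, text_features])  # shape (2816,)

# Randomly initialized ANN weights (untrained; illustrative only)
W1 = rng.standard_normal((256, fused.size)) * 0.01
b1 = np.zeros(256)
W2 = rng.standard_normal(256) * 0.01
b2 = 0.0

hidden = relu(W1 @ fused + b1)              # shape (256,)
prob_malignant = sigmoid(W2 @ hidden + b2)  # scalar in (0, 1)
print(float(prob_malignant))
```

In a trained system, the weights would be learned jointly or on top of frozen encoders; the sketch only shows how the two modalities are combined before the final classification layer.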