Exploring Anchor-Free Object Detection for Surgical Tool Detection in Laparoscopic Videos: A Comparative Study of CenterNet++ and Anchor-Based Models

dc.audience.educationlevel: Otros/Other
dc.contributor.advisor: Ochoa Ruiz, Gilberto
dc.contributor.author: Aparicio Viveros, Carlos Alfredo
dc.contributor.cataloger: emipsanchez
dc.contributor.committeemember: Hinojosa Cervantes, Salvador Miguel
dc.contributor.committeemember: Falcón Morales, Luis Eduardo
dc.contributor.committeemember: González Mendoza, Miguel
dc.contributor.department: School of Engineering and Sciences
dc.contributor.institution: Campus Monterrey
dc.date.accepted: 2024-11-08
dc.date.accessioned: 2024-12-23T07:00:14Z
dc.date.issued: 2024-12
dc.description: https://orcid.org/0000-0002-9896-8727
dc.description.abstract: Minimally Invasive Surgery (MIS) has transformed modern medicine, offering reduced recovery times, minimal scarring, and lower risks of infection. However, MIS procedures also present unique challenges, particularly in visualizing and manipulating surgical tools within a limited field of view. As a solution, this thesis investigates anchor-free deep learning models for real-time surgical tool detection in laparoscopic videos, proposing CenterNet++ as a potential improvement over traditional anchor-based methods. The hypothesis guiding this work is that anchor-free detectors, by avoiding predefined anchor boxes, can more effectively handle the diverse shapes, sizes, and positions of surgical tools. The primary objective of this thesis is to evaluate the performance of CenterNet++ in surgical tool detection compared to popular anchor-based models, specifically Faster R-CNN and YOLOv4, using the m2cai16-tool-locations dataset. CenterNet++ is examined in different configurations, including complete and real-time optimized (Fast-CenterNet++) versions, and tested against Faster R-CNN and YOLOv4 to assess trade-offs in accuracy and efficiency. Experimental results demonstrate that while CenterNet++ achieves high precision, particularly in scenarios requiring meticulous localization, its inference speed is significantly lower than that of YOLOv4, which reached real-time speeds of 128 FPS. CenterNet++'s keypoint refinement mechanism, though beneficial for localization, reduces its computational efficiency, highlighting areas for further optimization. To bridge this gap, several architectural improvements are proposed based on YOLOv4's streamlined design. These include integrating modules such as Spatial Pyramid Pooling (SPP) and Path Aggregation Network (PANet), along with reducing the input resolution in the Fast-CenterNet++ configuration.
Additionally, future work is suggested to explore CenterNet++ on larger, more complex datasets and to develop semi-supervised learning approaches that could mitigate the limitations of annotated surgical datasets. In conclusion, this thesis contributes a comprehensive evaluation of anchor-free models for surgical tool detection, providing a foundation for further advances in real-time, high-precision object detection for surgical assistance. The findings underscore the potential of anchor-free models such as CenterNet++ to meet the evolving demands of MIS, given targeted architectural adaptations.
dc.description.degree: Master of Science in Computer Science
dc.format.medium: Text
dc.identificator: 331499
dc.identifier.citation: Aparicio Viveros, C. A. (2024). Exploring Anchor-Free Object Detection for Surgical Tool Detection in Laparoscopic Videos: A Comparative Study of CenterNet++ and Anchor-Based Models [Master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/702940
dc.identifier.cvu: 1105337
dc.identifier.uri: https://hdl.handle.net/11285/702940
dc.identifier.uri: https://doi.org/10.60473/ritec.17
dc.language.iso: eng
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey
dc.relation.isFormatOf: acceptedVersion
dc.rights: openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::MEDICAL TECHNOLOGY::OTHER
dc.subject.keyword: Object Detection
dc.subject.keyword: Surgical Tool Detection
dc.subject.keyword: Computer Vision
dc.subject.keyword: Laparoscopic Surgery
dc.subject.lcsh: Technology
dc.subject.lcsh: Medicine
dc.title: Exploring Anchor-Free Object Detection for Surgical Tool Detection in Laparoscopic Videos: A Comparative Study of CenterNet++ and Anchor-Based Models
dc.type: Master's Thesis

Files

Original bundle

Name: AparicioViveros_TesisMaestriapdfa.pdf
Size: 7.88 MB
Format: Adobe Portable Document Format
Description: Master's thesis

Name: AparicioViveros_ActagradoDeclaracionAutoriapdfa.pdf
Size: 355.95 KB
Format: Adobe Portable Document Format
Description: Degree certificate and declaration of authorship

Name: ApariciosViveros_CartaAutorizacionpdfa.pdf
Size: 127.78 KB
Format: Adobe Portable Document Format
Description: Authorization letter

License bundle

Name: license.txt
Size: 1.28 KB
Format: Item-specific license agreed upon to submission
Description: logo

