Generalisable computer vision methods for endoscopic surveillance and surgical interventions
Abstract
Gastrointestinal (GI) cancers, most of which originate in the esophagus, stomach, and colon, are among the most prevalent cancers in humans. Endoscopy for the upper GI tract and colonoscopy for the lower GI tract are considered the gold-standard techniques for screening and for removing precancerous lesions and abnormal tissue growths such as polyps with high sensitivity. Prior research has nevertheless shown high polyp miss rates, owing to the polyps' peculiar morphology and their variability in shape, size, and appearance. Endoscopic surgical interventions also offer a minimally invasive approach for lesion removal and for the treatment of other diseases of the abdominal and reproductive organs. Although minimally invasive surgery is patient-friendly, reducing trauma, hospitalisation time, and post-operative recovery time, it can become complicated because of the increased cognitive burden and reduced field of view for clinicians. Computer-assisted detection (CADe), diagnosis (CADx), and interventions (CAI) have shown promise in supporting clinicians in both disease diagnosis and treatment, with considerable potential for further improvement as endoscopic data become more readily available. Deep learning is increasingly being leveraged to develop methods that improve precancerous lesion detection and diagnosis, reduce miss rates, and provide intraoperative assistance to surgeons for better decision-making. However, current methods suffer from the domain shift problem: they perform well on data drawn from the same distribution as their training data but poorly on out-of-distribution data, and therefore lack real-world deployment capability. This thesis explores the impact of domain shift in endoscopic data on current state-of-the-art methods, investigates the research gaps, and proposes methods for improved disease detection, surveillance, and surgical interventions with better generalisation capability.
Specifically, we aim to use the feature space of the encoder networks of state-of-the-art segmentation methods to learn discriminant information for better domain-invariant learning, improving model generalisation on unseen out-of-distribution endoscopic datasets. We propose several methods for polyp segmentation in upper and lower GI tract data, full-scene segmentation in laparoscopic surgery, and depth estimation in abdominal surgery. We also introduce an annotated multicentre segmentation dataset for evaluating model generalisability and encouraging further research. Our results indicate improved out-of-distribution performance on multi-domain and cross-centre endoscopic data. In future work, we will extend the dataset to increase its size and variability and explore new methods to increase robustness and generalisation performance.
Description
https://orcid.org/0000-0002-9896-8727