Exact Sciences and Health Sciences
Permanent URI for this collection: https://hdl.handle.net/11285/551039
This collection contains Master's theses and degree projects from the Schools of Engineering and Sciences and of Medicine and Health Sciences.
Search Results
- Contextual information for Person Re-Identification on outdoor environments. (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-06) Garnica López, Luis Alberto; Chang Fernández, Leonardo; Pérez Suárez, Airel; Gutiérrez Rodríguez, Andrés Eduardo; González Mendoza, Miguel; School of Engineering and Sciences; Campus Monterrey
Person Re-Identification (ReID) is obtaining good results and is getting ever closer to being ready for deployment in production scenarios. However, there is still room for improvement: when the task is carried out in outdoor environments, its performance is typically degraded by illumination changes or by natural elements, such as fog or dust, that distort the images. In this work, we introduce a novel proposal for including contextual information in a ReID re-ranking approach, to help improve the effectiveness of this task in surveillance systems. Most previous research in this field uses only the visual data contained in the images processed by ReID. Even the approaches that use some form of context normally rely on context annotated within the scope of the image itself, or on the relationships between the different images in which the identities appear. We argue that these scenarios offer a great deal of contextual information that is not being exploited and that could help reduce the impact of such conditions on performance. In the present document, we perform a complete analysis of the effect of combining this contextual information with the embeddings normally produced by several ReID models, processing it through an architecture inspired by siamese neural networks but trained with a triplet loss. The network was trained on a novel dataset developed specifically for this task and annotated with this extra information.
The dataset is composed of 34,156 images of 501 labeled identities captured by 3 different cameras. Along with this data, each image includes 12 extra features with its specific contextual information. The images were first processed with three different ReID models, Triplet Network (TriNet), Multiple Granularity Network (MGN), and Multi-Level Factorization Net (MLFN), to ensure that the results obtained when the contextual information is included are independent of the base ReID approach. Each model produces 2048-dimensional embeddings. All of our proposed experiments improved on the original mAP of these three networks on our dataset, going from 86.53 to 94.9, from 84.94 to 93.11, and from 95.35 to 95.93, respectively.
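The combination of contextual features with visual embeddings under a triplet loss, as described in the abstract above, can be sketched as follows. This is a minimal toy illustration, not the thesis's actual architecture: the tiny vectors, the concatenation-based fusion, and the margin value of 0.3 are all assumptions (real embeddings would be 2048-dimensional with 12 contextual features).

```python
import math

def combine(visual_emb, context_feats):
    # Hypothetical fusion step: concatenate the ReID embedding with its
    # contextual feature vector before computing distances.
    return visual_emb + context_feats

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Standard triplet loss: the anchor-positive distance should be
    # smaller than the anchor-negative distance by at least `margin`.
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy vectors standing in for 2048-d embeddings and 12 context features.
anchor   = combine([0.1, 0.9, 0.3], [1.0, 0.0])
positive = combine([0.2, 0.8, 0.3], [1.0, 0.0])
negative = combine([0.9, 0.1, 0.7], [0.0, 1.0])
loss = triplet_loss(anchor, positive, negative)
# loss == 0.0 here: the negative is already farther away than the margin.
```

During training, gradients of this loss would pull same-identity pairs together and push different identities apart in the fused embedding space.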
- Attention YOLACT++: achieving robust and real-time medical instrument segmentation in endoscopic procedures. (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-04) Ángeles Cerón, Juan Carlos; Chang Fernández, Leonardo; González Mendoza, Miguel; Alí, Sharib; Ochoa Ruiz, Gilberto; School of Engineering and Sciences; Campus Monterrey
Image-based tracking of laparoscopic instruments via instance segmentation plays a fundamental role in computer- and robot-assisted surgeries by aiding surgical navigation and increasing patient safety. Despite its crucial role in minimally invasive surgeries, accurate tracking of surgical instruments is challenging for two main reasons: 1) the complex surgical environment, and 2) the lack of model designs that offer both high accuracy and speed. Previous attempts in the field have prioritized robust performance over real-time speed, making them unfeasible for live clinical applications. In this thesis, we propose the use of attention mechanisms to significantly improve the recognition capabilities of YOLACT++, a lightweight single-stage instance segmentation architecture, which we target at medical instrument segmentation. To further improve the model's performance, we also investigate custom data augmentation and anchor optimization via a differential evolution search algorithm. Furthermore, we investigate the effect of multi-scale feature aggregation strategies in the architecture. We perform ablation studies with Convolutional Block Attention and Criss-Cross Attention modules at different stages of the network to determine an optimal configuration.
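The channel-attention idea behind the Convolutional Block Attention Module (CBAM) mentioned above can be sketched in a heavily simplified form: pool each channel with average and max pooling, score the pooled descriptors, and rescale the channel by a sigmoid of that score. This toy omits CBAM's spatial branch and replaces its channel-mixing shared MLP with per-channel scalar weights; the weights and shapes are illustrative assumptions only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feature_map, w_avg, w_max):
    # feature_map: C channels, each an H x W grid (nested lists).
    # w_avg, w_max: per-channel scalar weights standing in for CBAM's
    # shared MLP (the real module mixes information across channels).
    weights = []
    for c, channel in enumerate(feature_map):
        flat = [v for row in channel for v in row]
        avg_pool = sum(flat) / len(flat)
        max_pool = max(flat)
        # CBAM sums the responses of both pooled descriptors,
        # then squashes the result through a sigmoid.
        weights.append(sigmoid(w_avg[c] * avg_pool + w_max[c] * max_pool))
    # Rescale every channel by its attention weight.
    refined = [[[v * weights[c] for v in row] for row in channel]
               for c, channel in enumerate(feature_map)]
    return refined, weights

# Toy 2-channel, 2x2 feature map.
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.5, 0.5], [0.5, 0.5]]]
refined, att = channel_attention(fmap, w_avg=[0.1, 0.1], w_max=[0.2, 0.2])
```

In the thesis's setting, such modules are inserted at different stages of the YOLACT++ backbone so that informative channels (e.g. those responding to instrument edges) are emphasized before mask prediction.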
Our proposed model, CBAM-Full + Aug + Anch, drastically outperforms the previous state of the art in commonly used robustness metrics for medical segmentation, achieving 0.435 MI_DSC and 0.471 MI_NSD while running at 69 fps: more than 12 points more robust in both metrics and 14 times faster than the previous best model. To our knowledge, this is the first work that explicitly focuses on both real-time performance and improved robustness.