Enhancing video-based human action recognition: leveraging knowledge distillation for improved training efficiency and flexibility

Camarena Trinidad, Luis Fernando

Doctoral thesis

Abstract

Artificial Intelligence (AI) is transforming sectors from healthcare and transport to e-commerce and industrial maintenance. A core task in many AI applications is understanding human behavior in videos, which underpins areas such as surveillance, content monitoring, patient care, and gaming. Training a model to recognize human actions is computationally expensive, and modern strategies rely on knowledge transfer to reduce that cost. However, these strategies come with challenges, especially in flexibility and efficiency: existing solutions depend heavily on pretrained model architectures, which can restrict their applicability in diverse scenarios. Our research, titled "Enhancing Video-Based Human Action Recognition: Leveraging Knowledge Distillation for Improved Training Efficiency and Flexibility", proposes a framework that uses knowledge distillation (KD) to guide the training of self-supervised models. The framework improves classification accuracy, accelerates model convergence, and increases model flexibility under both regular and limited-data scenarios. We tested our method on the UCF101 dataset, using balanced subsets ranging from 100% to 2% of the data, and measured performance at different training stages. Our results show that our approach outperforms traditional training methods, maintaining classification accuracy while improving the convergence rate. Beyond training efficiency, our method enables cross-architecture adaptability, allowing model customization for various applications. In data-scarce environments, KD remains robust, which makes it valuable for applications where gathering extensive labeled data is challenging or expensive.
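
For readers unfamiliar with knowledge distillation, the following is a minimal sketch of the widely used soft-target formulation, in which a student is trained to match a teacher's temperature-softened class probabilities alongside the ground-truth label. It is not taken from the thesis: the abstract does not specify the loss used, so the function names (`softmax`, `distillation_loss`) and the `temperature` and `alpha` parameters are assumptions chosen for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic soft-target KD loss (illustrative, not the thesis's method).

    Returns alpha * hard-label cross-entropy
    plus (1 - alpha) * T^2 * KL(teacher || student).
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    student_probs = softmax(student_logits)
    hard = -math.log(student_probs[label])

    # Soft-target term: KL divergence at the shared temperature.
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    soft = sum(ti * math.log(ti / si) for ti, si in zip(t, s) if ti > 0)

    # The T^2 factor keeps the soft term on a scale comparable to the hard term.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

In this sketch, a higher `temperature` spreads the teacher's probability mass across more classes, and `alpha` balances how much the student follows the labels versus the teacher. Both values here are placeholders, not settings reported in the thesis.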

Description

https://orcid.org/0000-0001-6451-9109
