ANOSCAR: An image captioning model and dataset designed from OSCAR and the video dataset of activitynet

dc.audience.educationlevel | Investigadores/Researchers | es_MX
dc.contributor.advisor | González Mendoza, Miguel
dc.contributor.author | Byrd Suárez, Emmanuel
dc.contributor.cataloger | puemcuervo | es_MX
dc.contributor.committeemember | Ochoa Ruiz, Gilberto
dc.contributor.committeemember | Marín Hernandez, Antonio
dc.contributor.department | School of Engineering and Sciences | es_MX
dc.contributor.institution | Campus Estado de México | es_MX
dc.contributor.mentor | Chang Fernández, Leonardo
dc.creator | GONZALEZ MENDOZA, MIGUEL; 123361
dc.date.accepted | 2021-07-01
dc.date.accessioned | 2023-04-26T17:36:14Z
dc.date.available | 2023-04-26T17:36:14Z
dc.date.created | 2021-05-16
dc.date.issued | 2021-07-01
dc.description | https://orcid.org/0000-0001-6451-9109 | es_MX
dc.description.abstract | Activity Recognition and Classification in video sequences is an area of research that has recently received attention. However, video processing is computationally expensive, and its advances have not been as remarkable as those of Image Captioning. This work operates in a computationally limited environment and learns an Image Captioning transformation of the ActivityNet-Captions Video Dataset that can be used for either Video Captioning or Video Storytelling. Different Data Augmentation techniques for Natural Language Processing are explored and applied to the generated dataset in an effort to increase its validation scores. Our proposal includes an Image Captioning dataset obtained from ActivityNet, with features generated by Bottom-Up attention, and a model derived from OSCAR to predict its captions. Our captioning scores are slightly better than those of S2VT while using a much simpler pipeline, which provides a starting point for future research with our approach. Finally, we propose several lines of research along which this work can be expanded and improved. | es_MX
dc.description.degree | Master of Science in Computer Science | es_MX
dc.format.medium | Text | es_MX
dc.identificator | 7||33||3304||120323 | es_MX
dc.identifier.citation | Byrd Suárez, E. (2021). ANOSCAR: An image captioning model and dataset designed from OSCAR and the video dataset of activitynet [Unpublished master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. | es_MX
dc.identifier.orcid | https://orcid.org/0000-0002-9614-8944 | es_MX
dc.identifier.uri | https://hdl.handle.net/11285/650436
dc.language.iso | eng | es_MX
dc.publisher | Instituto Tecnológico y de Estudios Superiores de Monterrey | es_MX
dc.relation | CONACYT | es_MX
dc.relation.isFormatOf | draft | es_MX
dc.relation.isreferencedby | REPOSITORIO NACIONAL CONACYT
dc.rights | openAccess | es_MX
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0 | es_MX
dc.subject.classification | ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::PROGRAMMING LANGUAGES | es_MX
dc.subject.keyword | Video captioning | es_MX
dc.subject.keyword | Image captioning | es_MX
dc.subject.keyword | Activity recognition | es_MX
dc.subject.keyword | Computer science | es_MX
dc.subject.keyword | Deep learning | es_MX
dc.subject.keyword | Computer vision | es_MX
dc.subject.keyword | Artificial neural networks | es_MX
dc.subject.keyword | Soft computing | es_MX
dc.subject.lcsh | Science | es_MX
dc.title | ANOSCAR: An image captioning model and dataset designed from OSCAR and the video dataset of activitynet | es_MX
dc.type | Master's thesis

Files

Original bundle

Name: Tesis Emmanuel_Byrd_MCC_firmada_final.pdf
Size: 6.72 MB
Format: Adobe Portable Document Format

Name: byrd_poster_62.pdf
Size: 886.81 KB
Format: Adobe Portable Document Format
Description: Conference poster

Name: declaracion de acuerdo para uso de obra tesis Emmanuel 2021.png
Size: 3.85 MB
Format: Portable Network Graphics
Description: Usage agreement declaration

Name: Firmas Emmanuel_Byrd_MCC.pdf
Size: 223.4 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.3 KB
Format: Item-specific license agreed upon to submission

