Exact Sciences and Health Sciences

Permanent URI for this collection: https://hdl.handle.net/11285/551014

This collection contains doctoral theses and degree projects from the Schools of Engineering and Sciences and of Medicine and Health Sciences.

  • Doctoral thesis
    Analysis and use of textual definitions through a transformer neural network model and natural language processing
    (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-12-02) Baltazar Reyes, Germán Eduardo; 852898; Ponce Cruz, Pedro; puemcuervo; McDaniel, Troy; Balderas Silva, David Christopher; Rojas Hernández, Mario; School of Engineering and Sciences; Campus Ciudad de México; López Caudana, Edgar Omar
    There is currently an information overload problem: data is excessive, disorganized, and presented statically. These three problems are closely tied to the vocabulary used in each document, since a document's usefulness depends directly on how much of its vocabulary the reader understands. At the same time, many Machine Learning algorithms and applications analyze the structure of written information. However, most implementations focus on the bigger picture of text analysis, namely understanding the structure and use of complete sentences and generating new documents as long as the originals. This problem directly affects the static presentation of data. For these reasons, this proposal aims to evaluate the semantic similarity between a complete phrase or sentence and a single keyword, following the structure of a regular dictionary, where a descriptive sentence explains and shares the exact meaning of a single word. The model uses a GPT-2 Transformer neural network to interpret a descriptive input phrase and generate a new phrase intended to express the same abstract concept and to be similar to a particular keyword. The generated text is validated by a Universal Sentence Encoder network, which was fine-tuned to relate the semantic similarity between the words of a sentence taken as a whole and its corresponding keyword. The results showed that the proposal could generate new phrases that resemble the general context of both the descriptive input sentence and the ground-truth keyword. The validation step was also able to assign a higher similarity score to these phrase-word pairs. Nevertheless, the process also showed that deeper analysis is still needed to weigh and separate the context of different pairs of textual inputs.
In general, this proposal opens a new area of study for analyzing the abstract relationship of meaning between sentences and individual words, and how an ordered sequence of words can be detected as similar to a single term. This marks a different direction of text analysis from the one currently proposed and researched in most of the Natural Language Processing community.
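As a rough illustration of the validation step described in the abstract, the sketch below scores a phrase embedding against candidate keyword embeddings and picks the closest keyword. It does not load the fine-tuned Universal Sentence Encoder used in the thesis. The function names and the three-dimensional vectors are hypothetical placeholders, and cosine similarity is an assumption, since the abstract does not name the similarity measure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def best_keyword(phrase_vec, keyword_vecs):
    """Return the (keyword, score) pair most similar to the phrase vector."""
    scores = {kw: cosine_similarity(phrase_vec, vec)
              for kw, vec in keyword_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hand-written placeholder vectors, NOT outputs of any real encoder.
phrase_vec = [0.9, 0.1, 0.3]  # stands in for an embedded descriptive sentence
keyword_vecs = {
    "river": [0.8, 0.2, 0.4],   # hypothetical keyword embedding
    "engine": [0.1, 0.9, 0.2],  # hypothetical keyword embedding
}

keyword, score = best_keyword(phrase_vec, keyword_vecs)
print(keyword, round(score, 3))
```

In a real setup the placeholder vectors would be replaced by embeddings of the generated phrase and of each candidate keyword produced by the same sentence encoder.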
Unless otherwise specified, these materials are shared under the following terms: Attribution-NonCommercial-NoDerivatives CC BY-NC-ND http://www.creativecommons.mx/#licencias

The user is obliged to use the services and content provided by the University, in particular printed materials and electronic resources, in accordance with current legislation, the principles of good faith, and generally accepted practice, without contravening public order in doing so, especially where, for the proper performance of their activity, they need to reproduce, distribute, communicate and/or make available fragments of printed works or of works that may exist in analog or digital format, whether on paper or electronic media. Law 23/2006, of July 7, amending the revised text of the Intellectual Property Law, approved

DSpace software copyright © 2002-2025
