Master's thesis

Un modelo SVM de clasificación de spam con Oracle Text para foros de publicaciones

Abstract

Spam, meaning unsolicited and often irrelevant or inappropriate messages, poses substantial challenges for technical forums. It clutters discussions and complicates information retrieval, which has led many technical forums to adopt stringent countermeasures. Because spammers constantly adapt their content and methods, there is a lack of adaptable spam databases suitable for automated classifiers. To bridge this gap, we built a spam database labeled by content moderation experts, with posts categorized as spam or regular. This manual labeling process identified a total of 1,916 posts as spam. To provide a spam classification solution that integrates into Oracle Database, SQL, and Oracle APEX applications without relying on external solutions, we developed a machine learning classifier based on Oracle Text SVM. The classifier achieved an average accuracy of 90 percent during the validation phase. Further experiments indicated the potential of Oracle Text SVM classifiers for real-time applications and showed that classification efficiency can be improved by fine-tuning key Oracle Text features. This research contributes to a deeper understanding of spam in technical forums and introduces a practical solution for embedding spam classifiers within widely used Oracle platforms.

Description

https://orcid.org/0000-0002-3509-9667

