An explainable autoencoder integrating regression and classification trees for anomaly detection

dc.audience.educationlevel: Students
dc.contributor.advisor: Monroy Borja, Raúl
dc.contributor.author: Caballero Dominguez, Zoe
dc.contributor.cataloger: mtyahinojosa, emipsanchez
dc.contributor.committeemember: Graff Guerrero, Mario
dc.contributor.committeemember: García Ceja, Enrique Alejandro
dc.contributor.committeemember: González Mendoza, Miguel
dc.contributor.department: Escuela de Ingeniería y Ciencias
dc.contributor.institution: Campus Estado de México
dc.contributor.mentor: Medina Pérez, Miguel Angel
dc.date.accepted: 2025-09-05
dc.date.accessioned: 2025-12-14T02:24:46Z
dc.date.embargoenddate: 2026-12-30
dc.date.issued: 2025
dc.description: https://orcid.org/0000-0002-3465-995X
dc.description.abstract: Anomaly detection, or outlier detection, is a critical field because anomalies are data points that deviate from normal patterns and often signal critical events, such as fraud, diseases, or cyber-attacks. These applications are high-risk scenarios that involve high-stakes decision-making, so understanding the reasoning behind the machine learning models used in this area has become an essential requirement. Despite its growing importance, explainable outlier detection remains a challenge, because improving model accuracy while maintaining explainability involves a significant trade-off. Furthermore, most anomaly detection models are designed for a single type of data, either numerical or categorical. This is a disadvantage when a dataset's attributes include both data types, as is common in real-world applications, since transforming categorical values into numerical ones, or vice versa, can cause information loss and reduce performance. In this thesis, we address both challenges by proposing a novel explainable semi-supervised anomaly detection model that integrates classification and regression trees into an autoencoder architecture. We name our proposal Explainable Outlier Tree-based Encoder (EOTE). EOTE detects anomalies by reconstructing the input instance from the relationships between attributes learned from normal samples. The harder it is for EOTE to reconstruct an instance correctly, the higher the outlier probability it assigns to that instance. We evaluate EOTE against 12 anomaly detectors and one-class classifiers across 110 datasets containing attributes of a single data type (numerical or nominal) or a mix of both. Our experiments reveal that EOTE is one of the top-performing algorithms at detecting outliers in datasets with only numerical or only nominal attributes, as well as in datasets with mixed attributes.
EOTE thus produces interpretable outputs for its classifications without sacrificing performance. This combination makes EOTE a suitable classifier for anomaly detection in high-risk applications.
dc.description.degree: Master of Science in Computer Science
dc.format.medium: Text
dc.identificator: 120304||120306||120312||120903||331101
dc.identifier.cvu: 1157038
dc.identifier.orcid: https://orcid.org/0009-0003-5472-1516
dc.identifier.uri: https://hdl.handle.net/11285/705236
dc.language.iso: eng
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey
dc.relation.isFormatOf: acceptedVersion
dc.rights: openAccess
dc.rights.embargoreason: By policy, theses in Exact Sciences and Health Sciences are under embargo for 1 year
dc.rights.uri: http://creativecommons.org/licenses/by/4.0
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::ARTIFICIAL INTELLIGENCE
dc.subject.classification: PHYSICAL-MATHEMATICAL AND EARTH SCIENCES::MATHEMATICS::STATISTICS::DATA ANALYSIS
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY
dc.subject.keyword: Anomaly detection
dc.subject.keyword: Outlier detection
dc.subject.keyword: One-class classification
dc.subject.keyword: Explainable artificial intelligence
dc.subject.lcsh: Technology
dc.title: An explainable autoencoder integrating regression and classification trees for anomaly detection
dc.type: Master's thesis
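
The reconstruction idea summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's EOTE implementation: it assumes one single-split tree (a stump) per attribute as a stand-in for full classification and regression trees, and every name in it (`fit`, `score`, `fit_stump`, the toy thermostat data) is hypothetical. Each attribute is predicted from the others using normal samples only, and an instance that is hard to reconstruct receives a higher score.

```python
from collections import Counter
from statistics import mean

def is_num(v):
    return isinstance(v, (int, float))

def goes_left(value, cut):
    # Numerical split: value <= cut. Nominal split: value == cut.
    return value <= cut if is_num(cut) else value == cut

def leaf(values):
    # Leaf prediction: mean (regression) or majority label (classification).
    return mean(values) if is_num(values[0]) else Counter(values).most_common(1)[0][0]

def loss(values):
    # Squared error for numerical targets, misclassification count for nominal ones.
    p = leaf(values)
    if is_num(values[0]):
        return sum((v - p) ** 2 for v in values)
    return float(sum(v != p for v in values))

def fit_stump(rows, target):
    # Assumption: a one-split tree stands in for a full tree per attribute.
    ys = [r[target] for r in rows]
    best = (loss(ys), None, None, leaf(ys), leaf(ys))  # fallback: no split
    for f in range(len(rows[0])):
        if f == target:
            continue
        for cut in sorted({r[f] for r in rows}):
            left = [r[target] for r in rows if goes_left(r[f], cut)]
            right = [r[target] for r in rows if not goes_left(r[f], cut)]
            if left and right:
                total = loss(left) + loss(right)
                if total < best[0]:
                    best = (total, f, cut, leaf(left), leaf(right))
    return best[1:]

def predict(stump, row):
    f, cut, left, right = stump
    if f is None:
        return left
    return left if goes_left(row[f], cut) else right

def fit(rows):
    # Train on normal samples only (semi-supervised setting).
    stumps = [fit_stump(rows, j) for j in range(len(rows[0]))]
    scales = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        # The range of a numerical attribute makes its error comparable to 0/1 errors.
        scales.append(((max(col) - min(col)) or 1.0) if is_num(col[0]) else None)
    return stumps, scales

def score(model, row):
    # Mean per-attribute reconstruction error in [0, 1]; higher means more anomalous.
    stumps, scales = model
    errors = []
    for j, stump in enumerate(stumps):
        p = predict(stump, row)
        if scales[j] is None:
            errors.append(float(p != row[j]))
        else:
            errors.append(min(1.0, abs(p - row[j]) / scales[j]))
    return mean(errors)

# Hypothetical mixed-type data: (temperature, thermostat mode).
normal = [(18.0, "heat"), (19.0, "heat"), (20.0, "heat"),
          (28.0, "cool"), (29.0, "cool"), (30.0, "cool")]
model = fit(normal)
normal_score = score(model, (19.0, "heat"))
anomaly_score = score(model, (29.0, "heat"))  # heating at a high temperature
```

In this sketch the per-attribute errors double as a rough explanation: for the anomalous instance, both the temperature and the mode are poorly reconstructed, which points to the broken relationship between the two attributes. Numerical and nominal attributes are handled directly, without converting one type into the other.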

Files

Original bundle

Name: CaballeroDominguez_TesisMaestríaOriginal.pdf
Size: 2.91 MB
Format: Adobe Portable Document Format
Description: Original Master's Thesis

Name: CaballeroDominguez_Cartade Autorización.pdf
Size: 92.38 KB
Format: Adobe Portable Document Format
Description: Authorization Letter

Name: CaballeroDominguez_HojadeFirmas.pdf
Size: 477.72 KB
Format: Adobe Portable Document Format
Description: Signature Sheet

Name: CaballeroDominguezZoe_TesisMaestría.pdf
Size: 2.44 MB
Format: Adobe Portable Document Format
Description: Master's Thesis

License bundle

Name: license.txt
Size: 1.28 KB
Format: Item-specific license agreed upon to submission
