Evolutionary clustering using classifiers: definition, implementation, scalability, and applications

dc.audience.educationlevel: General public [es_MX]
dc.contributor.advisor: Gutierrez Rodriguez, Andres Eduardo
dc.contributor.author: Sainz Tinajero, Benjamin Mario
dc.contributor.cataloger: puemcuervo [es_MX]
dc.contributor.committeemember: Ortiz Bayliss, Jose Carlos
dc.contributor.committeemember: Amaya Contreras, Ivan Mauricio
dc.contributor.committeemember: Medina Perez, Miguel Angel
dc.contributor.department: School of Engineering and Sciences [es_MX]
dc.contributor.institution: Campus Estado de México [es_MX]
dc.creator: GUTIERREZ RODRIGUEZ, ANDRES EDUARDO; 436765
dc.date.accepted: 2022-04-22
dc.date.accessioned: 2023-05-03T18:38:18Z
dc.date.available: 2023-05-03T18:38:18Z
dc.date.issued: 2022-04-22
dc.description.abstract: Clustering is a Machine Learning tool for automatically partitioning multi-dimensional data into mutually exclusive groups, aiming to reflect the patterns of the phenomena the data represents. Clustering algorithms perform this task conditioned by the clustering criterion modeled in their objective function. However, selecting the optimal criterion is a domain-dependent task that requires information on the cluster structure, which a user often lacks due to the unsupervised nature of the technique. Available approaches accentuate this problem because they cluster according to a similarity notion often limited to the concepts of compactness and connectedness; by relying on conventional distance functions, they induce bias and favor clusters with certain shape, size, or density properties. We cannot consider this a complete notion of a cluster, because not every dataset complies with both concepts in the same proportion. Hence, research on this topic has not converged on a standard definition of a cluster, which raises the need for algorithms that produce adaptive solutions mirroring the underlying structures and relations within the data. This thesis focuses on the design of single-objective Evolutionary Clustering Algorithms whose solutions are not biased towards any cluster structure, obtained by optimizing a novel generalization-based clustering criterion. To achieve that, we designed objective functions modeled as a supervised learning problem, considering that a good partition should induce a well-trained classifier. We therefore assess the quality of a clustering solution by its capability to train an ensemble of classifiers. The main contribution of this thesis is our series of Evolutionary Clustering Algorithms using Classifiers (the ECAC series), which introduces this clustering criterion along with evolutionary computation. This meta-heuristic allows us to model distinct criteria to optimize while creating and evaluating multiple solutions along the process. The experimental results from the design of our family of methods (ECAC, F1-ECAC, and ECAC-S) show an increase in similarity between the partitions created by our algorithms and the ground truth labels (obtained from the publicly available repositories where we retrieved the data), with a maximum Adjusted Rand Index of 0.96. Our second algorithm, F1-ECAC, proved the competitiveness of our contributions against traditional clustering algorithms and single- and multi-objective Evolutionary Clustering algorithms, showing no statistically significant difference against k-means, HG-means, and MOCLE. Our latest contribution, ECAC-S, was tested on a satellite image segmentation task, and it produced segmentations with a higher average Adjusted Rand Index than k-means, Spectral Clustering, BIRCH, and DBSCAN in 4 out of 10 images. [es_MX]
dc.description.degree: Master of Science in Computer Science [es_MX]
dc.format.medium: Text [es_MX]
dc.identificator: 7||33||3304||120310 [es_MX]
dc.identifier.citation: Sainz Tinajero, B. M. (2022). Evolutionary clustering using classifiers: definition, implementation, scalability, and applications [Unpublished master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/650443 [es_MX]
dc.identifier.cvu: 1079078 [es_MX]
dc.identifier.orcid: https://orcid.org/0000-0002-1614-5066 [es_MX]
dc.identifier.scopusid: 57295637600 [es_MX]
dc.identifier.uri: https://hdl.handle.net/11285/650443
dc.language.iso: eng [es_MX]
dc.publisher: Instituto Tecnológico y de Estudios Superiores de Monterrey [es_MX]
dc.relation.isFormatOf: publishedVersion [es_MX]
dc.relation.isreferencedby: REPOSITORIO NACIONAL CONACYT
dc.rights: openAccess [es_MX]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0 [es_MX]
dc.subject.classification: ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::COMPUTER-ASSISTED INSTRUCTION [es_MX]
dc.subject.keyword: Clustering [es_MX]
dc.subject.keyword: Machine Learning [es_MX]
dc.subject.keyword: Evolutionary Clustering [es_MX]
dc.subject.keyword: Classifier Ensemble [es_MX]
dc.subject.keyword: Unsupervised Learning [es_MX]
dc.subject.lcsh: Technology [es_MX]
dc.title: Evolutionary clustering using classifiers: definition, implementation, scalability, and applications [es_MX]
dc.type: Master's thesis
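
The abstract describes scoring a candidate partition by how well it trains a classifier, and comparing partitions to ground truth labels with the Adjusted Rand Index. The following is a minimal sketch of both ideas, not the thesis's implementation. It assumes a single nearest-centroid classifier and plain cross-validated accuracy as stand-ins for the classifier ensemble and scoring, which the abstract does not specify. The names `nearest_centroid_predict`, `partition_fitness`, and `adjusted_rand_index` are hypothetical.

```python
from collections import Counter
from math import comb


def nearest_centroid_predict(train_X, train_y, test_X):
    """Label each test point with the class of its closest training centroid."""
    sums, counts = {}, Counter(train_y)
    for x, y in zip(train_X, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def closest(x):
        # Squared Euclidean distance to each centroid; smallest wins.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return [closest(x) for x in test_X]


def partition_fitness(X, labels, folds=5):
    """Cross-validated accuracy of a classifier trained on the cluster labels.

    Assumption: one nearest-centroid classifier replaces the ensemble from the
    thesis. Folds are interleaved by index, so the score is deterministic.
    Requires more points than folds.
    """
    idx = range(len(X))
    correct = 0
    for f in range(folds):
        test = list(idx[f::folds])
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        preds = nearest_centroid_predict(
            [X[i] for i in train], [labels[i] for i in train], [X[i] for i in test]
        )
        correct += sum(p == labels[i] for p, i in zip(preds, test))
    return correct / len(X)


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two labelings of the same points."""
    total = comb(len(a), 2)
    if total == 0:
        return 1.0  # fewer than two points: the labelings trivially agree
    sum_ij = sum(comb(n, 2) for n in Counter(zip(a, b)).values())
    sum_a = sum(comb(n, 2) for n in Counter(a).values())
    sum_b = sum(comb(n, 2) for n in Counter(b).values())
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # both labelings are the same trivial partition
    return (sum_ij - expected) / (max_index - expected)


# Two well-separated groups on a line (illustrative data, not from the thesis).
X = [(0.1 * i, 0.0) for i in range(10)] + [(10 + 0.1 * i, 0.0) for i in range(10)]
good = [0] * 10 + [1] * 10          # partition that follows the two groups
bad = [i % 2 for i in range(20)]    # partition that ignores them
print(partition_fitness(X, good), partition_fitness(X, bad))
print(adjusted_rand_index(good, [1] * 10 + [0] * 10))
```

In an evolutionary loop, a score such as `partition_fitness` would play the role of the objective function evaluated on each candidate partition. A nearest-centroid classifier itself favors compact clusters, so a faithful version would swap in the kind of classifier ensemble the abstract mentions.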

Files

Original bundle

Tesis Benjamin_Mario.pdf (11.93 MB, Adobe Portable Document Format)
CartaAutorizacionTesis-octubre 2020[51].pdf (62.83 KB, Adobe Portable Document Format). Description: Authorization letter
Autoria Benjamin_Mario.pdf (41.41 KB, Adobe Portable Document Format)
Firmas Benjamin_Mario.pdf (371.01 KB, Adobe Portable Document Format)

License bundle

license.txt (1.3 KB, Item-specific license agreed upon to submission)
