Enhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection

dc.audience.educationlevelInvestigadores/Researchers
dc.audience.educationlevelEstudiantes/Students
dc.audience.educationlevelMaestros/Teachers
dc.contributor.advisorRangel Escareño, Claudia
dc.contributor.authorChacón Ramírez, Denisse
dc.contributor.catalogeremipsanchez
dc.contributor.committeememberGómez Romero, Laura Lucila
dc.contributor.committeememberHernández Lemus, Enrique
dc.contributor.committeememberReséndis Antonio, Osbaldo
dc.contributor.departmentSchool of Engineering and Sciences
dc.contributor.institutionCampus Monterrey
dc.date.accepted2025-06
dc.date.accessioned2025-07-12T02:47:02Z
dc.date.embargoenddate2026-07-11
dc.date.issued2025-06
dc.descriptionhttps://orcid.org/0000-0003-1303-0834
dc.description.abstractSingle-cell RNA sequencing (scRNA-seq) and spatial transcriptomics have revolutionized our understanding of cellular heterogeneity and tissue organization. However, extracting biological insights from these technologies remains challenging because the data are high-dimensional, sparse, and noisy. Two critical but understudied problems hinder robust analysis: (1) the impact of feature selection strategies on cell-type identification, and (2) the role of data imputation in integrating scRNA-seq with spatial transcriptomics. While clustering and integration methods are widely benchmarked, the influence of pre-processing decisions, such as using biologically informed marker genes or imputing missing values, remains poorly understood. This thesis addresses these knowledge gaps through systematic evaluations across diverse datasets and algorithms. First, we assess how different imputation algorithms (MAGIC, DCA, scPHENIX) affect the integration of scRNA-seq with spatial transcriptomics in two tasks: cell-type deconvolution and spatial transcript prediction. Using 13 paired datasets and 10 integration tools, we found that the benefits of imputation depend on the task and algorithm rather than being universal. SpaGE consistently outperformed other methods for transcript prediction regardless of imputation status, while RCTD demonstrated superior performance for cell-type deconvolution. Notably, we observed that imputation primarily enhances magnitude estimation rather than improving spatial pattern preservation. Second, we evaluate whether marker gene-based feature selection improves scRNA-seq clustering accuracy compared to standard approaches. By benchmarking seven algorithms (Seurat, SC3, CIDR, etc.) across five pancreatic datasets, we demonstrate that performance gains are algorithm- and dataset-dependent. SC3 and TSCAN benefited from marker gene selection across multiple datasets, while SIMLR showed dramatic dataset-dependent responses, yielding superior adjusted Rand index (ARI) scores (greater than 0.7) in some contexts but diminished performance in others. The Segerstolpe dataset showed consistent improvements across most algorithms when using marker genes, suggesting that dataset-specific characteristics strongly influence optimal feature selection strategies. Our analysis further revealed that algorithms often identify fewer clusters than reference annotations, indicating challenges in resolving fine-grained pancreatic cell-type heterogeneity. The results of this thesis emphasize that pre-processing choices must align with both analytical goals and dataset characteristics to unlock the full potential of single-cell technologies. This work provides an evidence-based framework for optimizing spatial transcriptomics and scRNA-seq analysis workflows, with implications for understanding tissue architecture and cellular dynamics across diverse biological systems.
dc.description.degreeMaster of Science in Computer Science
dc.identificator230221
dc.identificator339999
dc.identificator240790
dc.identifier.citationChacón Ramírez, D. (2025). Enhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection [Master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/703816
dc.identifier.orcidhttps://orcid.org/0009-0004-6859-1377
dc.identifier.urihttps://hdl.handle.net/11285/703816
dc.language.isoeng
dc.publisherInstituto Tecnológico y de Estudios Superiores de Monterrey
dc.relationInstituto Tecnológico y de Estudios Superiores de Monterrey
dc.relationSecretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI)
dc.relation.isFormatOfacceptedVersion
dc.rightsembargoedAccess
dc.rights.embargoreasonManuscript in the process of being submitted for publication
dc.rights.urihttp://creativecommons.org/licenses/by-nc-sa/4.0
dc.subject.classificationBIOLOGY AND CHEMISTRY::LIFE SCIENCES::HUMAN BIOLOGY::MOLECULAR BIOLOGY
dc.subject.classificationENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::OTHER TECHNOLOGICAL SPECIALTIES::OTHER
dc.subject.classificationBIOLOGY AND CHEMISTRY::LIFE SCIENCES::CELL BIOLOGY::CELL WALL STRUCTURE
dc.subject.keywordSingle-cell
dc.subject.keywordScRNA-seq
dc.subject.keywordData imputation
dc.subject.keywordSpatial transcriptomics
dc.subject.keywordFeature selection
dc.subject.keywordClustering
dc.subject.lcshTechnology
dc.subject.lcshScience
dc.titleEnhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection
dc.typeMaster's thesis

Files

Original bundle

Name: ChaconRamirez_TesusMaestria_pdfa.pdf
Size: 20.76 MB
Format: Adobe Portable Document Format
Description: Master's Thesis

Name: ChaconRamirez_ActaGrado-pdf.pdf
Size: 1.43 MB
Format: Adobe Portable Document Format
Description: Degree Certificate

Name: ChaconRamirez_DeclaracionAutoria_pdfa.pdf
Size: 79.08 KB
Format: Adobe Portable Document Format
Description: Declaration of Authorship

Name: ChaconRamirez_CartaAutorizacion_pdf.pdf
Size: 399.62 KB
Format: Adobe Portable Document Format
Description: Authorization Letter

License bundle

Name: license.txt
Size: 1.28 KB
Format: Item-specific license agreed upon to submission
