Enhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection
| dc.audience.educationlevel | Investigadores/Researchers | |
| dc.audience.educationlevel | Estudiantes/Students | |
| dc.audience.educationlevel | Maestros/Teachers | |
| dc.contributor.advisor | Rangel Escareño, Claudia | |
| dc.contributor.author | Chacón Ramírez, Denisse | |
| dc.contributor.cataloger | emipsanchez | |
| dc.contributor.committeemember | Gómez Romero, Laura Lucila | |
| dc.contributor.committeemember | Hernández Lemus, Enrique | |
| dc.contributor.committeemember | Reséndis Antonio, Osbaldo | |
| dc.contributor.department | School of Engineering and Sciences | |
| dc.contributor.institution | Campus Monterrey | |
| dc.date.accepted | 2025-06 | |
| dc.date.accessioned | 2025-07-12T02:47:02Z | |
| dc.date.embargoenddate | 2026-07-11 | |
| dc.date.issued | 2025-06 | |
| dc.description | https://orcid.org/0000-0003-1303-0834 | |
| dc.description.abstract | Single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics have revolutionized our understanding of cellular heterogeneity and tissue organization. However, extracting biological insights from these technologies remains challenging because the data are high-dimensional, sparse, and noisy. Two critical but understudied problems hinder robust analysis: (1) the impact of feature selection strategies on cell-type identification, and (2) the role of data imputation in integrating scRNA-seq with spatial transcriptomics. While clustering and integration methods are widely benchmarked, the influence of pre-processing decisions, such as using biologically informed marker genes or imputing missing values, remains poorly understood. This thesis addresses these knowledge gaps through systematic evaluations across diverse datasets and algorithms. First, we assess how different imputation algorithms (MAGIC, DCA, scPHENIX) affect the integration of scRNA-seq with spatial transcriptomics for two tasks: cell-type deconvolution and spatial transcript prediction. Using 13 paired datasets and 10 integration tools, we found that the benefits of imputation depend on the task and algorithm rather than being universal. SpaGE consistently outperformed other methods for transcript prediction regardless of imputation status, while RCTD demonstrated superior performance for cell-type deconvolution. Notably, we observed that imputation primarily enhances magnitude estimation rather than improving spatial pattern preservation. Second, we evaluate whether marker gene-based feature selection improves scRNA-seq clustering accuracy compared to standard approaches. By benchmarking seven algorithms (Seurat, SC3, CIDR, etc.) across five pancreatic datasets, we demonstrate that performance gains are algorithm- and dataset-dependent. SC3 and TSCAN benefited from marker gene selection across multiple datasets, while SIMLR showed dramatic dataset-dependent responses, yielding superior ARI scores (greater than 0.7) in some contexts but diminished performance in others. The Segerstolpe dataset showed consistent improvements across most algorithms when using marker genes, suggesting that dataset-specific characteristics strongly influence optimal feature selection strategies. Our analysis further revealed that algorithms often identify fewer clusters than reference annotations, indicating challenges in resolving fine-grained pancreatic cell-type heterogeneity. The results of this thesis emphasize that pre-processing choices must align with both analytical goals and dataset characteristics to unlock the full potential of single-cell technologies. This work provides an evidence-based framework for optimizing spatial transcriptomics and scRNA-seq analysis workflows, with implications for understanding tissue architecture and cellular dynamics across diverse biological systems. | |
| dc.description.degree | Master of Science in Computer Science | |
| dc.identificator | 230221 | |
| dc.identificator | 339999 | |
| dc.identificator | 240790 | |
| dc.identifier.citation | Chacón Ramírez, D. (2025). Enhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection [Master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/703816 | |
| dc.identifier.orcid | https://orcid.org/0009-0004-6859-1377 | |
| dc.identifier.uri | https://hdl.handle.net/11285/703816 | |
| dc.language.iso | eng | |
| dc.publisher | Instituto Tecnológico y de Estudios Superiores de Monterrey | |
| dc.relation | Instituto Tecnológico y de Estudios Superiores de Monterrey | |
| dc.relation | Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI) | |
| dc.relation.isFormatOf | acceptedVersion | |
| dc.rights | embargoedAccess | |
| dc.rights.embargoreason | In the process of submitting a manuscript for publication | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0 | |
| dc.subject.classification | BIOLOGY AND CHEMISTRY::LIFE SCIENCES::HUMAN BIOLOGY::MOLECULAR BIOLOGY | |
| dc.subject.classification | ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::OTHER TECHNOLOGICAL SPECIALTIES::OTHER | |
| dc.subject.classification | BIOLOGY AND CHEMISTRY::LIFE SCIENCES::CELL BIOLOGY::CELL WALL STRUCTURE | |
| dc.subject.keyword | Single-cell | |
| dc.subject.keyword | scRNA-seq | |
| dc.subject.keyword | Data imputation | |
| dc.subject.keyword | Spatial transcriptomics | |
| dc.subject.keyword | Feature selection | |
| dc.subject.keyword | Clustering | |
| dc.subject.lcsh | Technology | |
| dc.subject.lcsh | Science | |
| dc.title | Enhancing single-cell and spatial transcriptomics analysis: the role of imputation and feature selection | |
| dc.type | Master's thesis |
Files
Original bundle
- Name:
- ChaconRamirez_TesusMaestria_pdfa.pdf
- Size:
- 20.76 MB
- Format:
- Adobe Portable Document Format
- Description:
- Master's Thesis
- Name:
- ChaconRamirez_ActaGrado-pdf.pdf
- Size:
- 1.43 MB
- Format:
- Adobe Portable Document Format
- Description:
- Degree Certificate
- Name:
- ChaconRamirez_DeclaracionAutoria_pdfa.pdf
- Size:
- 79.08 KB
- Format:
- Adobe Portable Document Format
- Description:
- Declaration of Authorship
- Name:
- ChaconRamirez_CartaAutorizacion_pdf.pdf
- Size:
- 399.62 KB
- Format:
- Adobe Portable Document Format
- Description:
- Authorization Letter
License bundle
- Name:
- license.txt
- Size:
- 1.28 KB
- Format:
- Item-specific license agreed upon to submission
- Description:

