Hardware-aware neural architecture search for enhancing text generation

Sosa Hernández, Víctor Adrián
Sánchez Miranda, Israel
2025-07-01
2025-06
Sánchez Miranda, I. (2025). Hardware-aware neural architecture search for enhancing text generation [Master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/703797
https://orcid.org/0000-0002-1099-8148

In recent years, neural network optimization has become critical in Natural Language Processing (NLP) tasks. However, manual tuning is time-consuming and heavily influenced by the designer’s prior knowledge, which limits the exploration of alternative architecture designs. Consequently, only a narrow subset of neural network architectures is typically considered for tasks such as text generation. Furthermore, neural network tuning requires specialized expertise, posing a barrier for non-experts and hindering broader innovation in the field. This research addresses these challenges by implementing a specialized Hardware-Aware Neural Architecture Search (HW-NAS) methodology tailored to text generation tasks in resource-constrained environments. The proposed NAS approach uses a compact, efficient search space that encodes key transformer architectural components, and adopts multi-objective optimization to simultaneously maximize text generation quality, measured via the METEOR score, and minimize the parameter count to enhance hardware adaptability. Two evolutionary NAS strategies were explored: a custom Lexicographic Evolutionary Strategy (LexSMS-MODES) and SMS-EMOA, with a focus on balancing exploration, exploitation, and computational efficiency. Experimental evaluations were conducted both in unconstrained environments and on constrained hardware platforms. The optimized architectures demonstrated consistent improvements over the baseline model across multiple performance measures, including BLEU, ROUGE, and GLEU.
Notably, METEOR scores reached values close to 0.72 in unconstrained settings. Although significant performance degradation was observed under constrained environments (approximately 57%–59% reduction in METEOR scores), the discovered models maintained a competitive edge when compared to several state-of-the-art lightweight and NAS-based solutions. Hardware-aware evaluations revealed that NAS-generated models achieved substantial reductions in memory usage, GPU load, and CPU frequency deltas, despite not explicitly optimizing hardware indicators during the search. Statistical tests confirmed the stability of the discovered models across multiple hardware performance metrics. Comparisons against external works showed that while the proposed method successfully produced lightweight and efficient architectures, there remains room for improvement regarding inference latency and hardware adaptation strategies.

Text
eng
openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0
ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::ARTIFICIAL INTELLIGENCE
ENGINEERING AND TECHNOLOGY::TECHNOLOGICAL SCIENCES::COMPUTER TECHNOLOGY::COMPUTER ARCHITECTURE
Technology
Science
Master's thesis
Neural Architecture Search
Text generation
Multi-objective optimization
Bio-inspired algorithms
1317940
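
The abstract frames the search as a two-objective problem: maximize METEOR and minimize parameter count. As a rough illustration of what that trade-off means, the sketch below filters a small population down to its non-dominated (Pareto-optimal) candidates. It is not the thesis's implementation: the `Candidate` type, the architecture names, and all numeric values are hypothetical placeholders invented for the example, and a full SMS-EMOA would additionally rank candidates by hypervolume contribution, which is omitted here.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str      # hypothetical architecture label
    meteor: float  # text-quality objective, to be maximized
    params: int    # parameter count, to be minimized


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.meteor >= b.meteor and a.params <= b.params
    strictly_better = a.meteor > b.meteor or a.params < b.params
    return no_worse and strictly_better


def pareto_front(population: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in population if not any(dominates(o, c) for o in population)]


# Made-up values for illustration only; these are not results from the thesis.
population = [
    Candidate("arch_a", 0.40, 5_000_000),
    Candidate("arch_b", 0.55, 12_000_000),
    Candidate("arch_c", 0.50, 20_000_000),  # dominated by arch_b
    Candidate("arch_d", 0.30, 2_000_000),
]

front = pareto_front(population)
front_names = [c.name for c in front]
print(front_names)
```

Here `arch_c` is dropped because `arch_b` scores higher with fewer parameters, while the remaining three candidates each represent a different quality-versus-size compromise.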