Ciencias Exactas y Ciencias de la Salud
Permanent URI for this collection: https://hdl.handle.net/11285/551014
This collection contains the theses and degree projects of the doctoral programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.
Search Results
- Development of biosensor-based diagnostic systems for breast cancer using biorecognition engineering techniques and machine learning approaches for biomarker discovery (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-07-25) Mayoral Peña, Kalaumari; De Donato Capote, Marcos; emipsanchez; Artzi, Natalie; Víctor Manuel Treviño Alvarado; Alfaro Ponce, Mariel; School of Engineering and Sciences; Campus Monterrey; González Peña, Omar Israel
Cancer is the second leading cause of mortality worldwide, while breast cancer is the second leading cause of global female mortality. Diagnosing and treating breast cancer patients at early stages is important for successful treatment and for increasing the patient's survival rate. However, early diagnosis of this complex disease is challenging, especially in populations with limited healthcare services. As a result, developing more accessible and accurate diagnostic tools is necessary. Low-cost biosensor technologies have gained relevance in the last two decades, but these technologies are still in the process of reaching maturity. For these reasons, we decided to study two promising technologies that can be implemented in cancer biosensor development: 1) biorecognition engineering techniques; 2) machine learning approaches for biomarker discovery. The first technology comprises alternative techniques to generate molecules and molecule-based scaffolds with properties similar to those of antibodies. In this study, we present a systematic analysis of the peer-reviewed scientific literature in the Web of Science from the last two decades to present the fundamentals of this technology and address questions about how it has been implemented in biosensors for cancer detection. The three techniques analyzed were molecularly imprinted polymers, recombinant antibodies, and antibody mimetic molecules. The PRISMA methodology included 131 scientific articles from 2019 to 2021 for further analysis.
The results showed that antibody mimetic molecules technology was the biorecognition technology with the highest number of reports. The most studied cancer types were: multiple, breast, leukemia, colorectal, and lung. Electrochemical and optical detection methods were the most frequently used. Finally, the most analyzed biomarkers and cancer entities in the studies were carcinoembryonic antigen, MCF-7 cells, and exosomes. For the second technology, we developed a novel bioinformatics pipeline that uses machine learning algorithms (MLAs) to identify genetic biomarkers for classifying breast cancer into non-malignant, non-triple-negative, and triple-negative categories. Five Gene Selection Approaches (GSAs) were employed: LASSO (Least Absolute Shrinkage and Selection Operator), Membrane LASSO, Surfaceome LASSO, Network Analysis, and Feature Importance Score (FIS). We implemented three factorial designs to assess the impact of MLAs and GSAs on classification performance (F1 Macro and Accuracy) in both cell lines and patient samples. Using Recursive Feature Elimination (RFE) and Genetic Algorithms (GAs) in the first four GSAs, we reduced the gene count to eight per GSA while maintaining an F1 Macro ≥ 80%. Consequently, 95.5% of our treatments with these gene sets achieved an F1 Macro or Accuracy ranging from 70.3% to 97.2%. As a result, 37 different genes were obtained. We analyzed the 37 genes for their predictive power in terms of five-year survival and relapse-free survival and compared them with genes from four commercial panels. Notably, thirteen genes (MFSD2A, TMEM74, SFRP1, UBXN10, CACNA1H, ERBB2, SIDT1, TMEM129, MME, FLRT2, CA12, ESR1, and TBC1D9) showed significant predictive capabilities for up to five years of survival. TBC1D9, UBXN10, SFRP1, and MME were significant for relapse-free survival after five years. The FOXC1, MLPH, FOXA1, ESR1, ERBB2, and SFRP1 genes also matched those described in commercial panels. 
The influence of MLA on F1 Macro and Accuracy was not statistically significant. Altogether, the genetic biomarkers identified in this study hold potential for use in biosensors aimed at breast cancer diagnosis and treatment. We conclude that both technologies have demonstrated their utility in cancer biosensor development for vulnerable populations with limited access to healthcare. However, further studies are required, and considerable work remains before a commercial biosensor can be established. For this reason, we generated a research proposal to develop a biosensor that integrates the findings of this study into an optical and electrochemical sensing platform. Some designs of this biosensor and preliminary results are also presented.
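The two metrics named in the abstract above, F1 Macro and Accuracy, can be sketched in a few lines of Python. This is only an illustration of the standard definitions, not the thesis pipeline; the class labels and predictions below are hypothetical.

```python
def f1_macro(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical three-class example (labels are illustrative only).
y_true = ["non-malignant", "non-TNBC", "TNBC", "TNBC", "non-TNBC", "non-malignant"]
y_pred = ["non-malignant", "non-TNBC", "TNBC", "non-TNBC", "non-TNBC", "non-malignant"]

macro = f1_macro(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(f"F1 Macro = {macro:.3f}, Accuracy = {acc:.3f}")
```

Because every class contributes equally to the macro average, a weak minority class lowers F1 Macro more than it lowers Accuracy.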
- Real-time armed individual detection in video surveillance using deep learning and heuristic approaches (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-12) Amado Garfias, Alonso Javier; Conant Pablos, Santiago Enrique; emipsanchez; Ortiz bayliss, José Carlos; Tarashima Marín Hugo; Gutiérrez Rodríguez, Andrés Eduardo; School of Engineering and Sciences; Campus Monterrey
This research aims to enhance the automatic identification of armed individuals in video surveillance in real time. The proposed methodology involves the development of algorithms specifically designed for the detection of individuals carrying handguns, which include pistols and revolvers. To achieve this, the YOLOv4 model has been selected to detect individuals, handguns, and faces. Subsequently, real-time information is extracted from the YOLO model, including bounding box coordinates, distances, and intersection areas between handguns and individuals within each video frame. This information feeds our heuristics and the different machine learning (ML) models proposed, facilitating the recognition of armed individuals. Several challenges must be addressed, such as occlusion, concealed guns, and proximity of individuals to one another. This encourages the development and comparison of different types of solutions. They are made up of three heuristics, seven armed people detectors (APD), and 44 selectors of the APD to use in each video frame (APD4F). The heuristics are the Deterministic Method of Centers (DMC), the Deterministic Method of Distances (DMD), and the Deterministic Method of Intersections (DMI). Furthermore, the APD models are Random Forest Classifier (RFC-APD), Multilayer Perceptron (MLP-APD), k-Nearest-Neighbors (KNN-APD), Support Vector Machine (SVM-APD), Logistic Regression (LR-APD), Naive Bayes (NB-APD), and Gradient Boosting Classifier (GBC-APD).
Thereby, we propose to create selectors for deciding which APD to use in each video frame (APD4F) to improve the detection results. We implemented two types of APD4Fs, one based on a Random Forest Classifier (RFC-APD4F) and another on a Multilayer Perceptron (MLP-APD4F). We developed 44 APD4Fs combining subsets of six APDs. Most of the APD4Fs outperformed the independent use of all APDs. A multilayer perceptron-based APD4F, which combines an MLP-APD, a NB-APD, and a LR-APD, presented the best performance, achieving an accuracy of 95.84%, a recall of 99.28%, and an F1 score of 96.07%. This research also proposes a solution to the problem of detecting armed people when the weapon is not visible. To that end, we apply recurrent neural networks, such as Long Short-Term Memory (LSTM), to predict the coordinates of the guns. In this way, it is possible to have a prediction of armed people at all times. The measurement between the YOLO handgun detection bounding boxes and the LSTM prediction resulted in an IoU of 65.23%. When the firearm detection by the object detector is interrupted, the weapon's position is generated by the LSTM models that, together with the APDs, identify the armed people. When the LSTMs delivered their predictions to the APDs, the NB-APD demonstrated the best performance, achieving an accuracy of 80.93%. The LSTMs allowed the analysis of 5,288 records of the test video that could not be analyzed before due to the lack of knowledge of the gun's position.
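The IoU figure reported in the abstract above compares a detected bounding box with a predicted one. A minimal sketch of the standard intersection-over-union computation follows; the `(x_min, y_min, x_max, y_max)` box layout and the example coordinates are assumptions for illustration, not values from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: one from an object detector, one from a sequence model.
detected_box = (10.0, 10.0, 30.0, 30.0)
predicted_box = (20.0, 20.0, 40.0, 40.0)
overlap = iou(detected_box, predicted_box)
print(f"IoU = {overlap:.4f}")
```

The same overlap area could also serve as an input to an intersection-based heuristic that associates a handgun box with a person box, although the exact rules used by the thesis heuristics are not given in the abstract.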
- Automatic detection of mental health disorders in social media (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024-06-12) Villa Pérez, Miryam Elizabeth; Trejo Rodríguez, Luis Ángel; emipsanchez; González Mendoza, Miguel; Brena Pinero, Ramón Felipe; Moctezuma Ochoa, Daniela; Villaseñor Pineda, Luis; School of Engineering and Sciences; Campus Estado de México
With the rise of social media, these platforms have emerged as a crucial source of information for studying people's thoughts and behaviors. By using natural language processing and machine learning techniques, prior studies have explored the language of users living with different mental health conditions. However, these efforts have focused on analyzing conditions in isolation, particularly depression, and have relied on English-language data. The goal of this study is to examine the communications of English- and Spanish-speaking Twitter users through traditional and deep learning algorithms to automatically recognize whether they live with one of nine mental health conditions. To achieve that, we created two datasets in English and Spanish. The “diagnosed” set comprises the timeline of 1,500 users who explicitly reported in one or more of their posts having been diagnosed with one of the following: ADHD, Anxiety, Autism, Bipolar, Depression, Eating disorders, OCD, PTSD, and Schizophrenia. The “control” set comprises the timeline of 1,700 randomly selected users who had not disclosed a diagnosis. We extracted a variety of text features from the collected data, such as n-grams, q-grams, Part-of-speech (POS) tags, topic modeling, Linguistic Inquiry and Word Count (LIWC), and word embeddings, and trained traditional machine learning and deep learning classifiers for two tasks: binary classification, to distinguish between diagnosed and non-diagnosed users, and multiclass classification, to identify the specific diagnosis.
The performance of the models was analyzed using 5-fold cross-validation, four different classification metrics (AUC, F1-score, Precision, and Recall), and the Friedman non-parametric test with the Finner post-hoc procedure. Overall, XGBoost and CNN performed the best in the two classification tasks. Employing our collected datasets, in binary classification, we achieved an AUC of 0.835 on the Spanish Twitter dataset using word n-grams of sizes one to three (UBT) and 0.846 on the English Twitter dataset with a character 5-gram (C5) model. In multiclass classification, we obtained an AUC of 0.747 and 0.697 in the Spanish and English Twitter datasets, respectively. In the second phase of our research, we introduced a model named BiLEMD for the multiclass classification of mental disorders. Our approach adopts a hierarchical detection strategy, where each base model within our framework leverages diverse textual features. We aim to emulate, to some extent, the step-by-step approach employed in human clinical diagnostics. In clinical practice, professionals first determine the presence or absence of a condition before proceeding to specify its type. Although BiLEMD achieved the highest ranking in both the Spanish and English Twitter datasets, no statistically significant differences were observed. Nevertheless, additional analysis revealed that ensembles, including BiLEMD and Stacking, reduce misclassification within the control class. Moreover, BiLEMD exhibits slightly superior performance in terms of AUC and Recall compared to other classifiers. The development of computer-based methods for recognizing and classifying social media user profiles related to different mental health conditions could enhance the performance of applications aimed at early diagnosis and timely treatment.
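The two best-performing feature sets mentioned in the abstract above, word n-grams of sizes one to three and character 5-grams, can be illustrated with a small counting sketch. The tokenization (lowercasing and whitespace splitting) and the sample text are assumptions; the abstract does not describe the preprocessing actually used.

```python
from collections import Counter


def char_ngrams(text, n=5):
    """Count overlapping character n-grams (n=5 matches a 5-gram model)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n_min=1, n_max=3):
    """Count word n-grams for every size from n_min to n_max inclusive."""
    tokens = text.lower().split()  # naive tokenizer, assumed for illustration
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical post, not taken from the thesis datasets.
post = "I feel fine today"
word_features = word_ngrams(post)
char_features = char_ngrams("feeling")
print(word_features)
print(char_features)
```

Counts like these would typically be turned into a sparse feature vector per user before being passed to a classifier.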
- Security automation in software defined networks (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2023-06-01) Yungaicela Naula, Noé Marcelo; YUNGAICELA NAULA, NOE MARCELO; 781291; Vargas Rosales, César; puemcuervo, emipsanchez; Zareei, Mahdi; Ramírez Velarde, Raúl Valente; Rodríguez Cruz, José Ramón; School of Engineering and Sciences; Campus Monterrey; Pérez Díaz, Jesús Arturo
The exponential increase of devices connected to the internet, and the conventional networking operation, based on distributed and static network management, have made networking an incredibly complex task. Software-Defined Networking (SDN) solves the problems arising from the static nature of conventional networking by introducing dynamism to the networking operation. SDN separates the data plane and control plane, centralizes the network control, and automates the network management. In particular, SDN technology is an effective solution to provide security to different network environments. This study addresses the security problem in SDN-based networks using state-of-the-art artificial intelligence (AI) techniques. An automated security framework is proposed that integrates two components: 1) a reactive component and 2) a proactive component. The reactive component uses Deep Learning (DL) to identify complex DDoS threats and Reinforcement Learning (RL) to mitigate them. The proactive component leverages Network Function Virtualization (NFV) to provide scalability to the proposed security framework. Extensive experiments using datasets, simulations, and physical deployments demonstrate the effectiveness of the proposed security automation framework.
- A novel functional tree for class imbalance problems (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2022-11) Cañete Sifuentes, Leonardo Mauricio; CAÑETE SIFUENTES, LEONARDO MAURICIO; 787723; Monroy Borja, Raúl; puemcuervo, emipsanchez; Morales Manzanares, Eduardo; Gutiérrez Rodríguez, Andrés Eduardo; Cantú Ortiz, Francisco; Conant Pablos, Santiago; School of Engineering and Sciences; Campus Estado de México; Medina Pérez, Miguel Angel
Decision trees (DTs) are popular classifiers partly because they provide models that are easy to explain and because they show remarkable performance. To improve the classification performance of individual DTs, researchers have used linear combinations of features in inner nodes (Multivariate Decision Trees), leaf nodes (Model Trees), or both (Functional Trees). Our general objective is to develop a DT using linear feature combinations that outperforms the rest of such DTs in terms of classification performance as measured by the Area Under the ROC Curve (AUC), particularly in class imbalance problems, where one of the classes in the database has few objects compared to another class. We establish that, in terms of classification performance, there exists a hierarchy, where Functional Trees (FTs) surpass Model Trees, that in turn surpass Multivariate Decision Trees. Having shown that Gama's FT, the only FT to date, has the best classification performance, we identify limitations that hinder its classification performance. To improve the classification performance of FTs, we introduce the Functional Tree for class imbalance problems (FT4cip), which takes care in each design decision to improve AUC. The decision of what pruning method to use led us to the design of the AUC-optimizing Cost-Complexity pruning algorithm, a novel pruning algorithm that does not degrade classification performance in class imbalance problems because it optimizes AUC.
We show how each design decision taken when building FT4cip contributes to classification performance or to simple tree models. We demonstrate through a set of tests that FT4cip outperforms Gama's FT and excels in class imbalance problems. All our results are supported by a thorough experimental comparison in 110 databases using Bayesian statistical tests.
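Since the abstract above measures performance by AUC on class-imbalanced data, a short sketch of the metric may help. It uses the standard pairwise (rank-based) definition of AUC: the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The labels and scores are hypothetical and unrelated to the 110 databases in the thesis.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive (minority) class, 0 for the negative class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 2 positives, 6 negatives.
labels = [1, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.4, 0.35, 0.5]
area = auc(labels, scores)
print(f"AUC = {area:.4f}")
```

On this sample, a classifier that always predicts the majority class would reach 75% accuracy while ranking nothing correctly, which is one reason a ranking metric such as AUC is preferred under class imbalance.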
- Computer aided molecular design coupled with molecular dynamics and deep learning techniques as a novel approach to design new compounds (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2022-05) Valencia Márquez, Darinel; Flores Tlacuahuac, Antonio; emimmayorquin; García Cuéllar, Alejandro Javier; Santibáñez Aguilar, José Ezequiel; Aguirre Soto, Héctor Alán; Gutiérrez Limón, Miguel Ángel; Escuela de Ingeniería y Ciencias; Campus Monterrey
This work proposes a novel approach to developing new compounds through computer aided molecular design (CAMD), using group contribution methodologies to estimate properties, followed by machine learning or molecular simulation techniques to estimate and validate the target properties of the design; finally, Monte Carlo simulations are performed to validate the thermodynamic stability of the designed molecule. This thesis is divided into three main sections: first, a case study of the machine learning approach used in this work for property prediction, followed by two case studies of coupled molecular design in which the initial design is obtained by CAMD and then validated with molecular dynamics or machine learning techniques. In the first case, a neural network analysis was performed to improve the predictions of 15 properties. The input analysis showed that the reduced Coulomb matrix saves CPU time and memory compared with the full Coulomb matrix, with similar accuracy. In addition, the neural network architecture analysis shows the importance of the activation function and of the number of hidden layers in the neural network.
The second case presents the methodology to design a lubricant from scratch, using CAMD with group contribution methodologies to design an ionic liquid-based lubricant intended for use as an automotive lubricant; a molecular dynamics simulation was then performed to validate the values of the design properties, and finally a Monte Carlo simulation was performed to assess the thermodynamic feasibility of the designed molecule. The result shows that the designed molecule is feasible, with experimental values similar to the group contribution and molecular dynamics estimates. In the third case, a classical CAMD methodology is used to produce a list of molecule designs for photocatalytic applications, where the target property is the maximum absorption wavelength (λmax); deep learning techniques are then applied to predict the gap between the HOMO and LUMO energies in order to estimate λmax, yielding a shorter list of candidates for experimental testing. Likewise, the NN used in this study can be extended to other optical or photocatalytic applications, because the HOMO and LUMO energies govern most photochemical properties. Finally, the mixture design is presented; however, its validation is left for future work.
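The link between an orbital energy gap and an absorption wavelength mentioned in the abstract above follows from the photon energy relation E = hc/λ. The sketch below applies that relation directly; treating the predicted gap as the photon energy at the absorption maximum is a simplifying assumption made here for illustration, and the abstract does not state the exact mapping used in the thesis.

```python
# Exact SI-defined constants.
PLANCK_J_S = 6.62607015e-34      # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light in vacuum, m/s
JOULES_PER_EV = 1.602176634e-19  # one electronvolt in joules


def gap_to_wavelength_nm(gap_ev):
    """Wavelength (nm) of a photon whose energy equals the given gap (eV).

    Assumes the energy gap equals the photon energy; this is a first-order
    approximation, not necessarily the model used in the thesis.
    """
    if gap_ev <= 0:
        raise ValueError("the energy gap must be positive")
    energy_j = gap_ev * JOULES_PER_EV
    return PLANCK_J_S * LIGHT_SPEED_M_S / energy_j * 1e9


# Hypothetical predicted gap of 3.1 eV.
wavelength = gap_to_wavelength_nm(3.1)
print(f"Estimated wavelength: {wavelength:.1f} nm")
```

A smaller predicted gap maps to a longer wavelength, so a gap predictor of this kind could be used to screen candidates toward a target absorption range.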
- Expansion of a novel bio-inspired supervised machine learning class applied to financial forecasting (Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-02) González Núñez, Enrique; Trejo Rodríguez, Luis Ángel; emipsanchez; Hervert Escobar, Laura; Capote Sanchez, Alfredo Alberto Ramon; Ponce Espinosa, Hiram Eredín; Kampouridis, Michael; Escuela de Ingeniería y Ciencias; Campus Estado de México
The aim of this research is to apply Artificial Organic Networks (AON), a nature-inspired, supervised, metaheuristic machine learning framework, to the definition of a new algorithm based on this machine learning class that can be employed for computational finance purposes, specifically for the modeling and prediction of a stock market, based on the Index Tracking Problem (ITP). The relevance of computational finance is discussed, pointing out that it is an area that has developed significantly in recent decades, with applications such as rich portfolio optimization, index tracking, credit risk, and stock investment, among others. Specifically, the Index Tracking Problem (ITP) concerns the prediction of stock market prices, which is a complex, NP-hard problem. In this regard, this work discusses an innovative approach that applies the AON method to the ITP; thus, the concept of Artificial Halocarbon Compounds, or the AHC-algorithm, is introduced as a supervised machine learning algorithm and as a new topology based on the AON framework. Throughout the discussion, we review some of the disadvantages of the original Artificial Hydrocarbon Networks (AHN) topology; these disadvantages are taken into account in the definition of the new AHC-algorithm.
The capabilities of the AHC model are measured by modeling the IPC Mexico stock market index with the aid of other economic indicators, obtaining very promising results, with a computed R-square of 0.9919 and a mean relative error of 8e-4 for the forecast. As a main contribution, the new model is an adaptable, dynamic, and reconfigurable topology that can be applied to different approaches or systems that require simulation analysis using time series.
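The two figures of merit quoted in the abstract above, R-square and mean relative error, can be sketched as follows. The definitions are the common textbook ones (coefficient of determination, and the mean of |actual - predicted| / |actual|); the abstract does not spell out the exact formulas used, and the series below is hypothetical rather than IPC data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(actual, predicted):
    """Mean of |actual - predicted| / |actual| (assumed definition)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical index values and model outputs.
actual = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 102.0, 103.0, 106.0]

r2 = r_squared(actual, predicted)
mre = mean_relative_error(actual, predicted)
print(f"R-square = {r2:.4f}, mean relative error = {mre:.6f}")
```

R-square summarizes how much of the variance in the series the model explains, while the mean relative error expresses the typical deviation as a fraction of the index level.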

