Ciencias Exactas y Ciencias de la Salud

Permanent URI for this collection: https://hdl.handle.net/11285/551039

This collection contains theses and degree projects from the master's programs of the Schools of Engineering and Sciences and of Medicine and Health Sciences.

Search Results

Now showing 1 - 6 of 6

Domain-adapted pretraining and topic modeling for identifying skills categories in job postings
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2025-12-05) Madera Espíndola, Diana Patricia; Ceballos Cancino, Héctor Gibrán; Vázquez Lepe, Elisa Virginia; mtyahinojosa, emipsanchez; González Gómez, Luis José; Fahim Siddiqui, Muhammad Hammad; Cantú Ortiz, Francisco Javier; Escuela de Ingeniería y Ciencias; Campus Estado de México; Butt, Sabur
The need to identify and cluster related skills in job postings has become increasingly essential as the labor market becomes more complex, driven by the rapid growth in job market data and continuous shifts in economic conditions, technology, and skill requirements. This task is especially challenging for postings in low-resource languages such as Spanish, as there is a lack of models specifically trained to handle these language variations. Previous work in this regard involves expert-built taxonomies such as ESCO, intended to be used as reference points via measured skills. However, these systems rely on region-specific taxonomies and are too rigid to adapt to the changing environment of the market. Thus, we propose a method to improve skill identification performance within the Mexican automotive industry by grouping equivalent skills present in Spanish job postings. The method integrates text normalization, a Spanish BERT model with Domain-Adaptive Pre-training (DAPT), BERTopic for pseudo-label extraction, VGCN embeddings for improved vocabulary representation, and similarity metrics such as keyword overlap and cosine similarity for final refined clustering. We evaluate our approach using the Adjusted Rand Index (ARI) score in skill classification on a dataset exhibiting a long-tail distribution, across both the head and tail data, and compare the results to those of an initial non-DAPT model since, to the best of our knowledge, no directly comparable approach exists for either our ensemble model or the distribution of our dataset.
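The abstract above names the Adjusted Rand Index (ARI) as its evaluation score. As a rough illustration of what that score measures, the following is a minimal pure-Python sketch of the standard ARI formula; the function name and the toy skill labels are assumptions made for this example and do not come from the thesis.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two flat clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        # No pairs to compare; treat the clusterings as trivially identical.
        return 1.0

    # Contingency counts: how many items share each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every item in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical example: cluster names differ, but the grouping is the same.
reference = ["welding", "welding", "cad", "cad"]
predicted = [0, 0, 1, 1]
print(adjusted_rand_index(reference, predicted))
```

Because ARI compares groupings rather than label names, it suits settings like this one where cluster identifiers are arbitrary; a score near 0 indicates chance-level agreement and 1 indicates identical partitions.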
Estimating occupancy level in indoor spaces using infrared values and environmental variables
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2024) Ovando Franco, Angelo Jean Carlo; Ceballos Cancino, Héctor Gibrán; mtyahinojosa, emimmayorquin; Dávila Delgado, Juan Manuel; Minero Re, Erik Molino; School of Engineering and Sciences; Campus Monterrey; Alvarado Uribe, Joanna
Improving energy efficiency in indoor spaces is critical to reducing the harmful effects of excessive energy consumption worldwide. For this reason, estimating the occupancy level of indoor spaces has been identified as a significant contributor to improving energy efficiency and space utilization. To contribute to the solution of this problem, this thesis proposes estimating the occupancy level of enclosed spaces through an indirect approach based on environmental and infrared data, using Machine Learning (ML) techniques. The selected environmental variables are temperature, relative humidity, and atmospheric pressure. Values from five different workstations in a collaborative work area at Tecnologico de Monterrey were collected to determine the occupancy level of each workstation. Supervised ML algorithms were used to estimate occupancy, obtaining an average accuracy of 93% per workstation when using both environmental and infrared data, compared to ground truth counts during occupied hours. Our results show that infrared data plus environmental variables are more accurate than infrared-only sensors for estimating indoor occupancy. Likewise, Random Forest (RF) reached the highest accuracy among the algorithms evaluated, which also included Support Vector Machine (SVM) and K-Nearest Neighbors (KNN).
Mining the SCOPUS database to identify potential academic rising stars
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-06-01) Ayala Urbina, Jorge Antonio; Ceballos Cancino, Héctor Gibrán; 223871; Ceballos Cancino, Héctor Gibrán; emipsanchez; Hernández Gress, Neil; Cantú Ortiz, Francisco Javier; García Vázquez, Juan Pablo; School of Engineering and Sciences; Campus Monterrey
Academic Rising Stars are often defined as authors in the earlier years of their scientific careers who have the potential to become impactful authors in the future. Universities and research institutions would benefit greatly from identifying these Academic Rising Stars and convincing them to join their research teams, because if the potential of these authors is fulfilled, they could benefit the institution in terms of scientific prestige and impactful scientific production. This thesis project aims to determine whether it is possible to identify these Academic Rising Stars using Machine Learning classifiers and the data available through Elsevier’s Scopus and SciVal APIs. A case study in the field of Clustering showed that it is possible to identify these authors using the average metrics from their first five years of scientific publications, with acceptable precision and accuracy. It also showed that the best attribute to label top authors is the h5-index and that the classifier achieving the best result is the Support Vector Machine with a radial basis function kernel. The developed methodology provides a solid framework from which research institutions can identify Academic Rising Stars in the fields they are interested in.
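The abstract above uses the h5-index as the labeling attribute. The sketch below shows one common reading of that metric, namely the h-index computed only over papers published within a five-year window; this interpretation, the function names, and the sample records are assumptions for illustration, and the thesis may define or obtain the metric differently (for example, directly from SciVal).

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def h5_index(papers, end_year):
    """h-index restricted to papers published in the five years ending at end_year.

    `papers` is an iterable of (publication_year, citation_count) pairs.
    """
    window = [cites for year, cites in papers if end_year - 4 <= year <= end_year]
    return h_index(window)


# Hypothetical publication record: (year, citations).
record = [(2015, 50), (2017, 3), (2018, 3), (2019, 1), (2021, 9)]
print(h_index([cites for _, cites in record]))
print(h5_index(record, end_year=2021))
```

The 2015 paper counts toward the overall h-index but falls outside the 2017 to 2021 window, which is why a windowed metric can separate recent activity from older, highly cited work.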
Use of collaborative filters to recommend information in a chatbot system: Tecnologico de Monterrey Admissions Chatbot
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021-06) Vázquez Cetina, Emmanuel; Ceballos Cancino, Héctor Gibrán; puelquio, emipsanchez; Hernández Gress, Neil; Garza Villarreal, Sara Elena; Escuela de Ingeniería y Ciencias; Campus Monterrey; Alvarado Uribe, Joanna
One of the main objectives of companies is to provide customers with a good customer service experience, so that customers are satisfied. Therefore, with the emergence of natural language processing techniques, companies are looking for automated solutions that provide quality services to customers. This is possible thanks to chatbots, which are helpful because they are permanently available and respond immediately. Additionally, with the use of recommendation systems, suggestions can be provided to the user, allowing a better conversation flow and reducing the response time. The main objective of this research is the development of a recommendation system for the online customer service chatbot of the ITESM admissions department that suggests the next question to the user. In this project, a framework for a hybrid recommendation system is proposed, using the user connection variables in each conversation as user features and applying Latent Dirichlet Allocation (LDA) to the set of options provided by the chatbot to capture the context of the conversation as item features. In the state of the art, a problem similar to ours was found: recommending the next question that a user of the StackExchange platform can answer, using user characteristics and question labels to create different models. On that problem, a LightFM model obtained a maximum precision of 0.750. In contrast, with our data set, a maximum precision of 0.787 is obtained, indicating that this model works well for our problem.
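The precision figures quoted above are not defined in the abstract. Assuming they refer to precision at k, the ranking metric commonly reported for recommenders such as LightFM, a minimal sketch of that metric could look like this; the function names, user identifiers, and question topics are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def mean_precision_at_k(recommendations, relevant_items, k):
    """Average precision at k over all users that have recommendations."""
    scores = [
        precision_at_k(items, relevant_items.get(user, set()), k)
        for user, items in recommendations.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical chatbot sessions: ranked suggestions vs. questions actually asked next.
suggested = {
    "session_1": ["scholarships", "fees", "campus", "dates"],
    "session_2": ["dates", "campus", "fees", "scholarships"],
}
asked = {
    "session_1": {"scholarships", "dates"},
    "session_2": {"campus"},
}
print(mean_precision_at_k(suggested, asked, k=2))
```

Dividing by k rather than by the number of returned items keeps scores comparable across users when fewer than k suggestions are available.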
Occupancy Estimation in Enclosed Spaces using an Indirect Approach, laying the Foundations to Build an IoT Architecture
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2021) Vela Miam, Irving Andree; Ceballos Cancino, Héctor Gibrán; 223871; Ceballos Cancino, Héctor Gibrán; tolmquevedo, emipsanchez; Dávila Delgado, Juan Manuel; Hernandez Gress, Neil; Escuela de Ingeniería y Ciencias; Campus Monterrey; Alvarado Uribe, Joanna
The buildings industry accounts for 30% to 40% of total consumed energy worldwide, and with most of this energy coming from fossil fuels, improving energy efficiency is critical to reducing the harmful effects of this industry on the environment. Fortunately, timely information about the number of occupants has been identified as a significant contributor to improving energy efficiency. Existing work on occupancy detection and estimation falls into one of the following categories: (1) direct approaches based on sensors and cameras to measure occupancy directly, and (2) indirect approaches based on environmental data to derive the occupancy information. Due to cost and privacy issues, indirect approaches are preferred for most use cases. This thesis focused on estimating occupancy in buildings’ indoor spaces using environmental variables and Machine Learning techniques. Specifically, the use of temperature, humidity, and pressure information was proposed to estimate the level of occupancy. Additionally, feature selection and time resolution selection steps were used to achieve high accuracy. In the process, it was necessary to generate a dataset with occupancy information from two different locations with contrasting characteristics. This dataset is an essential contribution, as no other dataset suitable for estimating occupancy using the proposed environmental variables is publicly accessible. Likewise, a review of IoT platforms was carried out to identify the components required to build an occupancy estimation system. Among the contributions, it is reported that an accuracy of at least 98% can be achieved using this approach and a kNN model. Also, a theoretical architecture for an occupancy estimation system using AWS IoT Core was documented. Finally, the generated dataset was made publicly accessible through the Mendeley Data repository.
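To make the indirect approach described above more concrete, here is a minimal pure-Python sketch of a k-nearest-neighbors classifier over temperature, humidity, and pressure readings. The readings, labels, scaling choice, and function names are invented for illustration; they are not the thesis's dataset, features, or pipeline.

```python
from collections import Counter
from math import dist


def make_min_max_scaler(rows):
    """Return a function that rescales each feature to [0, 1] based on `rows`."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # avoid division by zero

    def scale(row):
        return tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))

    return scale


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to `query`."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up readings: (temperature in °C, relative humidity in %, pressure in hPa).
readings = [
    (21.0, 40.0, 1012.0), (21.2, 41.0, 1012.1), (21.1, 40.5, 1011.9),
    (23.5, 48.0, 1012.3), (23.8, 49.0, 1012.2), (23.6, 47.5, 1012.4),
]
occupancy = ["low", "low", "low", "high", "high", "high"]

scale = make_min_max_scaler(readings)
scaled_readings = [scale(row) for row in readings]
print(knn_predict(scaled_readings, occupancy, scale((23.4, 47.0, 1012.2)), k=3))
```

Scaling matters here because the three variables live on very different numeric ranges, and an unscaled Euclidean distance would be dominated by whichever feature has the largest spread.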
A Scientometric Study of the Impact of Mexican Institutions in the Period 2007-2016
(Instituto Tecnológico y de Estudios Superiores de Monterrey, 2020-05) Méndez López, Diana Carolina; Ceballos Cancino, Héctor Gibrán; 223871; Ceballos Cancino, Héctor Gibrán; emipsanchez; Garza Villarreal, Sara Elena; Juárez Ibarra, Erika Alejandra; Hernández Gress, Neil; Galeano Sánchez, Nathalíe María; Escuela de Ingeniería y Ciencias; Campus Monterrey; Cantú Ortiz, Francisco Javier
In science management contexts, the study of the impact of different research outputs often involves a combination of bibliometric and scientometric indicators, as well as peer review and expert opinion. Scientometric indicators are effective for quantitative analysis, as they provide quick insight into the current and past situation at any organizational level. However, no indicator fits every decision, as many factors can affect the impact of an entity (e.g., different publication and citation patterns among disciplines or types of publications), and not all indicators are normalized to account for them. The present study focuses on the analysis of a set of indicators and their relation with a proposed field-normalized indicator, known as the Field-Weighted Citation Impact for Mexican Institutions (FWCIMX), using a panel data model. The proposed indicator is based on the formula for the FWCI developed by Scopus; the main differences involve ignoring the document type and using data that only contemplates the citations made to Mexican documents, so it only compares Mexican institutions against other Mexican institutions. Two models are proposed: the first analyzes the relation between the citations per region and the FWCIMX, while the second analyzes the collaborations per region. Both models include other indicators, and they presented different results when tested in three-year time windows. This research was performed with the intention of helping researchers and research institutions understand the relation, either positive or negative, that certain indicators have with the behavior of the FWCIMX, which has been designed to compare the impact of the most relevant Mexican institutions, regardless of the disciplines in which each of them is more prominent.
The data used to achieve these results was collected from Scopus and covers a ten-year period from 2007 to 2016. Two batches of experiments were run. The first batch revealed that publishing in journals and trade journals had positive effects on the FWCIMX, as did receiving citations from Oceania and collaborating with South America and Oceania. In contrast, indicators with negative effects on the FWCIMX include publishing in Open Access periodicals, receiving citations from Europe, and collaborating with Europe. The second batch of experiments revealed that Mexico is the main source of citations for the publications of the institutions analyzed in this study. The results also suggested that international collaborations and publishing in journals have negative effects. This batch also revealed that publishing in open access periodicals has a negative effect on the proposed metric. Not so surprisingly, publishing in periodicals in quartiles Q1 and Q2 has a positive effect. During the computation of the metrics and the FWCIMX itself, other interesting discoveries were obtained, such as the identification of the averaged expected citations per area and year with respect to the production in Mexico, and the ranking of Mexican institutions with regard to their averaged FWCIMXs.
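The abstract above describes the FWCIMX only in words. The sketch below assumes the usual field-weighted form, in which each document's citations are divided by the average citations of comparable documents, and follows the abstract in building that baseline per subject area and year while ignoring document type. All names, fields, institutions, and counts are invented, and details such as the citation window or the thesis's exact formula are not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def expected_citations(documents):
    """Average citations per (field, year) over the whole reference set."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[(doc["field"], doc["year"])].append(doc["citations"])
    return {key: mean(values) for key, values in buckets.items()}


def field_weighted_impact(documents, institution):
    """Mean ratio of actual to expected citations for one institution's documents."""
    expected = expected_citations(documents)
    ratios = []
    for doc in documents:
        if doc["institution"] != institution:
            continue
        baseline = expected[(doc["field"], doc["year"])]
        if baseline > 0:  # skip groups where no document was cited at all
            ratios.append(doc["citations"] / baseline)
    return mean(ratios) if ratios else 0.0


# Hypothetical reference set limited to documents from two made-up institutions.
docs = [
    {"institution": "A", "field": "physics", "year": 2010, "citations": 6},
    {"institution": "B", "field": "physics", "year": 2010, "citations": 2},
    {"institution": "A", "field": "medicine", "year": 2010, "citations": 30},
    {"institution": "B", "field": "medicine", "year": 2010, "citations": 10},
]
print(field_weighted_impact(docs, "A"), field_weighted_impact(docs, "B"))
```

In this toy data, medicine documents attract five times as many citations as physics documents, yet each institution receives the same ratio in both fields, which is the kind of discipline-independent comparison the abstract says the indicator was designed for.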
