Exploring Text Vectorization on the Polarity Detection of Spanish Comments
Conference Paper
Overview
abstract
The widespread use of the internet and social media has generated a vast amount of information on a wide range of topics in the form of user comments and opinions. This information can itself be a subject of analysis and research: for example, comments about products, services, or government entities can be used to understand how the public perceives them. Because of the immense volume of data generated, various techniques have been developed for automatic opinion recognition, also known as opinion mining. However, progress for the Spanish language is still limited, leaving room for exploration in tasks such as text vectorization for determining opinion polarity through supervised classification. This paper presents a strategy for Spanish opinion mining that identifies the sentiment expressed in comments posted on the social networking service X (formerly Twitter) and classifies them by polarity (positive, negative, or neutral). To this end, we propose a hybrid text vectorization approach that combines traditional vectorization techniques with grammatical part-of-speech (POS) tags. In the experiments, various vectorization techniques were evaluated as input to supervised and deep learning algorithms. The results show that hybrid vectorization improves classification performance in some cases compared with traditional vectorization models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
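As a rough illustration of what a hybrid vectorization of this kind might look like, the sketch below concatenates a bag-of-words count vector with a vector of POS tag counts. It is not the authors' implementation: the abstract does not name the specific vectorizers or tagger used, so the toy lexicon `TOY_POS`, the tag list `POS_TAGS`, and the helper functions are all hypothetical stand-ins chosen only to keep the example self-contained.

```python
from collections import Counter

# Hypothetical toy POS lexicon (assumption for illustration only);
# a real pipeline would use a trained Spanish POS tagger instead.
TOY_POS = {
    "me": "PRON", "encanta": "VERB", "este": "DET", "producto": "NOUN",
    "el": "DET", "servicio": "NOUN", "es": "VERB", "muy": "ADV", "malo": "ADJ",
}
# Fixed tag order so every comment maps to the same POS dimensions.
# "X" collects tokens that are missing from the toy lexicon.
POS_TAGS = ["ADJ", "ADV", "DET", "NOUN", "PRON", "VERB", "X"]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(comments):
    """Collect the sorted set of tokens seen in the training comments."""
    return sorted({tok for comment in comments for tok in tokenize(comment)})


def hybrid_vector(text, vocabulary):
    """Concatenate word counts (lexical part) with POS tag counts (grammatical part)."""
    tokens = tokenize(text)
    word_counts = Counter(tokens)
    tag_counts = Counter(TOY_POS.get(tok, "X") for tok in tokens)
    lexical = [word_counts[word] for word in vocabulary]
    grammatical = [tag_counts[tag] for tag in POS_TAGS]
    return lexical + grammatical


comments = ["Me encanta este producto", "El servicio es muy malo"]
vocabulary = build_vocabulary(comments)
vectors = [hybrid_vector(comment, vocabulary) for comment in comments]
```

Each resulting vector has one dimension per vocabulary word followed by one per POS tag, and could be passed to any supervised classifier together with polarity labels. Replacing the raw counts with TF-IDF weights or word embeddings would follow the same concatenation pattern.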
Research
keywords
Deep Learning; Machine Learning; Natural Language Processing; Opinion Mining; Text Vectorization; Word Embeddings; Adversarial machine learning; Contrastive Learning; Natural language processing systems; Self-supervised learning; Supervised learning; Embeddings; Language processing; Natural languages; Vectorization