Self-training Strategies for Sentiment Analysis: An Empirical Study

Bibliographic Details
Title:	Self-training Strategies for Sentiment Analysis: An Empirical Study
Authors:	Liu, Haochen, Rallabandi, Sai Krishna, Wu, Yijing, Dakle, Parag Pravin, Raghavan, Preethi
Publication Year:	2023
Collection:	Computer Science
Subject Terms:	Computer Science - Computation and Language
Description:	Sentiment analysis is a crucial task in natural language processing that involves identifying and extracting subjective sentiment from text. Self-training has recently emerged as an economical and efficient technique for developing sentiment analysis models by leveraging a small amount of labeled data and a large amount of unlabeled data. However, given a set of training data, how the data are used to conduct self-training makes a significant difference in the final performance of the model. We refer to this methodology as the self-training strategy. In this paper, we present an empirical study of various self-training strategies for sentiment analysis. First, we investigate the influence of the self-training strategy and hyper-parameters on the performance of traditional small language models (SLMs) in various few-shot settings. Second, we explore the feasibility of leveraging large language models (LLMs) to help self-training. We propose and empirically compare several self-training strategies with the intervention of LLMs. Extensive experiments are conducted on three real-world sentiment analysis datasets. Comment: Accepted by EACL Findings 2024
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2309.08777
Accession Number:	edsarx.2309.08777
Database:	arXiv
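
The description above characterizes self-training as learning from a small labeled set plus a large unlabeled set. As a rough illustration only, the sketch below shows one generic self-training loop with confidence-thresholded pseudo-labels. It is not taken from the paper: the tiny Naive Bayes classifier stands in for the SLMs and LLMs the authors study, and the names `fit`, `predict`, `self_train`, the `threshold` of 0.6, and the toy sentences are all hypothetical choices made for this example.

```python
import math
from collections import Counter

LABELS = ("pos", "neg")  # assumed binary sentiment labels for this toy example


def fit(examples):
    """Fit a tiny Naive Bayes model: per-label word counts and document counts."""
    counts = {label: Counter() for label in LABELS}
    docs = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        docs[label] += 1
    vocab = set().union(*counts.values())
    return counts, docs, vocab


def predict(model, text):
    """Return (best_label, confidence), where confidence is the normalized posterior."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    log_scores = {}
    for label in LABELS:
        total_words = sum(counts[label].values())
        # Add-one smoothing on both the class prior and the word likelihoods.
        score = math.log((docs[label] + 1) / (total_docs + len(LABELS)))
        for word in text.lower().split():
            score += math.log((counts[label][word] + 1) / (total_words + len(vocab) + 1))
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


def self_train(labeled, unlabeled, threshold=0.6, max_rounds=5):
    """Repeatedly pseudo-label confident unlabeled texts and retrain on the grown set."""
    labeled, pool = list(labeled), list(unlabeled)
    model = fit(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for text in pool:
            label, confidence = predict(model, text)
            if confidence >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)
        pool = remaining
        model = fit(labeled)
    return model, labeled, pool


# Toy data (invented for illustration, not from the paper's datasets).
seed = [("great movie", "pos"), ("terrible movie", "neg")]
unlabeled = ["great great acting", "terrible terrible plot", "movie"]
model, labeled, pool = self_train(seed, unlabeled)
```

In this sketch the ambiguous text "movie" never clears the threshold and stays in the unlabeled pool, while the two clearly polarized texts are absorbed as pseudo-labeled training data. Choices such as the threshold, the number of rounds, and whether pseudo-labels are re-evaluated each round are examples of the kind of strategy decisions the description refers to.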
