An Empirical Study of Instruction-tuning Large Language Models in Chinese

Bibliographic Details
Title: An Empirical Study of Instruction-tuning Large Language Models in Chinese
Authors: Si, Qingyi; Wang, Tong; Lin, Zheng; Zhang, Xu; Cao, Yanan; Wang, Weiping
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Description: The success of ChatGPT validates the potential of large language models (LLMs) in artificial general intelligence (AGI). Subsequently, the release of LLMs has sparked the open-source community's interest in instruction-tuning, which is deemed to accelerate ChatGPT's replication process. However, research on instruction-tuning LLMs in Chinese, the world's most spoken language, is still in its early stages. Therefore, this paper makes an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook that provides valuable findings for effectively customizing LLMs that can better respond to Chinese instructions. Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types, which are the three most important elements for instruction-tuning. In addition, we conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment. We hope that this empirical study can make a modest contribution to the open Chinese version of ChatGPT. This paper will release a powerful Chinese LLM that is comparable to ChatGLM. The code and data are available at https://github.com/PhoebusSi/Alpaca-CoT.
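For readers unfamiliar with the parameter-efficient methods the abstract refers to, the following is a minimal, hypothetical sketch (not taken from the paper or the Alpaca-CoT repository) of how LoRA-style adapters are commonly attached to an LLM base before instruction-tuning; the model name, target module names, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only: parameter-efficient instruction-tuning setup with LoRA,
# using the Hugging Face transformers and peft libraries. All names and values below
# are assumptions for demonstration, not the paper's actual configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model = "decapoda-research/llama-7b-hf"   # hypothetical choice of LLM base
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

# LoRA injects small trainable low-rank matrices into selected projection layers,
# so only a tiny fraction of parameters is updated during instruction-tuning.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],       # assumption: LLaMA-style module names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()             # typically well under 1% of the full model
```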
Comment: EMNLP 2023
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2310.07328
Accession Number: edsarx.2310.07328
Database: arXiv