Teaching Specific Scientific Knowledge into Large Language Models through Additional Training

Bibliographic Details
Title: Teaching Specific Scientific Knowledge into Large Language Models through Additional Training
Authors: Hatakeyama-Sato, Kan; Igarashi, Yasuhiko; Katakami, Shun; Nabae, Yuta; Hayakawa, Teruaki
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Description: Through additional training, we explore embedding specialized scientific knowledge into the Llama 2 Large Language Model (LLM). Key findings reveal that effective knowledge integration requires reading texts from multiple perspectives, especially in instructional formats. We utilize text augmentation, including style conversions and translations, to tackle the scarcity of specialized texts. Hyperparameter optimization proves crucial, and models of different sizes (7b, 13b, and 70b) undergo additional training reasonably well. To validate our methods, we construct a dataset of 65,000 scientific papers. Although we have succeeded in partially embedding knowledge, the study highlights the complexities and limitations of incorporating specialized information into LLMs, suggesting areas for further improvement.
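
Illustrative sketch (not part of the record): the "additional training" described above amounts to continued causal-language-model pretraining on specialized text. The minimal Python sketch below assumes the Hugging Face transformers and datasets libraries and a hypothetical corpus file specialized_corpus.txt; the paper does not specify its training stack, and the model choice and hyperparameters here are placeholders, not the authors' settings.

    # Minimal sketch of additional training (continued causal-LM pretraining),
    # assuming Hugging Face transformers/datasets; corpus path, model size,
    # and hyperparameters are illustrative, not taken from the paper.
    from datasets import load_dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    model_name = "meta-llama/Llama-2-7b-hf"  # 13b/70b variants swap in here
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Hypothetical plain-text corpus of specialized scientific writing,
    # one document per line.
    dataset = load_dataset("text", data_files={"train": "specialized_corpus.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=1024)

    train_set = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="llama2-additional-training",
            num_train_epochs=1,              # illustrative; the paper tunes these
            per_device_train_batch_size=1,
            learning_rate=2e-5,
            bf16=True,
        ),
        train_dataset=train_set,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

The text augmentation the abstract mentions (style conversions and translations) would be applied to the corpus before this step, multiplying the perspectives from which each specialized text is read.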
Comment: added token information for some texts, and fixed typo
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2312.03360
Accession Number: edsarx.2312.03360
Database: arXiv