Bibliographic Details
Title:
Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Authors:
Anwar, Usman; Saparov, Abulhair; Rando, Javier; Paleka, Daniel; Turpin, Miles; Hase, Peter; Lubana, Ekdeep Singh; Jenner, Erik; Casper, Stephen; Sourbut, Oliver; Edelman, Benjamin L.; Zhang, Zhaowei; Günther, Mario; Korinek, Anton; Hernandez-Orallo, Jose; Hammond, Lewis; Bigelow, Eric; Pan, Alexander; Langosco, Lauro; Korbak, Tomasz; Zhang, Heidi; Zhong, Ruiqi; hÉigeartaigh, Seán Ó; Recchia, Gabriel; Corsi, Giulio; Chan, Alan; Anderljung, Markus; Edwards, Lilian; Bengio, Yoshua; Chen, Danqi; Albanie, Samuel; Maharaj, Tegan; Foerster, Jakob; Tramer, Florian; He, He; Kasirzadeh, Atoosa; Choi, Yejin; Krueger, David
Publication Year:
2024
Collection:
Computer Science
Subject Terms:
Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Computers and Society
Description:
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.
Document Type:
Working Paper
Access URL:
http://arxiv.org/abs/2404.09932
Accession Number:
edsarx.2404.09932
Database:
arXiv