Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets

Bibliographic Details
Title: Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets
Authors: Ghosh, Satanu; Brodnik, Neal R.; Frey, Carolina; Holgate, Collin; Pollock, Tresa M.; Daly, Samantha; Carton, Samuel
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval
Description: We explore the ability of GPT-4 to perform ad-hoc schema-based information extraction from scientific literature. Specifically, we assess whether it can, with a basic prompting approach, replicate two existing materials science datasets, given the manuscripts from which they were originally manually extracted. We employ materials scientists to perform a detailed manual error analysis to assess where the model struggles to faithfully extract the desired information, and draw on their insights to suggest research directions to address this broadly important task.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2406.05348
Accession Number: edsarx.2406.05348
Database: arXiv