Report
Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
Title: | Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text |
---|---|
Authors: | Mitra, Avijit; Druhl, Emily; Goodwin, Raelene; Yu, Hong |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computation and Language |
Description: | Social and behavioral determinants of health (SBDH) play a crucial role in health outcomes and are frequently documented in clinical text. Automatically extracting SBDH information from clinical text relies on publicly available good-quality datasets. However, existing SBDH datasets exhibit substantial limitations in their availability and coverage. In this study, we introduce Synth-SBDH, a novel synthetic dataset with detailed SBDH annotations, encompassing status, temporal information, and rationale across 15 SBDH categories. We showcase the utility of Synth-SBDH on three tasks using real-world clinical datasets from two distinct hospital settings, highlighting its versatility, generalizability, and distillation capabilities. Models trained on Synth-SBDH consistently outperform counterparts with no Synth-SBDH training, achieving up to 62.5% macro-F improvements. Additionally, Synth-SBDH proves effective for rare SBDH categories and under resource constraints. Human evaluation demonstrates a Human-LLM alignment of 71.06% and uncovers areas for future refinements. Comment: GitHub: https://github.com/avipartho/Synth-SBDH |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2406.06056 |
Accession Number: | edsarx.2406.06056 |
Database: | arXiv |