Academic Journal

Quantifying Interrater Agreement and Reliability Between Thoracic Pathologists: Paradoxical Behavior of Cohen’s Kappa in the Presence of a High Prevalence of the Histopathologic Feature in Lung Cancer

Bibliographic Details
Title: Quantifying Interrater Agreement and Reliability Between Thoracic Pathologists: Paradoxical Behavior of Cohen’s Kappa in the Presence of a High Prevalence of the Histopathologic Feature in Lung Cancer
Authors: Kay See Tan, PhD, Yi-Chen Yeh, MD, Prasad S. Adusumilli, MD, FACS, William D. Travis, MD
Source: JTO Clinical and Research Reports, Vol 5, Iss 1, p. 100618 (2024)
Publication Information: Elsevier, 2024.
Publication Year: 2024
Collection: LCC:Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Subject Terms: Interobserver coefficient, Reproducibility, Predominant histologic subtypes, Diagnostic accuracy, Performance metrics, Sensitivity and specificity, Neoplasms. Tumors. Oncology. Including cancer and carcinogens, RC254-282
Description: Introduction: Cohen’s kappa is often used to quantify the agreement between two pathologists. Nevertheless, a high prevalence of the feature of interest can lead to seemingly paradoxical results, such as low Cohen’s kappa values despite high “observed agreement.” Here, we investigate Cohen’s kappa using data from histologic subtyping assessment of lung adenocarcinomas and introduce alternative measures that can overcome this “kappa paradox.” Methods: A total of 50 frozen sections from stage I lung adenocarcinomas less than or equal to 3 cm in size were independently reviewed by two pathologists to determine the absence or presence of five histologic patterns (lepidic, papillary, acinar, micropapillary, solid). For each pattern, observed agreement (proportion of cases with concordant “absent” or “present” ratings) and Cohen’s kappa were calculated, along with Gwet’s AC1. Results: The prevalence of any amount of the histologic patterns ranged from 42% (solid) to 97% (acinar). On the basis of Cohen’s kappa, there was substantial agreement for four of the five patterns (lepidic, 0.65; papillary, 0.67; micropapillary, 0.64; solid, 0.61). Acinar had the lowest Cohen’s kappa (0.43, moderate agreement), despite having the highest observed agreement (88%). In contrast, Gwet’s AC1 values were close to or higher than Cohen’s kappa across patterns (lepidic, 0.64; papillary, 0.69; micropapillary, 0.71; solid, 0.73; acinar, 0.85). The proportion of positive versus negative agreement was 93% versus 50% for acinar. Conclusions: Given the dependence of Cohen’s kappa on feature prevalence, interrater agreement studies should include complementary indices such as Gwet’s AC1 and proportions of specific agreement, especially in settings with a high prevalence of the feature of interest.
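The kappa paradox the abstract describes can be illustrated numerically. The 2×2 table below is a hypothetical reconstruction chosen so that the standard formulas reproduce the acinar statistics quoted in the abstract (88% observed agreement, Cohen’s kappa 0.43, Gwet’s AC1 0.85, 93%/50% positive/negative specific agreement); it is not the study’s raw data. A minimal sketch in Python:

```python
# Hypothetical 2x2 table for two raters on n = 50 cases (not the study data):
#   a = both rate "present", b and c = discordant, d = both rate "absent"
a, b, c, d = 41, 3, 3, 3
n = a + b + c + d

# Observed agreement: proportion of concordant ratings
po = (a + d) / n

# Cohen's kappa: chance agreement from the product of each rater's marginals
p1 = (a + b) / n                     # rater 1's "present" proportion
p2 = (a + c) / n                     # rater 2's "present" proportion
pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)
kappa = (po - pe_kappa) / (1 - pe_kappa)

# Gwet's AC1: chance agreement from the mean "present" proportion,
# which stays small when prevalence is very high or very low
pi_mean = (p1 + p2) / 2
pe_ac1 = 2 * pi_mean * (1 - pi_mean)
ac1 = (po - pe_ac1) / (1 - pe_ac1)

# Proportions of specific (positive / negative) agreement
p_pos = 2 * a / (2 * a + b + c)
p_neg = 2 * d / (2 * d + b + c)

print(f"observed={po:.2f} kappa={kappa:.2f} AC1={ac1:.2f} "
      f"pos={p_pos:.2f} neg={p_neg:.2f}")
# → observed=0.88 kappa=0.43 AC1=0.85 pos=0.93 neg=0.50
```

With both marginals near 88% “present,” Cohen’s chance-agreement term is already 0.79, so kappa is pulled down to 0.43 even though the raters agree on 88% of cases; AC1’s chance term (0.21) is not inflated by the skewed prevalence, which is the behavior the authors exploit.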
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2666-3643
Relation: http://www.sciencedirect.com/science/article/pii/S2666364323001613; https://doaj.org/toc/2666-3643
DOI: 10.1016/j.jtocrr.2023.100618
Access URL: https://doaj.org/article/3a401e2fce734f0eb5f26212206efd78
Accession Number: edsdoj.3a401e2fce734f0eb5f26212206efd78
Database: Directory of Open Access Journals