Academic Journal

Scoring with the Computer: Alternative Procedures for Improving the Reliability of Holistic Essay Scoring

Bibliographic Details
Title: Scoring with the Computer: Alternative Procedures for Improving the Reliability of Holistic Essay Scoring
Language: English
Authors: Attali, Yigal; Lewis, Will; Steier, Michael
Source: Language Testing. Jan 2013 30(1):125-141.
Availability: SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: http://sagepub.com
Peer Reviewed: Y
Page Count: 17
Publication Date: 2013
Document Type: Journal Articles
Reports - Evaluative
Education Level: Higher Education
Postsecondary Education
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests, College Entrance Examinations, Scoring Rubrics, Interrater Reliability, Automation, Correlation
Assessment and Survey Identifiers: Graduate Record Examinations
DOI: 10.1177/0265532212452396
ISSN: 0265-5322
Abstract: Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This study experimentally evaluated several alternative procedures for eliciting distinct human scores and improving their reliability. Essays written in response to the argument and issue tasks of the Analytical Writing measure of the GRE General Test were scored by experienced raters under different conditions. Criteria for evaluation included inter-rater agreement, agreement with machine scores, and cross-task reliability. First, the use of a modified scoring rubric that focused on higher-order writing skills increased the reliability for one type of task but decreased it for another. Second, scoring in batches of similar length essays did not have any effect on scores. Third, scoring with available automated essay scores increased reliability of human scores, but also increased their similarity with automated scores. Finally, the use of a more refined 18-point scoring scale significantly increased reliability. (Contains 6 tables, 2 figures and 1 note.)
Abstractor: As Provided
Number of References: 34
Entry Date: 2013
Accession Number: EJ1005785
Database: ERIC