This study examines how native English-speaking (NES) and Korean non-native English-speaking (KNES) teachers assess L2 writing performance. More specifically, it investigates whether these two groups of raters evaluate writing samples differently when using different rating scales (holistic vs. analytic) and different task types (narrative vs. argumentative). Four NES and four KNES raters evaluated 78 narrative and 78 argumentative essays written by Korean EFL university students, using both holistic and analytic rating rubrics. A comparison of the two rater groups showed that their scores differed significantly for both holistic and analytic ratings across both task types. Overall, KNES teachers rated the essays more harshly than their NES counterparts, irrespective of task type and rating scale. Multiple regression analyses of the five analytic sub-criteria revealed that the two rater groups showed similar patterns in assessing argumentative essays, whereas for narrative essays the relative influence of each sub-criterion on overall writing quality differed between the groups. Implications for L2 writing assessment are discussed.
The ultimate goal of writing assessment, and indeed of educational measurement in general, is to ensure that students are evaluated fairly and that their scores do not depend on extraneous variables such as the raters who grade them. There has been surpri