This paper aims to provide guidelines for developing English language proficiency (ELP) tests based on the experience with ELP assessments in the U.S. after the implementation of No Child Left Behind (NCLB). Although the content and purpose of ELP tests may differ substantially from country to country, there are areas from which experts in charge of ELP test development in other countries can benefit. The NCLB legislation in the U.S. made annual assessment of English language learner (ELL) students’ level of proficiency in English mandatory and provided useful guidelines for developing ELP assessments. This mandate, along with its guidelines, significantly improved the quality of ELP assessment and led to the development of several batteries of ELP assessments, either through consortia of states or by test publishers in the U.S. The newly developed assessments were based on states’ ELP standards, incorporated the concept of academic language, which is an essential requirement for ELL students’ performance in the academic content areas, and were tested in extensive pilot and field studies. Some implications are drawn from these improvements for ELL assessment and accountability not only in the U.S. but also in other countries, including Korea.
Because performance assessment such as a composition test introduces a range of factors that may influence a candidate’s chances of success on the test, those in charge of monitoring quality control for performance assessment programs need to gather information that will help them determine whether all aspects of the programs are working as intended. In the present study, generalizability theory (Brennan, 1992) was employed to examine the relative effects of various sources of variability on students’ performance on an essay writing test and to investigate the reliability of the assigned scores. The results showed that, because the largest effect was associated with the students’ writing ability and the effects associated with facets of measurement such as scoring criteria and ratings were negligible, the generalizability coefficient estimated for the writing test was high, suggesting that the test is a reliable measure of what it purports to measure, namely the students’ writing ability.
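For reference, the generalizability coefficient discussed above is, in Brennan’s framework, the ratio of universe-score variance to universe-score variance plus relative error variance. As a sketch only, assuming a fully crossed persons × criteria × ratings design (the abstract does not specify the exact design), it can be written as

E\rho^2 = \frac{\sigma^2(p)}{\sigma^2(p) + \sigma^2(pc)/n_c + \sigma^2(pr)/n_r + \sigma^2(pcr,e)/(n_c n_r)}

where \sigma^2(p) is the variance component for students’ writing ability, the interaction and residual components reflect the measurement facets, and n_c and n_r are the numbers of scoring criteria and ratings. A large \sigma^2(p) relative to the remaining terms is what yields the high coefficient reported in the study.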
The Computerized Enhanced ESL Placement Test (CEEPT) at a public university in the U.S. reflects a new approach to academic writing assessment in that test takers are given sufficient time to plan, produce, and revise a short academic essay. This study examines the authenticity of the CEEPT and thereby illuminates the potential of computerized process-oriented writing assessment. Authenticity was examined through both logical and empirical analyses. A close examination, using a checklist of test and Target Language Use (TLU) task characteristics, reveals relatively good correspondence between the characteristics of CEEPT tasks and those of TLU tasks, which indicates high authenticity of the CEEPT. Test takers’ responses to the open- and closed-ended items on the CEEPT survey also provide positive evidence in support of the authenticity of the CEEPT. Students perceived a close match between their academic tasks and the CEEPT tasks, and this high authenticity contributed to eliciting their true writing abilities. The CEEPT, as one possible model of process-oriented writing assessment, can provide an alternative to timed, single-draft essay tests. The findings of this study can advance our understanding of writing assessments and may be applicable to the Korean context.
A field-specific essay test was developed in an attempt to improve the ESL placement procedure for international graduate students at the University of Illinois at Urbana-Champaign (UIUC). Graduate departments were classified into four areas (business, humanities/social sciences, technology, and life sciences), and a set of four input prompts and writing questions was developed. A total of 124 volunteers took both the regularly required general-topic test and the field-specific test. A total-group FACETS analysis of the students’ performance on the two tests showed that they performed better on the field-specific test. However, subgroup analyses showed the field-specific topic effect only in the business and life sciences subgroups, while no prompt effect was found for the humanities/social sciences and technology subgroups. Considering that these results had been predicted in a prompt evaluation session early in the test development procedure, they suggest that more effort should be exerted to select the topic and content of prompts carefully in order to secure equivalence of the topic effect across all disciplinary groups. This paper further addresses limitations and promising research directions.
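For readers unfamiliar with FACETS, the analyses above rest on a many-facet Rasch model. As a sketch only, assuming facets for examinees, prompts, and raters (the abstract does not state the exact facet structure), the model can be written as

\log \frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k

where B_n is examinee ability, D_i prompt difficulty, C_j rater severity, and F_k the threshold for rating category k. Under this framing, the topic effect examined in the study would surface as differences in prompt difficulty estimates, or in examinee measures, across the field-specific and general-topic tests.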
The present study aims to review current thinking in the field of language testing and assessment, discuss limitations of language testing research in Korean academia, and suggest future directions for the research agenda in the language testing field. The findings of the study indicated that research on language testing in Korean academia over the last ten years focused on test washback and on two high-stakes English tests, the College Scholastic Aptitude Test (CSAT) and the newly developed National English Ability Test (NEAT). In particular, studies on the CSAT were largely limited to item analysis and related teaching strategies, despite the test’s symbolic significance in the Korean testing field. Research on the NEAT, however, appears to cover a broader spectrum, even though when it will be implemented remains unclear amid many political and social complications. Based on the review, the present study suggested that the research agenda of future studies needs to focus on the following areas: test validation, non-credential tests of English, classroom-based assessment, critical perspectives on the uses of English testing, and sociocultural perspectives on testing in Korean society.