The present study investigated students’ preferences for the types of tasks used to assess English speaking performance. It further examined whether students’ task type preferences affected their perceptions of test effectiveness. One hundred eighty-two high school students responded to a self-report questionnaire. Frequency analyses and a series of paired-samples t-tests were used to analyze the data. The results showed that students’ most preferred task types overlapped with their least preferred ones, suggesting that the task types used in school English speaking performance tests are limited. Four key reasons determining students’ task type preferences were identified: task difficulty, emotional comfort, practical value, and interest. In addition, the results indicated that students’ task type preferences could affect their perceptions of task effectiveness. Overall, the results suggest the need to develop more varied task types for English speaking performance tests and to help students become familiar with English speaking performance tasks. Pedagogical implications are discussed along with the limitations of the study.
The purpose of this study was to investigate inter- and intra-rater reliability in an interview and a computerized oral test. It also examined whether rater characteristics influenced rater reliability and bias. Finally, the scores on both tests were compared with scores on the Versant test, which uses an automated computer rating system. Data were collected from 21 Korean university students and 18 raters with various characteristics, who were either Korean or native speakers of English. The main findings were as follows. First, rater severity differed significantly in each test, but each rater graded consistently across both tests, suggesting lower inter-rater reliability and higher intra-rater reliability. Second, rater severity was affected by rater characteristics such as mother tongue, gender, age, and major. Lastly, the scores on the three tests were positively correlated, indicating that human and computer ratings are strongly related.