This study evaluated the homogeneity and stability of the standard samples used for domestic indoor air quality proficiency testing (formaldehyde, benzene, toluene, ethylbenzene, p-xylene, styrene, and TVOC). The procedures and statistical analysis methods specified in ISO/IEC 13528 (2009) and KS A ISO Guide 35 (2005) were used for the evaluation. Homogeneity was evaluated by statistically analyzing the concentration data from repeated measurements at each of the 11 ports and the differences between the 11 ports. The coefficient of variation (CV) ranged from 1.9% to 5.9%, and the difference between ports was not significant, meeting the statistical criterion specified in KS Q ISO 13528. Stability was evaluated from the change in concentration of standard samples stored for 90 days. The CV ranged from 2.6% to 9.0%, indicating some change in the concentration of the long-term stored samples; however, the results satisfied the statistical criterion specified in KS A ISO Guide 35. Overall, no significant difference was found either between ports (homogeneity) or over long-term storage (stability). The method is therefore considered appropriate for supplying standard samples in indoor air quality proficiency tests.
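As an illustration of the kind of homogeneity check described above, the following minimal sketch computes a per-port CV and a between-port standard deviation from a one-way ANOVA. It is not the authors' procedure: the function names, the equal-replicates assumption, the example concentrations, and the value of `sigma_pt` are all hypothetical, and the 0.3 factor is the commonly cited ISO 13528 sufficiency check, which the abstract does not confirm was the criterion used.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def between_port_sd(ports):
    """Between-port SD from a one-way ANOVA with equal replicates per port.

    s_s^2 = (MS_between - MS_within) / n, floored at zero.
    """
    n = len(ports[0])  # replicates per port
    if n < 2 or any(len(p) != n for p in ports):
        raise ValueError("this sketch assumes >= 2 equal replicates per port")
    port_means = [mean(p) for p in ports]
    ms_between = n * stdev(port_means) ** 2            # n * variance of port means
    ms_within = mean([stdev(p) ** 2 for p in ports])   # pooled within-port variance
    return max(0.0, (ms_between - ms_within) / n) ** 0.5


def is_sufficiently_homogeneous(ports, sigma_pt, factor=0.3):
    """Assumed criterion: s_s <= factor * sigma_pt (factor 0.3 is an assumption)."""
    return between_port_sd(ports) <= factor * sigma_pt


if __name__ == "__main__":
    # Hypothetical concentrations: 3 ports x 2 replicates (the study used 11 ports).
    ports = [[100.0, 102.0], [98.0, 101.0], [103.0, 99.0]]
    for i, p in enumerate(ports, start=1):
        print(f"port {i}: CV = {cv_percent(p):.1f}%")
    print("between-port SD:", between_port_sd(ports))
    print("homogeneous:", is_sufficiently_homogeneous(ports, sigma_pt=10.0))
```

When the between-port mean square is smaller than the within-port mean square, the estimate is floored at zero, which is read as no detectable between-port variation.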
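In 2011, a proficiency test for the air dilution olfactory method was conducted for 79 odor inspection agencies nationwide. Two synthetic complex odors, simulating the emission limits at the site boundary and at the odor emission source under the Malodor Prevention Act, were used as proficiency testing materials (PTMs). The site-boundary sample was a complex odor of 7 ppm toluene and 7 ppm m-xylene, and the outlet sample was a complex odor of 10 ppm DMS (dimethyl sulfide) and 10 ppm DMDS (dimethyl disulfide). The proficiency test results were evaluated as Z-scores, using the mean and the median as assigned values and the ordinary standard deviation, the robust standard deviation, and the coefficient of variation as target standard deviations. The coefficient of variation of the test results decreased as the odor intensity of the PTMs increased. For complex odors, it was more appropriate to evaluate the proficiency test results using the log-scale odor index than the odor dilution ratio. When the Z-scores of the participating agencies for the two PTMs were evaluated using the coefficient of variation, the standard deviation, and the robust standard deviation, 95% of the participating agencies met the proficiency criterion. When the target standard deviation was set at a coefficient of variation of 20%, the proportion of agencies with satisfactory results was good, at 90% and 95% for the site-boundary and outlet PTM samples, respectively. These results indicate that both synthetic PTMs, simulating site-boundary and outlet complex odors, are suitable as proficiency testing materials for complex odor.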
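The Z-score evaluation described above can be sketched as follows. This is only an illustration under stated assumptions: the odor index values are invented, the robust standard deviation is taken as the scaled median absolute deviation (1.483 × MAD), which is one common robust estimator and not necessarily the one used in the study, and the |z| ≤ 2 and |z| ≥ 3 limits are the conventional proficiency-testing thresholds rather than values stated in the abstract.

```python
from statistics import mean, median, stdev


def robust_sd(values):
    """Scaled median absolute deviation (assumed estimator: 1.483 * MAD)."""
    med = median(values)
    return 1.483 * median([abs(v - med) for v in values])


def z_scores(values, assigned, target_sd):
    """z = (x - assigned value) / target standard deviation."""
    return [(v - assigned) / target_sd for v in values]


def classify(z):
    """Conventional limits (assumed): |z| <= 2 satisfactory, |z| >= 3 unsatisfactory."""
    a = abs(z)
    if a <= 2.0:
        return "satisfactory"
    if a < 3.0:
        return "questionable"
    return "unsatisfactory"


if __name__ == "__main__":
    # Hypothetical odor index results reported by six agencies.
    odor_index = [14.0, 15.0, 15.0, 16.0, 17.0, 25.0]

    # Three target-SD options named in the abstract, each paired with an assigned value.
    options = {
        "mean / classical SD": (mean(odor_index), stdev(odor_index)),
        "median / robust SD": (median(odor_index), robust_sd(odor_index)),
        "median / CV of 20%": (median(odor_index), 0.20 * median(odor_index)),
    }
    for name, (assigned, target) in options.items():
        labels = [classify(z) for z in z_scores(odor_index, assigned, target)]
        share = labels.count("satisfactory") / len(labels)
        print(f"{name}: {share:.0%} satisfactory")
```

With an outlier in the data, the classical standard deviation is inflated and tends to mask that outlier, whereas the median and robust standard deviation are less affected, which is one reason for comparing several target standard deviations.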
Since 2011, a proficiency test for the air dilution olfactory method has been conducted in Korea to evaluate the analytical skill of authorized odor inspection agencies. For this purpose, proficiency test materials (PTMs) composed of sulfur compounds were prepared and examined for use as complex malodor samples in the proficiency test. Stability over time and homogeneity between samples were analyzed for a PTM made with 10 ppm of DMS and 10 ppm of DMDS. GC analysis showed that the sample concentration was stable at around 6% RSD over the period from 6 to 48 hr. The dilution number obtained with the air dilution olfactory method was also nearly stable over the same test period, at less than 6% RSD. The reproducibility test across four laboratories gave very similar results, except for one laboratory whose deviation was attributed to the characteristics of its elderly panel.
This paper aims to provide guidelines for developing English language proficiency (ELP) tests based on experience with ELP assessments in the U.S. after the implementation of No Child Left Behind (NCLB). Although the content and purpose of ELP tests may differ substantially from country to country, there are areas from which experts in charge of ELP test development in other countries can benefit. The NCLB legislation in the U.S. made annual assessment of English language learner (ELL) students’ level of proficiency in English mandatory and provided useful guidelines for developing ELP assessments. This mandate, along with its guidelines, helped improve the quality of ELP assessment significantly and led to the development of several batteries of ELP assessments, either through consortia of states or by test publishers in the U.S. The newly developed assessments were based on states’ ELP standards. They incorporated the concept of academic language, which is an essential requirement for ELL students’ performance in the academic content areas, and were tested in extensive pilot and field studies. Implications are drawn from these improvements for ELL assessment and accountability, not only in the U.S. but also in other countries, including Korea.
This article analyzes 11 storytelling samples produced by Korean adult test-takers of an English speaking proficiency test. Telling stories, hearing them, and retelling them is a common way of communicating with others, not only in the classroom but also in everyday life, and storytelling is a task often used in English speaking proficiency tests. Although storytelling is an important part of our lives, there has been little research on its structure and characteristics or on its value in English language education. What storytelling can do for English education in Korea, and more specifically for teaching English speaking, deserves public discussion. This study describes the cyclical structure of storytelling and what it means for English language education in Korea. It illustrates that storytelling task samples from advanced learners of English can be well understood within the framework of storytelling components (Bidell, Hubbard, & Weaver, 1997).
The shortage of competent native English-speaking raters and the inherent problems with intra-rater and inter-rater reliability of the oral proficiency interview (OPI) have precluded the full-fledged implementation of English performance testing, ushering in the computer-based oral proficiency interview (COPI), supported by automatic speech recognition (ASR), as a viable alternative. The plausibility and feasibility of implementing ASR-based COPI have recently been investigated with favorable results, which warrants more sophisticated research on developing test methods that meet the rigorous criteria required of high-stakes language tests. In this respect, employing statistical methods such as correlation analysis, regression analysis, and ANOVA, the present study explores the strengths and limitations of test method facets and seeks to identify valid test methods that maximize the validity and reliability of ASR-based COPI. Within the theoretical framework of the communicative language components to be measured, the statistical findings reveal that some test methods are more effective than others in producing COPI test results with better discriminability and reliability. A survey of students and teachers also suggests favorable attitudes toward using the COPI for in-class evaluation. Both findings strongly corroborate the potential of the COPI in question as a valid performance testing tool for measuring overall communicative competence. The current research is expected not only to shed light on the advancement of performance testing but also to help enhance communicative English teaching.
Serious inherent problems with practicality and with intra-rater and inter-rater reliability overshadow the known positive washback effects of performance assessment in language education. In particular, it has been well documented that inter-rater reliability poses a serious threat to overall test validity, since individual raters necessarily measure performance according to their own subjective severity criteria in language proficiency. However, language testing has witnessed a remarkable series of breakthroughs in performance assessment with the recent advent of the information era. One such breakthrough utilizes state-of-the-art automatic speech recognition (ASR) technology for oral proficiency interviews (OPI). Granting that current forms of ASR technology may not produce results with the reliability needed for high-stakes standardized test administration, they do help in approaching the thorny issues of practicality and inherent human inter-rater subjectivity. Accordingly, this paper investigates the degree to which ASR-based OPI ratings match comparable human-conducted OPI ratings by employing correlational analyses on the basis of degrees of rater severity. Furthermore, this paper explores a method of enhancing the robustness of ASR-based OPI ratings that capitalizes on suprasegmental information by measuring fluency based principally on the length of the test-takers’ response time.
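A correlational comparison of ASR-based and human ratings grouped by rater severity might look like the minimal sketch below. It is purely illustrative: the Pearson coefficient is assumed as the correlation measure, and the rater labels and scores are hypothetical values, not data from the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical scores for the same six test-takers.
    asr_scores = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0]
    human_scores = {
        "lenient rater": [3.0, 3.5, 4.0, 4.5, 5.0, 5.0],
        "severe rater": [1.5, 2.0, 3.0, 3.0, 4.0, 4.5],
    }
    # Correlate the ASR ratings with each human rater separately.
    for rater, scores in human_scores.items():
        print(f"ASR vs {rater}: r = {pearson_r(asr_scores, scores):.3f}")
```

Note that a correlation coefficient reflects only how consistently two sets of ratings rank test-takers; a uniformly severe rater can correlate highly with the ASR scores while still differing from them in absolute level.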