The primary purposes of this study are to identify the characteristics of a rater training program and to develop an efficient training model at the University of Illinois at Urbana-Champaign. To these ends, this study proposes that a rater training program should be standardized through systematic changes that address multiple aspects of the program. This study utilized a modified version of Lynch’s (1996, 2003) program evaluation model to collect evidence from different sources, including data drawn from the entire evaluation process, from needs analysis to a feedback system based on the final product of the evaluation. Mixed methods were proposed for the data analysis: quantitative analysis of the surveys and the rating corpus, and qualitative and document analysis of relevant training materials and workshop observations, as well as examination of the degree of change in raters’ perceptions. The results of this study provide educational implications for language testing. A salient value of this study is its collaboration with stakeholders in an operational test administration context: raters’ concerns and challenges were clearly identified, shared, and resolved together with the practitioners.