This study traces the development of English textbooks in North Korea through a corpus-based analysis of the differences between materials produced under the Kim Jong-il and Kim Jong-un regimes. Against a backdrop of educational reform and shifting political ideology, it examines BNC/COCA-based lexical coverage and the key lexical features of North Korean middle school English textbooks, focusing on the complexity, vocabulary, and readability of the learning materials. The findings indicate that the Kim Jong-un regime implemented reforms to improve English language education, resulting in greater lexical diversity, textual complexity, and vocabulary exposure. Although lexical coverage did not differ significantly between the two regimes’ textbooks, those of the Kim Jong-un regime showed gains in diversity, readability, and complexity. This study contributes to a broader understanding of the interplay between political ideology and English language education in North Korea, offers insights with implications beyond the North Korean context, and encourages reflection on state-driven educational reform.
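For readers unfamiliar with the measure, lexical coverage is the proportion of running words in a text that appear in a reference word list. The following is a minimal sketch of that idea, not the study's actual procedure: the function name `lexical_coverage`, the tokenizer, and the tiny `toy_list` are illustrative assumptions, and the real BNC/COCA lists group words into families organized in 1,000-family frequency bands rather than a flat set.

```python
import re


def lexical_coverage(text, word_list):
    """Return the share of tokens in `text` that occur in `word_list`.

    `word_list` is a plain set of lowercase word forms (an assumption;
    real BNC/COCA lists are organized by word family and frequency band).
    """
    # Deliberately simple tokenizer: runs of ASCII letters only.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    covered = sum(1 for token in tokens if token in word_list)
    return covered / len(tokens)


# Hypothetical stand-in for a high-frequency word list.
toy_list = {"the", "cat", "on"}
coverage = lexical_coverage("The cat sat on the mat", toy_list)
print(f"coverage: {coverage:.2%}")
```

Under this definition, comparing two sets of textbooks against the same list yields one coverage figure per set, which is the kind of quantity the between-regime comparison refers to.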
In recent years, numerous studies have focused on ‘translationese’ (i.e., the distinctive features of translated texts, which make second language (L2) writing resemble translated texts and differ from native language (L1) writing). This linguistic pattern has motivated scholars to search for markers that predict the divergence between L1 and L2 texts. This study builds on that work by ranking the feature importance of specific translationese markers: standardized type-token ratio (STTR), mean sentence length, bottom-frequency words, connectives, and n-grams. A random forest model was used to compare these markers in L1 and L2 abstracts of academic journal articles, providing a quantitative basis for the comparison. We further examined the consistency of the markers across academic disciplines. Our results indicate that bottom-frequency words are the most reliable marker across disciplines, whereas connectives are the least consistent. We also identified three-word lexical bundles as discipline-specific markers. These findings have implications for, and open new avenues of research into, translationese in L2 writing.
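As an illustration of one of the markers named above, the standardized type-token ratio is commonly computed by splitting a text into equal-sized chunks, taking the type-token ratio of each chunk, and averaging the results. The sketch below follows that common definition; the function names, the simple tokenizer, and the default chunk size of 1,000 tokens are assumptions for illustration and are not taken from the study.

```python
import re


def tokenize(text):
    """Lowercase the text and return word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def sttr(tokens, chunk_size=1000):
    """Mean type-token ratio over consecutive full chunks of `chunk_size` tokens.

    Tokens left over after the last full chunk are ignored, so texts of
    different lengths are compared on the same footing.
    """
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)


# Toy usage with a very small chunk size so the example runs on a short string.
sample = tokenize("the cat saw the dog and the dog saw the cat")
print(sttr(sample, chunk_size=5))
```

Averaging over fixed-size chunks is what distinguishes STTR from the raw type-token ratio, which falls as texts get longer and therefore cannot be compared directly across abstracts of different lengths.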