랜덤 포레스트 기반 우울증 발현 패턴 도출
Depression is one of the most important psychiatric disorders worldwide. Most depression-related data mining and machine learning studies have been conducted to predict the presence of depression or to derive individual risk factors. However, since depression is caused by a combination of various factors, it is necessary to identify the complex relationship between the factors in order to establish effective anti-depression and management measures. In this study, we propose a methodology for identifying and interpreting patterns of depression expressions using the method of deriving random forest rules, where the random forest rule consists of the condition for the manifestation of the depressive pattern and the prediction result of depression when the condition is met. The analysis was carried out by subdividing into 4 groups in consideration of the different depressive patterns according to gender and age. Depression rules derived by the proposed methodology were validated by comparing them with the results of previous studies. Also, through the AUC comparison test, the depression diagnosis performance of the derived rules was evaluated, and it was not different from the performance of the existing PHQ-9 summing method. The significance of this study can be found in that it enabled the interpretation of the complex relationship between depressive factors beyond the existing studies that focused on prediction and deduction of major factors.