산경연구논집 (JIDB) Vol. 11 No. 7 (p.29-40)

A Study on the Prediction Model of the Elderly Depression

키워드 :
Machine Learning,Classification,Elderly Depresiion,Random Forest,Decision Tree

목차

Abstract
1. Introduction
2. 선행연구
   2.1. 노인우울증
   2.2. 기계학습
   2.3. 의사결정나무 및 랜덤포레스트
3. 연구방법
   3.1. 데이터 분석 방법 및 절차
4. 데이터 분석 및 예측 모형 추정 결과
   4.1. 데이터 탐색
   4.2. 모형 추정 결과
5. 결론
References

초록

Purpose: In modern society, many urban problems are occurring, such as aging, hollowing out old city centers and polarization within cities. In this study, we intend to apply big data and machine learning methodologies to predict depression symptoms in the elderly population early on, thus contributing to solving the problem of elderly depression. Research design, data and methodology: Machine learning techniques used random forest and analyzed the correlation between CES-D10 and other variables, which are widely used worldwide, to estimate important variables. Dependent variables were set up as two variables that distinguish normal/depression from moderate/severe depression, and a total of 106 independent variables were included, including subjective health conditions, cognitive abilities, and daily life quality surveys, as well as the objective characteristics of the elderly as well as the subjective health, health, employment, household background, income, consumption, assets, subjective expectations, and quality of life surveys. Results: Studies have shown that satisfaction with residential areas and quality of life and cognitive ability scores have important effects in classifying elderly depression, satisfaction with living quality and economic conditions, and number of outpatient care in living areas and clinics have been important variables. In addition, the results of a random forest performance evaluation, the accuracy of classification model that classify whether elderly depression or not was 86.3%, the sensitivity 79.5%, and the specificity 93.3%. And the accuracy of classification model the degree of elderly depression was 86.1%, sensitivity 93.9% and specificity 74.7%. Conclusions: In this study, the important variables of the estimated predictive model were identified using the random forest technique and the study was conducted with a focus on the predictive performance itself. Although there are limitations in research, such as the lack of clear criteria for the classification of depression levels and the failure to reflect variables other than KLoSA data, it is expected that if additional variables are secured in the future and high-performance predictive models are estimated and utilized through various machine learning techniques, it will be able to consider ways to improve the quality of life of senior citizens through early detection of depression and thus help them make public policy decisions.