LLM Embedding-Based Disease Network Analysis: Exploring Associations Between Diseases Using Large Language Models
This study presents a novel methodology for analyzing disease relationships from a network perspective using Large Language Model (LLM) embeddings. We constructed a disease network covering 4,489 diseases from the International Classification of Diseases, 11th Revision (ICD-11), using embeddings from OpenAI’s text-embedding-3-small model. Network analysis showed that the disease network exhibits small-world characteristics, with a high clustering coefficient (0.435), and organizes into 16 major communities. Notably, mental health-related diseases showed high centrality in the network, suggesting that mental health conditions play a more central role in disease relationships than previously recognized. We also observed a clear inverse relationship between community size and internal density: larger communities tend to be more sparsely connected internally. Taken together, these patterns suggest that the approach is a promising tool for exploring large-scale disease associations and generating new research hypotheses.
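To make the pipeline summarized above concrete, the following minimal sketch shows how an embedding-based network could be built and measured. It is an illustration under stated assumptions, not the study's implementation: the abstract does not say how edges are defined, so the cosine-similarity threshold, the function names (`build_graph`, `average_clustering`, `degree_centrality`), and the five toy vectors are all hypothetical. In practice the vectors would come from an embedding model such as text-embedding-3-small, one per ICD-11 disease description.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(embeddings, threshold):
    """Link two nodes when their cosine similarity is >= threshold.

    The thresholding rule is an assumption made for illustration; the
    source does not specify how edges were defined.
    """
    graph = {name: set() for name in embeddings}
    for a, b in combinations(embeddings, 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbors in graph.values():
        k = len(neighbors)
        if k < 2:
            continue
        # Count edges that exist among this node's neighbors.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


def degree_centrality(graph):
    """Fraction of the other nodes each node is connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


# Made-up 3-dimensional vectors standing in for real disease embeddings.
toy_embeddings = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.8, 0.2, 0.1],
    "C": [0.7, 0.3, 0.2],
    "D": [0.1, 0.9, 0.1],
    "E": [0.0, 0.8, 0.2],
}

toy_graph = build_graph(toy_embeddings, threshold=0.9)
toy_clustering = average_clustering(toy_graph)
toy_centrality = degree_centrality(toy_graph)

print(toy_clustering)       # 0.6
print(toy_centrality["A"])  # 0.5
```

In this toy graph, A, B, and C form a triangle and D and E form a single edge, so three of the five nodes have a local clustering coefficient of 1 and the average is 0.6. On the full network, the same statistic corresponds to the reported clustering coefficient, and a centrality ranking of this kind is one way the prominence of particular disease groups could be examined.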