Interpretable deep learning framework for predicting cordycepin production in Cordyceps militaris cultivated on Pinus densiflora sawdust (KCI-listed)

  • Language: ENG
  • URL: https://db.koreascholar.com/Article/Detail/447498
Journal of Mushrooms (J. Mushrooms)
The Korean Society of Mushroom Science
Abstract

Cordycepin is the principal bioactive compound produced by Cordyceps militaris and exhibits diverse pharmacological properties. However, cordycepin production is highly sensitive to cultivation conditions, which leads to substantial variability in yield and complicates process optimization. This study established an interpretable machine learning framework to predict the cordycepin produced by C. militaris cultivated on Pinus densiflora sawdust. Three key cultivation parameters (input weight, growth weight, and particle size) were quantified using submerged mycelial culture, and the cordycepin content was measured via high-performance liquid chromatography. Four predictive models (random forest, support vector machine, XGBoost, and artificial neural network) were optimized through a randomized hyperparameter search and evaluated using internal validation and Tropsha’s external quantitative structure-activity relationship (QSAR) criteria. XGBoost achieved the best validation accuracy (root mean square error = 42.67 μg/mL), whereas random forest showed the most reliable external performance (R² = 0.898). Shapley additive explanations (SHAP) revealed that input weight most strongly influenced cordycepin production, followed by growth weight and particle size, with distinct nonlinear and interaction-driven effects among the cultivation variables. Kernel density and dependence analyses confirmed the occurrence of multimodal production regimes associated with the substrate loading and particle size characteristics. Finally, the best-performing model was deployed through a Streamlit-based graphical user interface, enabling the real-time prediction of cordycepin concentration with a 95% confidence interval. The results collectively demonstrate the utility of interpretable AI-driven modeling for unveiling complex biological responses, providing a practical decision-support tool for optimizing cordycepin production in fungal biotechnologies.
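
The abstract names two evaluation measures, root mean square error and Tropsha's external validation criteria, without spelling them out. The sketch below shows one common formulation of those external checks in plain Python. It is an illustration only and not the authors' code: the function names, the toy numbers, and the exact regression-through-origin convention are assumptions (papers differ on which axis is regressed), and the internal cross-validated q² criterion is not covered here.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def tropsha_external_checks(observed, predicted):
    """Return (r2, checks) for an external test set.

    Assumed formulation: thresholds as commonly cited for the
    Golbraikh-Tropsha criteria. Inputs must not be constant, otherwise
    the variance terms below are zero.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))

    # Squared Pearson correlation between observed and predicted values.
    r2 = cov ** 2 / (ss_o * ss_p)

    # Slopes of the regression lines forced through the origin.
    sum_op = sum(o * p for o, p in zip(observed, predicted))
    k = sum_op / sum(p * p for p in predicted)        # observed vs. predicted
    k_prime = sum_op / sum(o * o for o in observed)   # predicted vs. observed

    # Coefficients of determination for the through-origin fits.
    r0_sq = 1 - sum((o - k * p) ** 2 for o, p in zip(observed, predicted)) / ss_o
    r0_prime_sq = 1 - sum(
        (p - k_prime * o) ** 2 for o, p in zip(observed, predicted)
    ) / ss_p

    checks = {
        "r2_above_0.6": r2 > 0.6,
        "r0_close_to_r2": (r2 - r0_sq) / r2 < 0.1
        or (r2 - r0_prime_sq) / r2 < 0.1,
        "slope_near_1": 0.85 <= k <= 1.15 or 0.85 <= k_prime <= 1.15,
        "r0_pair_consistent": abs(r0_sq - r0_prime_sq) < 0.3,
    }
    return r2, checks


if __name__ == "__main__":
    # Made-up concentrations (ug/mL) for illustration; not data from the study.
    observed = [120.0, 250.0, 310.0, 480.0]
    predicted = [135.0, 240.0, 330.0, 455.0]
    r2, checks = tropsha_external_checks(observed, predicted)
    print("RMSE:", rmse(observed, predicted))
    print("R2:", r2)
    print("Checks:", checks)
```

A model would typically be regarded as externally predictive only when every entry in `checks` is true, which is why a high R² on its own is not sufficient under this scheme.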

Table of Contents
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
    Preparation of submerged culture media
    Determination of mycelium dry weight
    Determination of cordycepin
    Dataset preparation
    Data preprocessing
    Model training procedures
    Hyperparameter optimization
    SHAP-based model interpretation
    Model evaluation and validation
    Graphical user interface (GUI) implementation
RESULTS AND DISCUSSION
    Descriptor screening using correlation analysis
    Dependence on individual variables
    Influence of cultivation parameters on cordycepin content
    Training dynamics and performance convergence of predictive models
    Hyperparameter optimization and model interpretation using SHAP analysis
    Impact of the descriptor on the output of the model (SHAP value)
    QSAR model validation and evaluation
    GUI development for cordycepin content
ACKNOWLEDGEMENTS
REFERENCES
Authors
  • Si Young Ha(Department of Environmental Materials Science, Institute of Agriculture & Life Science, Gyeongsang National University, Jinju 52828, Republic of Korea)
  • Hyeon Cheol Kim(Department of Environmental Materials Science, Institute of Agriculture & Life Science, Gyeongsang National University, Jinju 52828, Republic of Korea)
  • Jae-Kyung Yang (Department of Environmental Materials Science, Institute of Agriculture & Life Science, Gyeongsang National University, Jinju 52828, Republic of Korea) (Corresponding author)