2. 2023.11
A Nuclear Material Accountancy (NMA) system quantitatively evaluates whether nuclear material has been diverted. Under such a system, the material balance is evaluated from nuclear material measurements, and these evaluations rest on statistical techniques, so the system's performance can be assessed with modeling and simulation from the development stage. In a performance evaluation, several diversion scenarios are established, nuclear material diversion is attempted in a virtual simulation environment according to those scenarios, and the detection probability is evaluated. It is therefore important to identify vulnerable diversion scenarios in advance. In actual facilities, however, deriving weak scenarios manually is not easy because numerous factors affect detection performance.

In this study, reinforcement learning was applied to automatically derive vulnerable diversion scenarios from a virtual NMA system. Reinforcement learning trains an agent to take optimal actions in a virtual environment; on this basis, an agent can be developed that attempts to divert nuclear material according to the weakest scenario of the NMA system. A fairly simple NMA system model was considered to confirm the applicability of reinforcement learning. The model performs 10 consecutive material balance evaluations per year, and its MUF (material unaccounted for) uncertainty increases with the balance period. The expected vulnerable diversion scenario is therefore one in which the amount of diverted nuclear material increases in proportion to the MUF uncertainty, and the total amount of diverted material was assumed to be 8 kg, corresponding to one significant quantity of plutonium. To apply reinforcement learning, the virtual NMA system model (the environment) and a divertor attempting to divert nuclear material (the agent) were modeled. The agent receives a negative reward whenever a diversion attempt is detected by the NMA system. Reinforcement learning automatically trains the agent to maximize its reward, and through this the weakest diversion scenario can be derived.

The results confirmed that the agent learned to divert nuclear material in the direction of low detection probability in this system model, showing that weak scenarios can indeed be derived with reinforcement learning. The technique considered in this study suggests a way to derive, and then remedy, weak diversion scenarios of an NMA system in advance. Several issues remain to be resolved before this technology can be applied smoothly, however, and further research will be needed.
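The setup described in the abstract (a 10-period balance year, MUF uncertainty that grows with the period index, a divertor agent penalized on detection, and an 8 kg total diversion target) can be illustrated as a toy reinforcement-learning problem. The sketch below is not the paper's model: the per-period sigma values, the discrete action set, the simple 3-sigma per-period alarm test, and the tabular Q-learning agent are all assumptions introduced here purely for illustration.

# Minimal illustrative sketch (assumptions noted above): a toy NMA "environment"
# with 10 balance periods whose MUF measurement uncertainty grows with the period
# index, and a tabular Q-learning "divertor" agent penalized whenever a simple
# statistical test flags its diversion.
import numpy as np

rng = np.random.default_rng(0)

N_PERIODS = 10                 # material balance evaluations per year (from the abstract)
TARGET_KG = 8.0                # one significant quantity of Pu (from the abstract)
ACTIONS_KG = np.array([0.0, 0.4, 0.8, 1.2, 1.6])   # per-period diversion options (assumed)
SIGMA = 0.2 + 0.08 * np.arange(N_PERIODS)          # MUF sigma grows with period (assumed)
ALARM_K = 3.0                  # alarm if |MUF| > 3 sigma (assumed one-period test)

def step(period, divert_kg):
    """Simulate one balance period; return (reward, detected)."""
    muf = divert_kg + rng.normal(0.0, SIGMA[period])  # measured MUF = true loss + noise
    detected = abs(muf) > ALARM_K * SIGMA[period]
    reward = -10.0 if detected else divert_kg          # penalize detection, reward diverted mass
    return reward, detected

# Tabular Q-learning over (period, discretized cumulative diversion) states.
N_BINS = 17                                            # 0.5 kg bins up to the 8 kg target
Q = np.zeros((N_PERIODS, N_BINS, len(ACTIONS_KG)))
alpha, gamma, eps = 0.1, 0.99, 0.2

def bin_of(total_kg):
    return min(int(total_kg / 0.5), N_BINS - 1)

for episode in range(20000):
    total = 0.0
    for t in range(N_PERIODS):
        s = bin_of(total)
        a = rng.integers(len(ACTIONS_KG)) if rng.random() < eps else int(np.argmax(Q[t, s]))
        divert = min(ACTIONS_KG[a], TARGET_KG - total)  # never divert past the 8 kg target
        reward, detected = step(t, divert)
        total += divert
        s2 = bin_of(total)
        target = reward if (detected or t == N_PERIODS - 1) else reward + gamma * np.max(Q[t + 1, s2])
        Q[t, s, a] += alpha * (target - Q[t, s, a])
        if detected:
            break                                       # episode ends once the NMA system raises an alarm

# Greedy policy after training; the expectation is that the per-period diversion
# roughly tracks SIGMA, i.e. the agent diverts more when the uncertainty is larger.
total = 0.0
plan = []
for t in range(N_PERIODS):
    a = int(np.argmax(Q[t, bin_of(total)]))
    divert = min(ACTIONS_KG[a], TARGET_KG - total)
    plan.append(round(float(divert), 2))
    total += divert
print("learned per-period diversion plan (kg):", plan)

Under these assumptions, the printed plan is the "weak scenario" the agent has found for this toy detection test; the paper's actual environment, detection statistics, and reward design may differ.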