The Nuclear Export and Import Control System (NEPS) is currently in operation for nuclear export and import control, and various computational systems that support consistent and efficient control are either already in place or under development. Because these systems are scattered, their databases must be integrated and an associated search system must be built on top of the integrated databases to make full use of them. We therefore investigated and analyzed AI language models that can be applied to such an associated search system.

Language models (LMs) fall primarily into two categories: understanding and generative. Understanding language models aim to comprehend and analyze the meaning of the provided text precisely. They consider the text's bidirectional context to capture its deeper implications and are used in tasks such as text classification, sentiment analysis, question answering, and named entity recognition. In contrast, generative language models focus on generating new text from the given context; they produce new textual content and are well suited to text generation, machine translation, sentence completion, and storytelling. Because the primary purpose of our associated search system is to comprehend user sentences or queries accurately, understanding language models are deemed more suitable.

Among the understanding language models, we examined BERT and its derivatives RoBERTa and DeBERTa. BERT (Bidirectional Encoder Representations from Transformers) uses a bidirectional Transformer encoder to understand sentence context and is pre-trained by predicting tokens hidden behind a [MASK] symbol (a minimal masked-prediction sketch is given at the end of this section). RoBERTa (A Robustly Optimized BERT Pre-training Approach) keeps the core BERT architecture but optimizes its training procedure and data processing: it eliminates the NSP (Next Sentence Prediction) task, introduces dynamic masking, and refines the training data volume, methodology, and hyperparameters. DeBERTa (Decoding-enhanced BERT with disentangled attention) adds a disentangled attention mechanism to the BERT architecture: each word is represented by separate content and position vectors, and the attention score between a word pair is computed from content-to-content, content-to-position, and position-to-content terms, which distributes attention more effectively and improves performance.

In our analysis of the three models, RoBERTa and DeBERTa demonstrated superior performance compared to BERT. However, when factors such as the acquisition and processing of training data, training time, and associated costs are considered, these higher-performing models may require additional effort and resources. A language model should therefore be selected by weighing the economic implications, objectives, training strategies, evaluation datasets, and hardware environment. We also note that fine-tuning a pre-trained BERT model with the training refinements of RoBERTa or DeBERTa can significantly improve training speed; a small fine-tuning sketch follows below.
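The following is a minimal sketch of the masked-token prediction objective described above, assuming the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint; the example sentence is illustrative only and is not drawn from NEPS data.

```python
from transformers import pipeline

# Masked-token prediction with a pre-trained BERT checkpoint.
# The model fills the [MASK] slot using bidirectional context,
# which is the same objective used during BERT pre-training.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

candidates = fill_mask("Nuclear export and [MASK] control requires a licence review.")
for candidate in candidates:
    print(f"{candidate['token_str']:>12}  score={candidate['score']:.3f}")
```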
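The sketch below illustrates, under stated assumptions, how a pre-trained understanding model could be fine-tuned for query-intent classification in the associated search system. It assumes the Hugging Face transformers library and PyTorch; the intent labels, example queries, and label assignments are hypothetical, and "bert-base-uncased", "roberta-base", and "microsoft/deberta-base" are public checkpoints that can be swapped in interchangeably. This is not the system's actual training pipeline.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical query-intent labels for the associated search system.
LABELS = ["item_lookup", "licence_status", "regulation_reference"]

# Any of the three encoders discussed above can be dropped in here;
# "roberta-base" and "microsoft/deberta-base" are drop-in alternatives.
CHECKPOINT = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(
    CHECKPOINT, num_labels=len(LABELS)
)

# Tiny illustrative batch; a real system would use labelled user queries.
queries = [
    "Which licence applies to exporting zirconium tubes?",
    "Show the current status of my export licence application.",
]
labels = torch.tensor([0, 1])  # indices into LABELS

batch = tokenizer(queries, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One fine-tuning step: the pre-trained encoder weights are reused,
# and only small task-specific updates are learned on top of them.
model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print("fine-tuning loss:", float(outputs.loss))
```

Because the encoder starts from pre-trained weights, only the classification head and small weight updates need to be learned, which is the source of the training-speed benefit noted in the conclusion.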