Reinforcement learning (RL) has been applied successfully in various engineering fields. In structural engineering, RL is generally used to develop control algorithms for structural control problems. Machine learning (ML), on the other hand, has been adopted in a number of studies to build automated structural design models for reinforced concrete (RC) beam members. In those studies, the ML models are trained to reproduce the training data as closely as possible, so a model developed in this way can rarely produce designs better than those in its training data. In reinforcement learning, by contrast, an agent learns to make decisions by interacting with an environment, and can therefore find design solutions better than those in a training dataset. In this study, the structural design process is the environment and the actions of the RL agents represent the design variables of an RC beam. Because an RC beam section has many design variables, a multi-agent DQN (Deep Q-Network) was used to search effectively for the optimal design solution. Among the various versions of DQN, Double DQN (DDQN, based on Double Q-learning) was used, because it not only improves the accuracy of the estimated action values but also improves the learned policy. The American Concrete Institute code ACI 318 was selected as the design code for the optimal structural design of the RC beam, and it was used to train the RL model without any hand-labeled dataset. Six DDQN agents provide the actions for beam width, beam depth, bottom rebar size, number of bottom rebars, top rebar size, and shear stirrup size, respectively. The six agents were trained for 5,000 episodes, and the performance of the multi-agent DDQN was evaluated on 100 test design cases that were not used for training. The results show that the multi-agent RL algorithm can successfully produce structural designs for doubly reinforced beams.
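As a rough illustration of the two mechanisms mentioned above, the following minimal sketch shows the standard Double DQN target (the online network selects the next action, the target network evaluates it) and a greedy joint action assembled from six independent agents. It is not the authors' implementation: the function names, the identifier spellings in `AGENT_NAMES`, and the use of plain lists in place of neural network outputs are assumptions made for illustration only.

```python
from typing import Dict, Sequence

# One agent per design variable listed in the abstract.
# The identifier spellings are hypothetical.
AGENT_NAMES = (
    "beam_width",
    "beam_depth",
    "bottom_rebar_size",
    "num_bottom_rebars",
    "top_rebar_size",
    "stirrup_size",
)


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def ddqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Standard Double DQN target for a single transition.

    The online network's Q-values choose the next action; the target
    network's Q-values score that action. Decoupling selection from
    evaluation is what reduces the overestimation of action values.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = argmax(q_online_next)
    return reward + gamma * q_target_next[best_action]


def greedy_joint_action(
    q_per_agent: Dict[str, Sequence[float]],
) -> Dict[str, int]:
    """Each agent independently picks its own greedy action index.

    Mapping an index to a physical value (a width, a bar size, ...)
    is left to the environment and is not modeled here.
    """
    return {name: argmax(q) for name, q in q_per_agent.items()}
```

In this sketch each agent would hold its own pair of online and target Q-functions, and the combined dictionary of action indices would define one candidate beam section to be checked against the design code by the environment.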