Smart tuned mass dampers (TMDs) have been widely studied for reducing the seismic responses of various structures. The control algorithm is the most important factor in the control performance of a smart TMD. This study used the Deep Deterministic Policy Gradient (DDPG) method, a reinforcement learning technique, to develop a control algorithm for a smart TMD. The smart TMD was built with a magnetorheological (MR) damper. A single-mass model equipped with the smart TMD served as the reinforcement learning environment, and time history analyses of this example structure under artificial seismic loads were performed during the learning process. The DDPG agent comprised an actor (policy) network and a critic (value) network. The agent's action was the command voltage sent to the MR damper, and the reward for each action was calculated from the displacement and velocity responses of the main mass. A groundhook control algorithm was used for comparison. After the DDPG agent was trained for 10,000 episodes with suitable hyperparameters, a semi-active control algorithm for the seismic responses of the example structure with the smart TMD was obtained. The simulation results showed that the developed DDPG model can provide an effective control algorithm for a smart TMD to reduce seismic responses.
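As an illustration of the kind of reinforcement learning environment described above, the following is a minimal sketch and not the model used in the study. It assumes a two-degree-of-freedom system (main mass plus TMD), replaces the MR damper with a damping coefficient that varies linearly with the command voltage, and uses a reward equal to the negative weighted sum of the absolute displacement and velocity of the main mass. The class name `SmartTMDEnv`, all parameter values, the reward weights, and the sinusoidal ground acceleration are hypothetical.

```python
import numpy as np


class SmartTMDEnv:
    """Hypothetical environment: main mass (index 0) with a smart TMD (index 1)."""

    def __init__(self, dt=0.01, max_voltage=5.0):
        self.dt = dt
        self.max_voltage = max_voltage
        # Illustrative values only (assumptions, not taken from the study).
        self.m1, self.k1, self.c1 = 1000.0, 4.0e4, 200.0  # main structure
        self.m2, self.k2 = 20.0, 760.0                    # TMD (2% mass ratio)
        self.c_min, self.c_max = 5.0, 60.0                # damper range
        self.reset()

    def reset(self):
        self.x = np.zeros(2)  # displacements relative to the ground
        self.v = np.zeros(2)  # velocities relative to the ground
        return self._obs()

    def _obs(self):
        return np.concatenate([self.x, self.v])

    def step(self, voltage, ground_accel):
        # Action: command voltage, clipped to the admissible range.
        u = float(np.clip(voltage, 0.0, self.max_voltage))
        # Assumed stand-in for the MR damper: damping linear in voltage.
        c2 = self.c_min + (self.c_max - self.c_min) * u / self.max_voltage

        # Force transmitted from the TMD to the main mass.
        f_tmd = self.k2 * (self.x[1] - self.x[0]) + c2 * (self.v[1] - self.v[0])
        a1 = (-self.k1 * self.x[0] - self.c1 * self.v[0] + f_tmd) / self.m1
        a2 = -f_tmd / self.m2
        accel = np.array([a1, a2]) - ground_accel

        # Semi-implicit Euler step of the time history analysis.
        self.v += self.dt * accel
        self.x += self.dt * self.v

        # Reward from main-mass displacement and velocity (assumed weights).
        reward = -(abs(self.x[0]) + 0.1 * abs(self.v[0]))
        return self._obs(), reward


# Usage: constant mid-range voltage under a 1 Hz sinusoidal ground motion.
env = SmartTMDEnv()
obs = env.reset()
episode_return = 0.0
for i in range(1000):
    ag = 0.5 * np.sin(2.0 * np.pi * 1.0 * i * env.dt)
    obs, reward = env.step(2.5, ag)
    episode_return += reward
print(episode_return)
```

In a DDPG setting, the actor network would map the observation to the voltage passed to `step`, and the critic network would estimate the value of each observation-action pair from the rewards returned here.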