An automated material handling system (AMHS) has become an important factor in the semiconductor wafer manufacturing industry. In general, an automated guided vehicle (AGV) in the Fab’s AMHS travels hundreds of miles on guided paths to transport a lot through hundreds of operations. The AMHS aims to transfer wafers with a short delivery time and high operational reliability. Many linear and analytic approaches have been used to evaluate and improve the performance of the AMHS in a deterministic environment. However, because of the AMHS’s complexity and uncertainty, these approaches cannot handle non-linear, non-convex, black-box performance measures. Unexpected vehicle congestion increases the delivery time and degrades the Fab’s production efficiency. In this study, we propose a Q-learning-based dynamic routing algorithm that accounts for vehicle congestion to reduce the delivery time. The proposed algorithm captures time-variant vehicle traffic and decreases vehicle congestion. Simulation experiments confirm that, compared to benchmark algorithms, the proposed algorithm finds more efficient paths for the vehicles, reducing both the mean and the standard deviation of the delivery time in the Fab’s AMHS.
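To make the idea of congestion-aware Q-learning routing concrete, the following is a minimal illustrative sketch and not the algorithm proposed in this study. It assumes a hypothetical four-node guide-path layout (`GRAPH`), assumed base travel times (`BASE_TIME`), a static vehicle count per edge, and a simple linear congestion penalty in `travel_time`; all of these names and values are placeholders chosen for illustration. The Q-value of an edge estimates the remaining delivery time to the destination, so the vehicle picks the neighbor with the smallest Q-value.

```python
import random

# Hypothetical toy layout (an assumption): node -> reachable next nodes.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

# Assumed base travel time per edge (arbitrary time units).
BASE_TIME = {("A", "B"): 1.0, ("B", "D"): 1.0, ("A", "C"): 2.0, ("C", "D"): 2.0}


def travel_time(edge, congestion, penalty=1.0):
    """Assumed congestion model: time grows linearly with vehicles on the edge."""
    return BASE_TIME[edge] * (1.0 + penalty * congestion[edge])


def train(graph, congestion, dest, episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning where Q[(u, v)] estimates delivery time from u via v."""
    rng = random.Random(seed)
    q = {(u, v): 0.0 for u, nbrs in graph.items() for v in nbrs}
    starts = [n for n in graph if n != dest and graph[n]]
    for _ in range(episodes):
        node = rng.choice(starts)
        while node != dest:
            nbrs = graph[node]
            if rng.random() < epsilon:
                nxt = rng.choice(nbrs)  # explore a random outgoing edge
            else:
                nxt = min(nbrs, key=lambda v: q[(node, v)])  # exploit best estimate
            cost = travel_time((node, nxt), congestion)
            # Smallest estimated remaining time from the next node (0 at the destination).
            future = min((q[(nxt, v)] for v in graph[nxt]), default=0.0)
            q[(node, nxt)] += alpha * (cost + future - q[(node, nxt)])
            node = nxt
    return q


def greedy_route(q, graph, start, dest):
    """Follow the smallest Q-value at each node until the destination is reached."""
    route = [start]
    while route[-1] != dest:
        node = route[-1]
        route.append(min(graph[node], key=lambda v: q[(node, v)]))
    return route


free_flow = {edge: 0 for edge in BASE_TIME}
congested = dict(free_flow)
congested[("B", "D")] = 5  # assumed: five vehicles queued on edge B -> D

route_free = greedy_route(train(GRAPH, free_flow, "D"), GRAPH, "A", "D")
route_congested = greedy_route(train(GRAPH, congested, "D"), GRAPH, "A", "D")
print(route_free, route_congested)
```

In this toy setting the learned route follows the nominally shorter path through `B` when no edge is congested and switches to the detour through `C` once the edge from `B` to `D` is congested. A time-variant version would update the congestion counts during training rather than holding them fixed, which this sketch does not attempt.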