In this paper, we propose a logical control-based actor-critic algorithm as an efficient approach for the approximation of the capacitated fab scheduling problem. We apply the average reward temporal-difference learning method for estimating the relative