Capacitated Fab Scheduling Approximation using Average Reward TD(λ) Learning based on System Feature Functions
In this paper, we propose a logical control-based actor-critic algorithm as an efficient approach for approximating the capacitated fab scheduling problem. We apply average-reward temporal-difference learning to estimate the relative