Reinforcement Learning for optimizing a simulated production environment.
This assignment applies reinforcement learning (RL) to optimize a simulated production system. An agent is trained with the PPO (Proximal Policy Optimization) implementation from Stable-Baselines3 in a production environment built on SimPy, a discrete-event simulation library.