Attempt to implement A2C and PPO algorithm with modular properties of Maxout and LWTA. # UNFINISHED AND FAILED
reinforcement-learning deep-learning deep-reinforcement-learning a3c maxout-networks ppo a2c lwta modular-training modular-networks
-
Updated
Jan 11, 2018 - Python