The DQN (Deep Q-Network) algorithm was developed by DeepMind. By combining reinforcement learning and deep neural networks at scale, it was able to solve a wide range of Atari games, some at a superhuman level.
DQN overcomes unstable learning mainly through four techniques:
- Experience Replay
- Target Network
- Clipping Rewards
- Skipping Frames
Experience Replay:
Experience Replay was originally proposed in "Reinforcement Learning for Robots Using Neural Networks" (1993). A deep neural network easily overfits to the current episodes, and once it has overfitted, it is hard for it to produce varied experiences. To solve this problem, Experience Replay stores experiences, including state transitions, rewards, and actions, which are the data necessary to perform Q-learning, and draws mini-batches from them to update the neural network. This technique offers the following merits (see the sketch after this list):
- reduces correlation between experiences when updating the DNN
- increases learning speed with mini-batches
- reuses past transitions to avoid catastrophic forgetting
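
As a rough illustration, here is a minimal sketch of a uniform replay buffer. The class name `ReplayBuffer`, the default capacity, and the batch size are illustrative assumptions, not details taken from the DQN paper.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions.
    Names and sizes here are illustrative, not from the original DQN paper."""

    def __init__(self, capacity=100_000):
        # deque with maxlen drops the oldest transitions once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition observed while interacting with the environment
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random sampling breaks the temporal correlation
        # between consecutive transitions
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

During training, the agent would push each transition into the buffer and, once enough transitions have accumulated, repeatedly sample mini-batches from it to compute Q-learning updates, rather than learning only from the most recent experience.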