Q-learning

A model-free reinforcement learning technique. Given states and actions, the estimated payoff (the Q-value) of each state-action pair is iteratively updated from observed rewards. From Wikipedia: