Every model so far has been supervised: you have labeled examples, and the network learns to map inputs to those labels. Trading is not naturally that kind of problem. A trading agent takes actions (buy, sell, hold, size a position), those actions move its profit and loss, and the consequences play out over time, with today’s trade affecting tomorrow’s opportunities through inventory, risk, and market impact. Reinforcement learning is the framework for exactly this: learning to act in an environment to maximize cumulative reward. This lesson sets up the states, actions, and rewards of RL, maps them onto trading where the reward is profit and loss, derives the value functions and the Q-learning update with worked numbers, and then spends real time on why markets are an unusually hostile environment for RL. It is the markets-focused lesson of the module and the bridge into the Financial Machine Learning module.