Computational Psychiatry

TD Learning model simulation

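In temporal difference (TD) learning, an agent learns to predict future rewards by bootstrapping: at each time step it compares its current value estimate with the reward it actually receives plus its estimate of the next state's value. The difference between the two is the TD prediction error, and the agent uses it to update the current estimate. Over repeated experience, prediction errors propagate backward from the time of reward to the earlier states that predict it, so the agent continually revises its beliefs about future rewards as those rewards draw nearer. TD learning is a core concept of model-free reinforcement learning.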
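To make the update concrete, here is a minimal sketch of tabular TD(0), the simplest TD variant. It is not taken from the original text. The task (a deterministic chain of states with a single reward at the end), the function name simulate_td0, and the parameter values (learning rate alpha, discount factor gamma) are all assumptions chosen for illustration. The prediction error at each step is delta = r + gamma * V(s') - V(s), and the value estimate is nudged by alpha * delta.

```python
def simulate_td0(n_states=5, n_episodes=200, alpha=0.1, gamma=1.0, reward=1.0):
    """Tabular TD(0) on a hypothetical chain task: s0 -> s1 -> ... -> end.

    The only reward is delivered on leaving the last state. Returns the
    learned values and the per-episode list of TD prediction errors.
    """
    values = [0.0] * n_states      # V(s), initialised to zero
    error_history = []             # one list of prediction errors per episode

    for _ in range(n_episodes):
        errors = []
        for s in range(n_states):
            is_last = (s == n_states - 1)
            r = reward if is_last else 0.0
            # The state after the last one is terminal, so its value is 0.
            v_next = 0.0 if is_last else values[s + 1]
            delta = r + gamma * v_next - values[s]   # TD prediction error
            values[s] += alpha * delta               # move V(s) toward target
            errors.append(delta)
        error_history.append(errors)

    return values, error_history


if __name__ == "__main__":
    values, error_history = simulate_td0()
    print("learned values:", [round(v, 3) for v in values])
    print("errors, episode 1:", [round(d, 3) for d in error_history[0]])
    print("errors, episode 2:", [round(d, 3) for d in error_history[1]])
```

In this sketch the prediction error in the first episode appears only at the rewarded state. In the second episode a smaller error also appears one state earlier, which is the backward propagation described above. With gamma set to 1, every state's value approaches the reward magnitude as training continues, and the prediction errors shrink toward zero.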