The Rescorla-Wagner model captures key aspects of classical conditioning (as in Pavlov's experiments). It is based on a simple linear equation that predicts the reward associated with a stimulus.

## Variables

Conditioned stimulus: `$ x \in \{ 0,1 \} $`.

Unconditioned stimulus: `$ r \in \{ 0,1 \} $`.

Associative strength between `$x$` and `$r$`: `$ w \in \mathbb{R} $`.

E.g. `$w$` measures how strongly the animal expects cheese `$r$` after hearing a tone `$x$`.

Prediction error: `$ \delta = r-wx $`

With learning, the prediction error gradually approaches zero: the outcome becomes fully predicted, so there is eventually no more surprise.
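For instance, here is a minimal one-trial sketch of the prediction error (the specific values are illustrative assumptions, not from the text):

```python
# One trial of the R-W model with assumed illustrative values.
w = 0.4            # current associative strength, assumed midway through learning
x = 1              # tone is presented
r = 1              # cheese is delivered
delta = r - w * x  # prediction error = 0.6: the cheese is still partly surprising
print(delta)
```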

## Learning process

How does the associative strength change? That is, how does the animal learn that the tone and the cheese are associated?

Following the Rescorla-Wagner rule, which is designed to minimise `$\frac{1}{2} \delta^2$`, the associative strength `$w$` is updated by adding the prediction error, scaled by the learning rate `$\alpha$` (i.e. how fast the animal learns):

$$ w \leftarrow w + \alpha(r - wx)x $$

If `$x$` is always 1 (i.e. the tone is presented on every trial), the update simplifies to

$$ w \leftarrow w + \alpha(r - w)$$
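In this case the trajectory has a closed form. Assuming `$r = 1$` on every trial and `$w$` initialised to 0, subtracting both sides of the update from 1 shows that the remaining error decays geometrically:

$$ 1 - w_{t+1} = (1-\alpha)(1 - w_t) \quad \Rightarrow \quad w_t = 1 - (1-\alpha)^t $$

so `$w$` approaches 1 exponentially, at a speed set by `$\alpha$`.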

PS. The learning process defined by the R-W rule is exactly stochastic gradient descent (SGD) on the squared error `$\frac{1}{2} \delta^2$`.

The gradient of `$\frac{1}{2} \delta^2$` with respect to `$w$` is

$$ grad = \frac{d}{dw} \frac{1}{2} \delta^2 = \frac{d}{dw} \frac{1}{2} (r-wx)^2 = -(r-wx)x $$

Descending the gradient updates `$w$` by

$$ w \leftarrow w - \alpha \times grad $$

which is exactly

$$ w \leftarrow w + \alpha(r - wx)x $$
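To see this equivalence concretely, here is a minimal sketch comparing one R-W update with one gradient-descent step on `$\frac{1}{2} \delta^2$`, using a numerical gradient (the values of `w`, `x`, `r`, and `alpha` are arbitrary assumptions):

```python
# Verify: one Rescorla-Wagner update == one gradient-descent step on 0.5 * delta^2.
alpha, w, x, r = 0.1, 0.3, 1.0, 1.0  # illustrative values

def loss(w_):
    return 0.5 * (r - w_ * x) ** 2

# numerical gradient dL/dw via central differences
eps = 1e-6
grad = (loss(w + eps) - loss(w - eps)) / (2 * eps)

w_gd = w - alpha * grad              # gradient-descent step
w_rw = w + alpha * (r - w * x) * x   # Rescorla-Wagner step
print(abs(w_gd - w_rw) < 1e-8)       # True: the two updates coincide
```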

## Simulation

```python
# R-W Model
import numpy as np
import matplotlib.pyplot as plt

n_trials = 50
trial_ind = np.arange(n_trials)
x_lst = np.full(n_trials, 1)  # tone presented on every trial
r_lst = np.full(n_trials, 1)  # cheese delivered on every trial
w_lst = [0]                   # initial associative strength
delta_lst = []
lr = 0.1                      # learning rate alpha

for i in range(n_trials):
    delta = r_lst[i] - w_lst[i] * x_lst[i]  # prediction error
    w = w_lst[i] + lr * delta               # R-W update
    delta_lst.append(delta)
    w_lst.append(w)

w_lst.pop(0)  # drop the initial w so lengths match trial_ind

# plot the two curves side by side
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(trial_ind, delta_lst)
ax1.set_title("prediction error decreases to 0")
ax2.plot(trial_ind, w_lst)
ax2.set_title("associative strength increases to 1")
plt.show()
```
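Continuing in the same session, the simulated trajectory can be checked against the closed form `$ w_t = 1 - (1-\alpha)^t $` derived above (this snippet reuses `lr`, `trial_ind`, and `w_lst` from the simulation):

```python
# Sanity check against the closed form w_t = 1 - (1 - alpha)^t.
analytic = 1 - (1 - lr) ** (trial_ind + 1)  # w after updates t = 1..50
print(np.allclose(w_lst, analytic))          # True
```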

## Reference

Dayan, Peter, and Laurence F. Abbott. *Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems*. MIT Press, 2005.