# Rescorla–Wagner model simulation

The Rescorla-Wagner (R-W) rule is based on a simple linear prediction of the reward associated with a stimulus. The R-W model captures key aspects of Pavlovian (classical) conditioning experiments.

### Variables

Conditioned stimulus: $x \in \{ 0,1 \}$.

Unconditioned stimulus: $r\in \{ 0,1 \}$.

Associative strength between $x$ and $r$: $w\in \mathbb{R}$.

For example, when the animal hears a tone ($x$), $w$ measures how strongly it expects cheese ($r$).

Prediction error: $\delta = r-wx$

With learning, the prediction error gradually approaches zero: the outcome becomes fully predicted, so there is no more surprise.
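To make the sign of the prediction error concrete, here is a minimal sketch that evaluates $\delta = r - wx$ for a few hand-picked situations. The helper name `prediction_error` and the specific values of $w$ are illustrative assumptions, not part of the model definition.

```python
def prediction_error(r, w, x):
    """Prediction error delta = r - w * x."""
    return r - w * x

# Tone and cheese, nothing learned yet: the cheese is a full surprise.
d_naive = prediction_error(r=1, w=0.0, x=1)

# Tone and cheese, association fully learned: no surprise.
d_learned = prediction_error(r=1, w=1.0, x=1)

# Tone but no cheese, although cheese was fully expected: negative surprise.
d_omitted = prediction_error(r=0, w=1.0, x=1)

# No tone and no cheese: nothing was predicted, nothing happened.
d_absent = prediction_error(r=0, w=0.5, x=0)

print(d_naive, d_learned, d_omitted, d_absent)
```

A positive $\delta$ means the reward was better than predicted, a negative $\delta$ means it was worse, and zero means the prediction was exact.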

### Learning process

How does the associative strength change? That is, how does the animal learn that the tone and the cheese are associated?

Following the Rescorla-Wagner rule, which is designed to minimise $\frac{1}{2} \delta^2$, the associative strength $w$ is updated by adding the prediction error scaled by the learning rate $\alpha$, which sets how fast the animal learns.

$$w \leftarrow w + \alpha(r - wx)x$$

If $x$ is always 1 (i.e. the tone is presented on every trial), the above equation simplifies to

$$w \leftarrow w + \alpha(r - w)$$
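As a quick check of the simplified rule, the sketch below applies it for five trials with a constant reward. The function name `rw_update` and the values $\alpha = 0.1$, $r = 1$, $w_0 = 0$ are assumptions chosen for illustration. With $x = 1$ and constant $r$, unrolling the recursion gives the closed form $w_n = r - (r - w_0)(1-\alpha)^n$, which the loop compares against.

```python
def rw_update(w, r, x=1.0, alpha=0.1):
    """One Rescorla-Wagner step: w <- w + alpha * (r - w * x) * x."""
    delta = r - w * x
    return w + alpha * delta * x

alpha, r, w0 = 0.1, 1.0, 0.0
w = w0
history = []  # (iterated w, closed-form w) after each trial

for n in range(1, 6):
    w = rw_update(w, r, alpha=alpha)
    closed_form = r - (r - w0) * (1 - alpha) ** n
    history.append((w, closed_form))
    print(n, round(w, 5), round(closed_form, 5))
```

Each trial closes a fraction $\alpha$ of the remaining gap between $w$ and $r$, so $w$ approaches $r$ exponentially; after five trials it is about 0.41.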

PS. The learning process defined by the R-W rule is the same as stochastic gradient descent (SGD) on $\frac{1}{2} \delta^2$.

The gradient of $\frac{1}{2} \delta^2$ with respect to $w$ is

$$grad = \frac{d}{dw} \frac{1}{2} \delta^2 = \frac{d}{dw} \frac{1}{2} (r-wx)^2 = -(r-wx)x$$

Update $w$ by stepping against the gradient

$$w \leftarrow w - \alpha \times grad$$

which is exactly

$$w \leftarrow w + \alpha(r - wx)x$$
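The relationship between the R-W update and the derivative of $\frac{1}{2} \delta^2$ can be checked numerically. The sketch below compares a central finite-difference estimate of the derivative with the analytic expression $-(r-wx)x$, and confirms that one R-W step lowers the squared error. The function names and the test point ($w = 0.3$, $r = 1$, $x = 1$, $\alpha = 0.1$) are assumptions for illustration.

```python
def half_squared_error(w, r, x):
    """Objective 0.5 * delta**2 with delta = r - w * x."""
    delta = r - w * x
    return 0.5 * delta ** 2

def analytic_gradient(w, r, x):
    """d/dw of 0.5 * (r - w * x)**2, by the chain rule."""
    return -(r - w * x) * x

w, r, x, alpha = 0.3, 1.0, 1.0, 0.1
eps = 1e-6

# Central finite-difference estimate of the derivative at w
numeric = (half_squared_error(w + eps, r, x)
           - half_squared_error(w - eps, r, x)) / (2 * eps)
analytic = analytic_gradient(w, r, x)

# Stepping against the gradient is the same as the R-W update
w_new = w - alpha * analytic
w_rw = w + alpha * (r - w * x) * x

loss_before = half_squared_error(w, r, x)
loss_after = half_squared_error(w_new, r, x)

print(numeric, analytic)
print(w_new, w_rw)
print(loss_before, loss_after)
```

The two derivative estimates agree, the two ways of writing the update give the same new weight, and the squared error is smaller after the step.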

### Simulation

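```python
# R-W Model

import numpy as np
import matplotlib.pyplot as plt

trial_ind = np.arange(50)

# the tone (x) and the cheese (r) are both present on every trial
x_lst = np.full(len(trial_ind), 1)
r_lst = np.full(len(trial_ind), 1)
w_lst = []
delta_lst = []

# init: no association before learning
w_lst.append(0)

lr = 0.1

for i in range(len(trial_ind)):
    # prediction error on trial i, using the weight held before the update
    delta = r_lst[i] - w_lst[i] * x_lst[i]
    # R-W update: w <- w + lr * delta * x
    w = w_lst[i] + lr * delta * x_lst[i]
    delta_lst.append(delta)
    w_lst.append(w)

# remove init w so that w_lst[i] is the weight after trial i
w_lst.pop(0)

# plot each curve in its own figure so the titles do not overwrite each other
plt.figure()
plt.title("prediction error decreases to 0")
plt.plot(trial_ind, delta_lst)

plt.figure()
plt.title("associative strength increases to 1")
plt.plot(trial_ind, w_lst)

plt.show()
```

## Reference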

Dayan, Peter, and Laurence F. Abbott. Theoretical neuroscience: computational and mathematical modeling of neural systems. MIT press, 2005.