In a neural network, both unsupervised "learning" and supervised "training" correspond to modification of the synaptic weight matrix W. A Hopfield network (like many other neural networks) is "trained" using the Hebbian learning rule. The Hebbian learning rule is not a physical law; instead it comes from the pioneering work of the cognitive psychologist Donald Hebb, who surmised that learning takes place by "reinforcing connections" between two learned states.
Mathematically, Hebbian learning means that the change in the synaptic weight $\omega_{ij}$ is proportional to the product of the activity of neuron $i$, which originates the action potential, and that of neuron $j$, which receives the signal. Suppose that the "column vector" $[p_1, p_2, \ldots, p_n]^t$ is the "new" pattern which is to be learned, where each $p_j = \pm 1$. For a small learning rate $\varepsilon > 0$, the old weights are updated to new values using the rule \[ \omega _{ij}^{new}=\omega _{ij}^{old}+\varepsilon p_{i}p_{j} \] Commutativity of multiplication means that $p_i p_j = p_j p_i$, so that
\[ \omega _{ij}^{new}=\omega _{ji}^{new} \] thus preserving the symmetry of W. Let's look at the learning rule in matrix form. First, let \[ P=\left[ \begin{array}{c} p_{1} \\ p_{2} \\ \vdots \\ p_{n} \end{array} \right] \] denote the "pattern vector" to be learned by the network. Then \[ P\ P^{t}=\left[ \begin{array}{c} p_{1} \\ p_{2} \\ \vdots \\ p_{n} \end{array} \right] \left[ \begin{array}{cccc} p_{1} & p_{2} & \ldots & p_{n} \end{array} \right] =\left[ \begin{array}{ccccc} p_{1}^{2} & p_{1}p_{2} & p_{1}p_{3} & \ldots & p_{1}p_{n} \\ p_{1}p_{2} & p_{2}^{2} & p_{2}p_{3} & \ldots & p_{2}p_{n} \\ p_{1}p_{3} & p_{2}p_{3} & p_{3}^{2} & \ldots & p_{3}p_{n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ p_{1}p_{n} & p_{2}p_{n} & p_{3}p_{n} & \ldots & p_{n}^{2} \end{array} \right] \] Since $p_{j}=\pm 1$, the diagonal elements are $p_{j}^{2}=1$. However, a neuron in a Hopfield network has no connection to itself, so the diagonal entries of W must remain zero. To remove the diagonal elements, we simply subtract the identity matrix: \[ P\ P^{t}-I=\left[ \begin{array}{ccccc} 0 & p_{1}p_{2} & p_{1}p_{3} & \ldots & p_{1}p_{n} \\ p_{1}p_{2} & 0 & p_{2}p_{3} & \ldots & p_{2}p_{n} \\ p_{1}p_{3} & p_{2}p_{3} & 0 & \ldots & p_{3}p_{n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ p_{1}p_{n} & p_{2}p_{n} & p_{3}p_{n} & \ldots & 0 \end{array} \right] \] Thus, in matrix form the learning rule is given by \[ W^{new}=W^{old}+\varepsilon \left( PP^{t}-I\right) \]
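As a small worked example (the three-neuron pattern here is chosen only for illustration and does not appear in the discussion above), suppose a network of $n = 3$ neurons learns the pattern $P = [1, -1, 1]^t$ starting from $W^{old} = 0$. Then \[ PP^{t}-I=\left[ \begin{array}{ccc} 0 & -1 & 1 \\ -1 & 0 & -1 \\ 1 & -1 & 0 \end{array} \right] ,\qquad W^{new}=W^{old}+\varepsilon \left( PP^{t}-I\right) =\varepsilon \left[ \begin{array}{ccc} 0 & -1 & 1 \\ -1 & 0 & -1 \\ 1 & -1 & 0 \end{array} \right] \] which is symmetric with a zero diagonal, exactly as the general argument predicts.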