**Limitations of Perceptrons**

- Perceptrons have a monotonicity property: if a link has positive weight, the unit's activation can only increase as the corresponding input value increases (irrespective of the other input values)
- Can't represent functions where input interactions can cancel one another's effect (e.g. XOR)
- Can represent only linearly separable functions.
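The limitation can be seen empirically. Below is a minimal sketch (not from the original notes) that runs the classic perceptron learning rule on AND, which is linearly separable, and on XOR, which is not; the function names and learning-rate choice are illustrative assumptions.

```python
def step(x):
    return 1 if x >= 0 else 0

def train_perceptron(samples, epochs=100, lr=0.1):
    """Classic perceptron rule: w += lr * (target - output) * input."""
    w = [0.0, 0.0]   # input weights
    b = 0.0          # bias (threshold)
    for _ in range(epochs):
        errors = 0
        for (x1, x2), t in samples:
            y = step(w[0] * x1 + w[1] * x2 + b)
            if y != t:
                errors += 1
                w[0] += lr * (t - y) * x1
                w[1] += lr * (t - y) * x2
                b += lr * (t - y)
        if errors == 0:          # all samples classified correctly
            return w, b, True
    return w, b, False           # never converged within the epoch budget

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, _, and_ok = train_perceptron(AND)
_, _, xor_ok = train_perceptron(XOR)
print(and_ok, xor_ok)  # AND is learnable, XOR is not
```

Because XOR's positive and negative examples cannot be split by any single line, the loop never reaches an error-free epoch no matter how long it runs.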

**A Solution: multiple layers**

**Power / Expressiveness of Multi-layer Networks**

- Can represent interactions among inputs
- Two-layer networks can represent any Boolean function, and any continuous function (to within a tolerance), provided the number of hidden units is sufficient and appropriate activation functions are used
- Learning algorithms exist, but they come with weaker guarantees than the perceptron learning algorithm
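As a concrete illustration of this expressiveness, the sketch below shows a two-layer network with step activations computing XOR. The particular weights and thresholds are a hand-picked assumption, not from the notes; the point is only that such weights exist.

```python
def step(x):
    return 1 if x >= 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)        # OR-like hidden unit
    h2 = step(x1 + x2 - 1.5)        # AND-like hidden unit
    return step(h1 - 2 * h2 - 0.5)  # "OR and not AND" = XOR

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

The hidden layer lets the two inputs interact (h2 cancels h1 on the (1, 1) case), which is exactly what a single perceptron's monotonicity forbids.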

**Multilayer Network**

**Two-layer back-propagation neural network**

**The back-propagation training algorithm**

- Step 1: Initialization
- Set all the weights and threshold levels of the network to random numbers uniformly distributed inside a small range
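The initialization step can be sketched as follows. The layer sizes, seed, and the particular range [-0.5, 0.5] are assumptions for illustration; the notes only require a small uniform range.

```python
import random

def init_weights(n_in, n_hidden, n_out, lo=-0.5, hi=0.5, seed=0):
    """Draw all weights uniformly from a small range [lo, hi]."""
    rng = random.Random(seed)
    # V: input-to-hidden weights, W: hidden-to-output weights
    V = [[rng.uniform(lo, hi) for _ in range(n_hidden)] for _ in range(n_in)]
    W = [[rng.uniform(lo, hi) for _ in range(n_out)] for _ in range(n_hidden)]
    return V, W

V, W = init_weights(2, 3, 1)  # a 2-3-1 network
```

Small random values keep the sigmoid units away from their flat saturated regions at the start of training, and the randomness breaks symmetry between hidden units.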


**Step 2: Forward computing**

- Compute the activation/output vector z on the hidden layer

z_j = Φ(∑_i v_ij x_i)

- Compute the output vector y on the output layer

y_k = Φ(∑_j w_jk z_j)

y is the result of the computation
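The two formulas above can be sketched directly in code. Taking Φ to be the logistic sigmoid is an assumption here (the notes leave Φ unspecified), and the example weight matrices are made up for illustration.

```python
import math

def phi(a):
    """Logistic sigmoid, one common choice for the activation Φ."""
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, V, W):
    # hidden layer: z_j = phi(sum_i v_ij * x_i)
    z = [phi(sum(V[i][j] * x[i] for i in range(len(x))))
         for j in range(len(V[0]))]
    # output layer: y_k = phi(sum_j w_jk * z_j)
    y = [phi(sum(W[j][k] * z[j] for j in range(len(z))))
         for k in range(len(W[0]))]
    return z, y

V = [[0.1, -0.2], [0.4, 0.3]]   # 2 inputs  -> 2 hidden units
W = [[0.5], [-0.5]]             # 2 hidden  -> 1 output unit
z, y = forward([1.0, 0.0], V, W)
```

Note that the output layer consumes the hidden activations z, not the raw inputs x; each layer's output is the next layer's input.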

**Learning for BP Nets**

- Weights in W (between hidden and output layers) are updated by the delta rule
- The delta rule is not directly applicable to updating V (between input and hidden layers): the target values for the hidden units z1, z2, ..., zp are unknown
- Solution: propagate the errors at the output units back to the hidden units, and use those propagated errors to drive the update of the weights in V (again by the delta rule). This is error BACK-PROPAGATION learning
- Error back propagation can be continued downward if the net has more than one hidden layer.
- How to compute errors on hidden units?
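One training step of this scheme can be sketched as below. It assumes Φ is the logistic sigmoid (so Φ'(a) = Φ(a)(1 - Φ(a))), and the learning rate and toy example are illustrative choices, not from the notes.

```python
import math

def phi(a):
    return 1.0 / (1.0 + math.exp(-a))

def backprop_step(x, target, V, W, lr=0.5):
    """One forward pass plus one back-propagation update; returns squared error."""
    # forward pass
    z = [phi(sum(V[i][j] * x[i] for i in range(len(x))))
         for j in range(len(V[0]))]
    y = [phi(sum(W[j][k] * z[j] for j in range(len(z))))
         for k in range(len(W[0]))]
    # output deltas: targets are known, so the delta rule applies directly
    d_out = [(target[k] - y[k]) * y[k] * (1 - y[k]) for k in range(len(y))]
    # hidden deltas: output errors propagated back through W
    d_hid = [sum(W[j][k] * d_out[k] for k in range(len(y))) * z[j] * (1 - z[j])
             for j in range(len(z))]
    # delta-rule weight updates for both layers
    for j in range(len(z)):
        for k in range(len(y)):
            W[j][k] += lr * d_out[k] * z[j]
    for i in range(len(x)):
        for j in range(len(z)):
            V[i][j] += lr * d_hid[j] * x[i]
    return sum((target[k] - y[k]) ** 2 for k in range(len(y)))

# repeat the step on one toy sample; the squared error should shrink
V = [[0.1, -0.2], [0.4, 0.3]]
W = [[0.5], [-0.5]]
errs = [backprop_step([1.0, 0.0], [1.0], V, W) for _ in range(50)]
```

The hidden delta answers the question above: a hidden unit's error is the weighted sum of the output errors it contributed to, scaled by the derivative of its own activation. With more hidden layers, the same back-propagation is repeated layer by layer.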

**Derivation**
