Delta rule

In machine learning, the delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network.[1] It can be derived as the backpropagation algorithm for a single-layer neural network with a mean-square error loss function.
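
As a sketch of that derivation, assume a neuron $j$ with inputs $x_i$, weights $w_{ji}$, weighted sum $h_j = \sum_i x_i w_{ji}$, output $y_j = g(h_j)$ for a differentiable activation function $g$, and target output $t_j$. The squared error is taken here as $E = \tfrac{1}{2}(t_j - y_j)^2$; the factor $\tfrac{1}{2}$ is a common convention assumed for this sketch, chosen so that it cancels on differentiation. Applying the chain rule and then stepping against the gradient with learning rate $\alpha$ gives

$$
\begin{aligned}
\frac{\partial E}{\partial w_{ji}}
  &= \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial h_j}\,\frac{\partial h_j}{\partial w_{ji}}
   = -(t_j - y_j)\, g'(h_j)\, x_i, \\
\Delta w_{ji}
  &= -\alpha\,\frac{\partial E}{\partial w_{ji}}
   = \alpha\,(t_j - y_j)\, g'(h_j)\, x_i .
\end{aligned}
$$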

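For a neuron $j$ with activation function $g(x)$, the delta rule for neuron $j$'s $i$-th weight $w_{ji}$ is given by

$$\Delta w_{ji} = \alpha\,(t_j - y_j)\, g'(h_j)\, x_i,$$

where

  • $\alpha$ is a small constant called the learning rate
  • $g(x)$ is the neuron's activation function
  • $g'$ is the derivative of $g$
  • $t_j$ is the target output
  • $h_j$ is the weighted sum of the neuron's inputs
  • $y_j$ is the actual output
  • $x_i$ is the $i$-th input.

It holds that $h_j = \sum_i x_i w_{ji}$ and $y_j = g(h_j)$.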
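
A single update step of this rule can be sketched in Python as follows. The logistic sigmoid is only an assumed example of a differentiable activation function, and the names `delta_rule_step`, `sigmoid`, and `sigmoid_prime` are hypothetical helpers introduced for illustration; the starting weights, inputs, target, and learning rate are likewise arbitrary.

```python
import math


def sigmoid(h):
    """Logistic sigmoid, used here as an example activation function g."""
    return 1.0 / (1.0 + math.exp(-h))


def sigmoid_prime(h):
    """Derivative g'(h) of the logistic sigmoid."""
    s = sigmoid(h)
    return s * (1.0 - s)


def delta_rule_step(weights, inputs, target, alpha, g=sigmoid, g_prime=sigmoid_prime):
    """Return the weights of one neuron after a single delta-rule update."""
    # Weighted sum of the inputs: h = sum_i x_i * w_i
    h = sum(x * w for x, w in zip(inputs, weights))
    # Actual output: y = g(h)
    y = g(h)
    # Update each weight by alpha * (t - y) * g'(h) * x_i
    return [w + alpha * (target - y) * g_prime(h) * x for w, x in zip(weights, inputs)]


# Illustrative values (assumptions, not taken from the article).
weights = [0.0, 0.0]
inputs = [1.0, 0.5]
new_weights = delta_rule_step(weights, inputs, target=1.0, alpha=0.5)
print(new_weights)
```

With zero initial weights the weighted sum is 0, so the output is 0.5 and the sigmoid's derivative is 0.25; each weight therefore moves by 0.5 × 0.5 × 0.25 times its input.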

The delta rule is commonly stated in simplified form for a neuron with a linear activation function as

$$\Delta w_{ji} = \alpha\,(t_j - y_j)\, x_i.$$
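
For a linear activation the derivative factor is 1, so the update reduces to the learning rate times the error times the input. The sketch below applies that simplified update repeatedly to a single linear neuron. The function name `train_linear_neuron`, the toy samples (generated from the target weights 2 and -1), the learning rate, and the epoch count are all assumptions chosen for illustration.

```python
def train_linear_neuron(samples, alpha=0.1, epochs=200):
    """Train one linear neuron with the simplified delta rule (online updates)."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            # Linear activation: the output equals the weighted sum.
            y = sum(x * w for x, w in zip(inputs, weights))
            error = target - y
            # Simplified delta rule: w_i += alpha * (t - y) * x_i
            weights = [w + alpha * error * x for w, x in zip(weights, inputs)]
    return weights


# Toy data consistent with the linear map t = 2*x1 - 1*x2 (an assumption).
samples = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
]
learned = train_linear_neuron(samples)
print(learned)
```

Because the toy targets are exactly linear in the inputs and the learning rate is small, the learned weights approach the generating weights; with noisy or inconsistent targets the weights would instead hover near a least-squares compromise.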

While the delta rule is similar to the perceptron's update rule, the derivation is different. The perceptron uses the Heaviside step function as the activation function $g(h)$, which means that $g'(h)$ does not exist at zero and is equal to zero elsewhere, making direct application of the delta rule impossible.
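
The contrast can be illustrated with a short Python sketch. Plugging the step function's derivative into the delta rule yields a zero update wherever the derivative exists, whereas the standard perceptron update, learning rate times error times input, omits the derivative factor. The function names, the convention that the step function returns 1 at zero, and the example values are assumptions made for this sketch.

```python
def heaviside(h):
    """Heaviside step function; returning 1 at h == 0 is an assumed convention."""
    return 1.0 if h >= 0 else 0.0


def heaviside_prime(h):
    """Derivative of the step function: 0 away from zero, undefined at zero."""
    if h == 0:
        raise ValueError("derivative of the step function is undefined at 0")
    return 0.0


def delta_rule_update(weights, inputs, target, alpha):
    """Weight changes from a literal application of the delta rule."""
    h = sum(x * w for x, w in zip(inputs, weights))
    y = heaviside(h)
    return [alpha * (target - y) * heaviside_prime(h) * x for x in inputs]


def perceptron_update(weights, inputs, target, alpha):
    """Weight changes from the perceptron rule, which has no derivative factor."""
    h = sum(x * w for x, w in zip(inputs, weights))
    y = heaviside(h)
    return [alpha * (target - y) * x for x in inputs]


# Misclassified example: h = -0.5, so the output is 0 but the target is 1.
weights = [-1.0, 0.5]
inputs = [1.0, 1.0]
delta_changes = delta_rule_update(weights, inputs, target=1.0, alpha=0.5)
perceptron_changes = perceptron_update(weights, inputs, target=1.0, alpha=0.5)
print(delta_changes, perceptron_changes)
```

Even though the example is misclassified, the literal delta-rule changes are all zero, so no learning would occur; the perceptron rule produces a nonzero correction.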

  1. Russell, Ingrid. "The Delta Rule". University of Hartford. Archived from the original on 4 March 2016. Retrieved 5 November 2012.