Here's a definition of the backpropagation algorithm from here:
Initially all the edge weights are randomly assigned. For every input in the training dataset, the ANN is activated and its output is observed. This output is compared with the desired output that we already know, and the error is “propagated” back to the previous layer. This error is noted and the weights are “adjusted” accordingly. This process is repeated until the output error is below a predetermined threshold.
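To make sure I'm reading the procedure correctly, here's a minimal sketch of the loop as I understand it (Python/NumPy; the single-neuron network, the AND dataset, the learning rate, and the error threshold are just my own stand-ins for illustration, not from the quoted source):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: learn the AND function with a single sigmoid neuron.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w = rng.normal(size=2)   # edge weights, randomly assigned initially
b = rng.normal()
lr = 0.5                 # learning rate (my own choice)
threshold = 0.01         # stop when mean squared error falls below this

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10_000):
    for xi, yi in zip(X, y):          # inputs fed one by one
        out = sigmoid(xi @ w + b)     # activate the network, observe output
        err = out - yi                # compare with the desired output
        grad = err * out * (1 - out)  # error "propagated" back
        w -= lr * grad * xi           # weights "adjusted" accordingly
        b -= lr * grad
    mse = np.mean((sigmoid(X @ w + b) - y) ** 2)
    if mse < threshold:               # repeat until error is small enough
        break

print(f"stopped after {epoch + 1} epochs, mse={mse:.4f}")
```

Even in this toy version, every single input triggers its own weight update, and the last update in each pass is always for the last input. That's exactly what prompts my question.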
Something I'm not understanding here: if inputs are fed in one at a time and the weights are adjusted after each input, won't the network end up trained mostly for the last input it saw?
Please clarify. Thank you.