Forward Propagation
The forward pass in a neural network processes the input data through the network's layers to produce predictions: each layer multiplies its inputs by its weights, adds its biases, and passes the result through an activation function.
Mathematical Representation:
For one training example $x$:

- Hidden Layer Computations:
  - Weighted Sum Calculation: Compute the weighted sum of inputs ($z^{[1]}$) by multiplying the input ($x$) by the weights ($W^{[1]}$) and adding the bias ($b^{[1]}$) for layer 1: $z^{[1]} = W^{[1]} x + b^{[1]}$.
  - Activation Calculation: Apply an activation function $g^{[1]}$ (such as sigmoid, tanh, ReLU, etc.) to the computed weighted sum to get the activation of layer 1: $a^{[1]} = g^{[1]}(z^{[1]})$.
- Output Layer Computations:
  - Weighted Sum Calculation: Compute the weighted sum of inputs ($z^{[2]}$) by multiplying the hidden-layer activations ($a^{[1]}$) by the weights ($W^{[2]}$) and adding the bias ($b^{[2]}$) for the output layer: $z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}$.
  - Activation Calculation: Apply an appropriate activation function (e.g., softmax for multiclass classification, sigmoid for binary classification) to the computed weighted sum to obtain the predicted output ($\hat{y}$): $\hat{y} = a^{[2]} = g^{[2]}(z^{[2]})$.
- Cost Function:
  - Compute the appropriate cost function (e.g., cross-entropy, mean squared error) to evaluate the difference between the predicted output ($\hat{y}^{(i)}$) and the actual output ($y^{(i)}$) over all $m$ training examples: $J = \frac{1}{m}\sum_{i=1}^{m}\mathcal{L}(\hat{y}^{(i)}, y^{(i)})$. A minimal code sketch of this step is given right after this list.
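A minimal sketch of the cost step, assuming binary classification with cross-entropy: here A2 is the sigmoid output returned by the forward pass in the Code section and Y holds the true 0/1 labels, both of shape (1, m).

import numpy as np

def compute_cost(A2, Y):
    """Binary cross-entropy cost averaged over the m training examples."""
    m = Y.shape[1]
    # Per-example loss: -[y*log(a) + (1 - y)*log(1 - a)]
    logprobs = Y * np.log(A2) + (1 - Y) * np.log(1 - A2)
    return float(-np.sum(logprobs) / m)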
Code
import numpy as np
def forward_propagation(X, parameters):
"""
Argument:
X -- input data of size (n_x, m)
parameters -- python dictionary containing your parameters (output of initialization function)
Returns:
A2 -- The sigmoid output of the second activation
cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
"""
= parameters["W1"], parameters["b1"], parameters["W2"], parameters["b2"]
W1, b1, W2, b2
= W1@X + b1
Z1 = np.tanh(Z1)
A1 = W2@A1 + b2
Z2 = 1/(1+np.exp(-Z2))
A2
= {"Z1": Z1,
cache "A1": A1,
"Z2": Z2,
"A2": A2}
return A2, cache
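A quick usage sketch of the function above; the layer sizes, random initialization, and dummy data here are illustrative assumptions, not part of the original example.

# Assumed sizes: 2 input features, 4 hidden units, 1 output unit, 5 examples
n_x, n_h, n_y, m = 2, 4, 1, 5
rng = np.random.default_rng(0)
parameters = {"W1": rng.standard_normal((n_h, n_x)) * 0.01,
              "b1": np.zeros((n_h, 1)),
              "W2": rng.standard_normal((n_y, n_h)) * 0.01,
              "b2": np.zeros((n_y, 1))}
X = rng.standard_normal((n_x, m))
A2, cache = forward_propagation(X, parameters)
print(A2.shape)  # (1, 5) -- one prediction per example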