Forward Propagation

The forward pass in a neural network processes input data through the network’s layers to produce predictions, sequentially applying each layer’s weights, biases, and activation function.

Mathematical Representation:

For one training example x^(i):

  1. Hidden Layer Computations:
    • Weighted Sum Calculation:
      • z^[l](i) = W^[l] a^[l-1](i) + b^[l]
        • Multiply the previous layer's activations (a^[l-1](i)) by the weights (W^[l]) and add the bias (b^[l]) for layer l.
    • Activation Calculation:
      • a^[l](i) = g(z^[l](i))
        • Apply an activation function g (such as sigmoid, tanh, or ReLU) to the weighted sum to get the activation of layer l (a short sketch of this per-layer step follows the list).
  2. Output Layer Computations:
    • Weighted Sum Calculation:
      • z^[L](i) = W^[L] a^[L-1](i) + b^[L]
        • Multiply the last hidden layer's activations (a^[L-1](i)) by the weights (W^[L]) and add the bias (b^[L]) for the output layer L.
    • Activation Calculation:
      • ŷ^(i) = a^[L](i) = σ(z^[L](i))
        • Apply an appropriate output activation (e.g., softmax for multiclass classification, sigmoid for binary classification) to the weighted sum to obtain the predicted output ŷ^(i).
  3. Cost Function:
    • J = -(1/m) Σ_{i=1}^{m} [ y^(i) log(a^[L](i)) + (1 - y^(i)) log(1 - a^[L](i)) ]
      • Compute an appropriate cost function (shown here: binary cross-entropy; mean squared error is common for regression) to evaluate the difference between the predicted outputs (a^[L](i)) and the true labels (y^(i)) over all m training examples (a sketch of this computation follows the code below).
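
As a minimal sketch of the per-layer rule in steps 1 and 2, the weighted sum and activation can be wrapped in one reusable function. The helper name layer_forward and its menu of activations are illustrative assumptions, not part of any particular library:

import numpy as np

def layer_forward(A_prev, W, b, activation="relu"):
    """
    One forward step for layer l: Z = W @ A_prev + b, then A = g(Z).

    A_prev -- activations from layer l-1, shape (n_prev, m)
    W -- weight matrix, shape (n_l, n_prev)
    b -- bias vector, shape (n_l, 1)
    activation -- name of g: "sigmoid", "tanh", "relu", or "softmax"
    """
    Z = W @ A_prev + b                                # weighted sum z^[l]
    if activation == "sigmoid":
        A = 1 / (1 + np.exp(-Z))
    elif activation == "tanh":
        A = np.tanh(Z)
    elif activation == "relu":
        A = np.maximum(0, Z)
    elif activation == "softmax":                     # typical choice for a multiclass output layer
        e = np.exp(Z - Z.max(axis=0, keepdims=True))  # shift each column for numerical stability
        A = e / e.sum(axis=0, keepdims=True)
    else:
        raise ValueError(f"unknown activation: {activation}")
    return Z, A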
The two-layer implementation below applies these steps with a tanh hidden layer and a sigmoid output:

import numpy as np

def forward_propagation(X, parameters):
    """
    Arguments:
    X -- input data of size (n_x, m)
    parameters -- python dictionary containing the parameters W1, b1, W2, b2
                  (output of the initialization function)

    Returns:
    A2 -- the sigmoid output of the second (output) layer
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
    """
    W1, b1, W2, b2 = parameters["W1"], parameters["b1"], parameters["W2"], parameters["b2"]

    Z1 = W1 @ X + b1             # hidden layer weighted sum: z^[1] = W^[1] x + b^[1]
    A1 = np.tanh(Z1)             # hidden layer activation: a^[1] = tanh(z^[1])
    Z2 = W2 @ A1 + b2            # output layer weighted sum: z^[2] = W^[2] a^[1] + b^[2]
    A2 = 1 / (1 + np.exp(-Z2))   # sigmoid output for binary classification
    
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}
    
    return A2, cache
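
To evaluate the predictions against the labels, the binary cross-entropy cost from step 3 can be computed from A2. This compute_cost helper is a sketch (the function name is my own; the formula is the one given above):

def compute_cost(A2, Y):
    """
    Binary cross-entropy: J = -(1/m) * sum(y * log(a) + (1 - y) * log(1 - a)).

    A2 -- sigmoid outputs, shape (1, m)
    Y -- true labels in {0, 1}, shape (1, m)
    """
    m = Y.shape[1]
    logprobs = Y * np.log(A2) + (1 - Y) * np.log(1 - A2)
    cost = -np.sum(logprobs) / m
    return float(cost)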
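
A minimal end-to-end usage sketch; the layer sizes (2 inputs, 4 hidden units, 1 output), the random toy data, and the small-random-weight initialization are all illustrative assumptions:

rng = np.random.default_rng(0)

n_x, n_h, n_y, m = 2, 4, 1, 5                  # illustrative sizes: inputs, hidden units, outputs, examples
parameters = {
    "W1": rng.standard_normal((n_h, n_x)) * 0.01,   # small random weights
    "b1": np.zeros((n_h, 1)),
    "W2": rng.standard_normal((n_y, n_h)) * 0.01,
    "b2": np.zeros((n_y, 1)),
}

X = rng.standard_normal((n_x, m))              # toy inputs
Y = (rng.random((1, m)) > 0.5).astype(float)   # toy binary labels

A2, cache = forward_propagation(X, parameters)
print(A2.shape)                                # (1, 5)
print(compute_cost(A2, Y))                     # roughly log(2) ≈ 0.693 for untrained weights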