from Hacker News

The Forward-Forward Algorithm: Some Preliminary Investigations [pdf]

by jordn on 12/1/22, 9:36 PM with 10 comments

  • by CGamesPlay on 12/2/22, 11:33 AM

    I've only skimmed it, but the gist seems to be that on MNIST the FF algorithm gets within about 30% of classic backpropagation's effectiveness. I didn't quite follow, in my quick read, how the network can generate its own negative data, but that seems to be where future research would focus. Section 8 appears somewhat out of the blue and talks about alternative hardware models for machine learning.
  • by candiodari on 12/2/22, 2:17 PM

    TLDR: during training, backpropagation means we need to send data two ways: input data from input => output, and error data (wanted_result - output) from output => input, updating the network's weights along the way.

    This is the main source of performance bottlenecks, and also an obvious difference between natural and artificial neural networks. The brain does not seem to have error signals at all, and we can't find any signal travelling in the direction opposite to propagation. Backpropagation requires an error signal, and that means it requires knowing the right answer to the problem the algorithm is trying to solve. Also quite important to some companies: sending data in two directions through the network places serious limits on how far neural network training can be parallelized. It is one of the big reasons Google/Facebook/MS(OpenAI)/... only seem to have a head start of a year or less over the rest of the industry, despite billions in investment.

    Forward-forward instead tries to do online learning by training the network to distinguish real signals from fake-but-very-realistic ones, with data flowing in the same direction every time. A minimal sketch of one layer under that scheme follows.
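
    The sketch below assumes the squared-activity "goodness" from the paper; the layer sizes, learning rate, and threshold are illustrative, and the layer normalization the paper applies between layers is omitted:

        import numpy as np

        def ff_layer_step(W, x, positive, theta=2.0, lr=0.03):
            # One forward-forward update for a single ReLU layer. No error
            # signal travels backwards: the layer raises its goodness (sum of
            # squared activities) on real data and lowers it on fake data,
            # using only its own forward pass.
            a = np.maximum(0.0, x @ W)                 # forward pass only
            g = np.sum(a ** 2, axis=1, keepdims=True)  # goodness per example
            p = 1.0 / (1.0 + np.exp(-(g - theta)))     # sigma(goodness - threshold)
            target = 1.0 if positive else 0.0
            delta = (p - target) * 2.0 * a             # local gradient; stops here
            return W - lr * x.T @ delta / len(x), a

        # Each batch makes two same-direction passes: one real, one fake.
        rng = np.random.default_rng(0)
        W = rng.normal(scale=0.1, size=(784, 256))
        x_real = rng.random((32, 784))    # stand-in for real data
        x_fake = rng.random((32, 784))    # stand-in for network-generated fakes
        W, _ = ff_layer_step(W, x_real, positive=True)
        W, _ = ff_layer_step(W, x_fake, positive=False)

    Because no pass ever waits for an error signal coming back, successive layers could in principle be trained in a pipeline, which is the parallelization win described above.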

  • by arc-in-space on 12/2/22, 9:12 AM

  • by gauddasa on 12/2/22, 2:24 PM

    What is being proposed appears to be the Hebbian learning rule, yet the paper does not even mention it or the Hebb network. Why? By the way, the Hebbian learning rule was proposed in 1949 and is one of the pioneering works demonstrating that neuron-based models are worth investigating.
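
    For reference, Hebb's rule strengthens a connection in proportion to the product of pre- and post-synaptic activity, with no error signal involved; a minimal sketch (the function name and learning rate are illustrative):

        import numpy as np

        def hebbian_update(W, x, y, lr=0.01):
            # Hebb (1949): neurons that fire together, wire together;
            # dW[i, j] is proportional to x[i] * y[j]
            return W + lr * np.outer(x, y)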
  • by joko42 on 12/2/22, 11:58 AM

    Would be great if there was code we could try on MNIST.
  • by KunzEgg on 12/7/22, 2:26 AM

    Can someone work with this?

    # Using the Forward-Forward algorithm to train a neural network to
    # classify positive and negative data.
    # The positive data is real data; the negative data may be predicted by
    # the network itself using top-down connections, or supplied externally
    # (here it is supplied externally).
    # The network is trained to have high goodness for positive data and low
    # goodness for negative data, where goodness is the sum of the squared
    # activities in a layer.
    # The probability that an input vector is positive is given by applying
    # the logistic function σ to the goodness minus some threshold θ.

    import numpy as np

    # Define the ReLU activation function and its derivative
    def activation(x):
        return np.maximum(0, x)

    def activation_derivative(x):
        return 1. * (x > 0)

    # Define the goodness function: the sum of the squared activities in a
    # layer, computed per input vector
    def goodness(x):
        return np.sum(x ** 2, axis=1, keepdims=True)

    # Define the forward pass (the same for positive and negative data)
    def forward_pass(X, W1, W2):
        a1 = activation(np.dot(X, W1))
        a2 = activation(np.dot(a1, W2))
        return a1, a2

    # Define the learning rate
    learning_rate = 0.01

    # Define the threshold for the goodness
    theta = 0.1

    # Define the number of epochs
    epochs = 100

    # The positive (real) data
    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])

    # The negative (fake) data
    Xn = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0]])

    # Initialize the weights in [-1, 1)
    W1 = 2 * np.random.random((3, 4)) - 1
    W2 = 2 * np.random.random((4, 1)) - 1

    # Perform the positive and negative passes for each epoch
    for j in range(epochs):

        # Forward pass for the positive data
        a1, a2 = forward_pass(X, W1, W2)

        # Forward pass for the negative data
        a1n, a2n = forward_pass(Xn, W1, W2)

        # Calculate the goodness of each layer for the positive data
        g1 = goodness(a1)
        g2 = goodness(a2)

        # Calculate the goodness of each layer for the negative data
        g1n = goodness(a1n)
        g2n = goodness(a2n)

        # Probability that each positive input vector is positive data
        p1 = 1 / (1 + np.exp(-(g1 - theta)))
        p2 = 1 / (1 + np.exp(-(g2 - theta)))

        # Probability that each negative input vector is positive data
        p1n = 1 / (1 + np.exp(-(g1n - theta)))
        p2n = 1 / (1 + np.exp(-(g2n - theta)))

        # Error for the positive data (target probability 1)
        error2 = p2 - 1
        error1 = p1 - 1

        # Error for the negative data (target probability 0)
        error2n = p2n - 0
        error1n = p1n - 0

        # Local delta for each layer; the goodness gradient contributes a
        # factor of 2 * activity, and no error flows between the layers
        delta2 = error2 * 2 * a2 * activation_derivative(a2)
        delta1 = error1 * 2 * a1 * activation_derivative(a1)

        delta2n = error2n * 2 * a2n * activation_derivative(a2n)
        delta1n = error1n * 2 * a1n * activation_derivative(a1n)

        # Weight gradients for the positive data
        dW2 = learning_rate * a1.T.dot(delta2)
        dW1 = learning_rate * X.T.dot(delta1)

        # Weight gradients for the negative data
        dW2n = learning_rate * a1n.T.dot(delta2n)
        dW1n = learning_rate * Xn.T.dot(delta1n)

        # Gradient descent: raise the goodness of the positive data ...
        W2 -= dW2
        W1 -= dW1

        # ... and lower the goodness of the negative data
        W2 -= dW2n
        W1 -= dW1n

    # Print the weights
    print("W1 = ", W1)
    print("W2 = ", W2)

    # Print the goodness for the positive data
    print("g1 = ", g1)
    print("g2 = ", g2)

    # Print the goodness for the negative data
    print("g1n = ", g1n)
    print("g2n = ", g2n)

    # Print the probability that each positive input vector is positive
    print("p1 = ", p1)
    print("p2 = ", p2)

    # Print the probability that each negative input vector is positive
    print("p1n = ", p1n)
    print("p2n = ", p2n)
