Direct-Feedback-Alignment

Understanding the general framework

In dfa-linear-net.ipynb, I show how a neural network without an activation function can learn a linear function (multiplication by a matrix) using direct feedback alignment (DFA), as in Nøkland (2016). The notebook also covers some of the theory behind the method.
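
For illustration, here is a minimal numpy sketch of DFA on a two-layer linear net learning a fixed matrix A. The dimensions, initialization, and learning rate are assumptions for the example, not the notebook's exact values.

import numpy as np

rng = np.random.RandomState(0)
d_in, d_hidden, d_out = 20, 50, 10

A = rng.randn(d_out, d_in)               # target linear map to be learned
W1 = rng.randn(d_hidden, d_in) * 0.05    # trainable forward weights
W2 = rng.randn(d_out, d_hidden) * 0.05
B = rng.randn(d_hidden, d_out)           # fixed random feedback matrix (never trained)

lr = 1e-3
for step in range(5000):
    x = rng.randn(d_in, 1)
    y = A @ x                             # target output
    h = W1 @ x                            # linear hidden layer (no activation)
    y_hat = W2 @ h
    e = y_hat - y                         # output error (gradient of the squared loss)

    # Backpropagation would use W2.T @ e; DFA replaces it with the fixed random B @ e.
    W2 -= lr * (e @ h.T)
    W1 -= lr * ((B @ e) @ x.T)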

In dfa-mnist.ipynb, I show that a neural network trained with DFA achieves results very similar to one trained with backpropagation. The architecture is very simple: one hidden layer of 800 Tanh units, a sigmoid output layer, and binary crossentropy loss.
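
As a rough sketch of what one DFA update looks like for this architecture, the output error is sent directly to the hidden layer through a fixed random matrix instead of the transposed forward weights. The initialization and learning rate below are illustrative assumptions, not the values used in dfa-mnist.ipynb.

import numpy as np

rng = np.random.RandomState(0)
W1, b1 = rng.randn(800, 784) * 0.01, np.zeros((800, 1))
W2, b2 = rng.randn(10, 800) * 0.01, np.zeros((10, 1))
B = rng.randn(800, 10)                        # fixed random feedback matrix
lr = 0.01

x = rng.rand(784, 1)                          # stand-in for one flattened MNIST image
y = np.eye(10)[:, [3]]                        # stand-in one-hot label

h = np.tanh(W1 @ x + b1)                      # 800 Tanh hidden units
y_hat = 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))  # sigmoid output layer

e = y_hat - y                                 # sigmoid + binary crossentropy: error at the output
dh = (B @ e) * (1.0 - h ** 2)                 # DFA: error projected straight to the hidden layer via B

W2 -= lr * (e @ h.T); b2 -= lr * e
W1 -= lr * (dh @ x.T); b1 -= lr * dh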

See the last lines of mlp-torch-results.txt for the results of the same architecture obtained with the Torch code provided by Nøkland.

Stacking neural networks

Do networks with different feedback matrices learn different features, at least in the first few steps? Apparently yes. Stacking works by training many weak learners to recognize different features and using their outputs as inputs to a new model, which learns how to combine the weak learners and gives a performance boost.
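
As an illustration of the stacking step, the weak learners' outputs can be concatenated and a single dense softmax layer trained on top of them. The shapes, optimizer, and synthetic stand-in data below are assumptions, not the repository's exact setup.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

n_learners, n_classes, n_samples = 50, 10, 10000

# Stand-in for the saved weak-learner outputs (e.g. train_linouts.npz):
# 10 outputs per weak learner, concatenated along the feature axis.
features = np.random.rand(n_samples, n_learners * n_classes)
labels = np.eye(n_classes)[np.random.randint(0, n_classes, n_samples)]

stacked = Sequential([Dense(n_classes, activation='softmax',
                            input_shape=(n_learners * n_classes,))])
stacked.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
stacked.fit(features, labels, epochs=5, batch_size=128)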

The Stacking-dfa-nets folder contains the following files, which must be executed in this order:

  1. create_dataset.py: preprocesses the MNIST data loaded from Keras and saves it to a NumPy file mnist.npz, ready to be used.
  2. weak-learners.py or diff-weak-learners.py: trains as many weak learners as you want (neural networks with one hidden layer of 800 Tanh units). The difference is that the first trains all of them starting from the same initialization, while the second initializes each of them in a different state. They generate, respectively, the files train_linouts.npz & test_linouts.npz and diff-train_linouts.npz & diff-test_linouts.npz.
  3. stacked-model.py or RD-stacked-model.py: trains, respectively, a dense or an RD layer on top of the features extracted by the weak learners. The script takes as arguments the number of weak learners used in the previous step and the names of the files it generated.

Example call to train 50 weak learners:

python weak_learners.py 50

Example call to train a stacked model on top of 50 weak learners:

python stacked-model.py 50 train_linouts.npz test_linouts.npz

RD Layers

Layers with a number of parameters that scales linearly with the layer size, vaguely inspired by ACDC. They essentially compute:

y = D1 · R · D2 · x

where D1 and D2 are diagonal matrices and R is a random matrix.
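
A rough numpy sketch of the forward pass, with illustrative dimensions:

import numpy as np

rng = np.random.RandomState(0)
d_in, d_out = 512, 256

d2 = rng.randn(d_in)            # diagonal of D2: d_in trainable parameters
R = rng.randn(d_out, d_in)      # fixed random matrix (not trained)
d1 = rng.randn(d_out)           # diagonal of D1: d_out trainable parameters

def rd_forward(x):
    # Diagonal matrices reduce to element-wise scaling, so only d_in + d_out
    # parameters are trained instead of the d_in * d_out of a dense layer.
    return d1 * (R @ (d2 * x))

x = rng.randn(d_in)
y = rd_forward(x)               # shape (d_out,)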

Requirements

  • numpy
  • matplotlib
  • scipy
  • keras
  • scikit-learn
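
They can be installed, for example, with pip:

pip install numpy matplotlib scipy keras scikit-learn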
