We will use three hidden layers, with 256 neurons for the first layer, 128 for the second, and 64 for the third one. The following code snippet shows the architecture for the classification example:
n_inputs = 28*28
n_hidden1 = 350
n_hidden2 = 200
n_hidden3 = 100
n_outputs = 10