Part 1

(a) As discussed in class, Dense() layers of neural networks have both a weight matrix and a bias vector. If an input vector has 1024 features, and there are 512 neurons (units) in a Dense() layer, how many parameters are there in the Dense() layer?

If all else fails, try running this code:

    from tensorflow.keras import models, layers
    model = models.Sequential()
    model.add(layers.Dense(512, input_shape=(1024,), activation="relu"))
    model.summary()

The size of the weight matrix is 1024 x 512, while the size of the bias vector is 512; so the total number of parameters is 1024*512 + 512 = 524,288 + 512 = 524,800.

(b) How many of the parameters from (a) are bias parameters?

There are 512 bias parameters, one per neuron.

Part 2

In class, we said an activation function f is a linear function if-and-only-if

    f(x1 + x2) = f(x1) + f(x2)
    f(c * x1) = c * f(x1)

for all values of the variables x1 and x2, and all values of a constant c. The rectified linear unit function is defined as relu(x) = max(0, x). Is the relu() function a linear function? Why or why not?

Suppose c = -1, x1 = 1, and x2 = -1. Then

    relu(x1 + x2) = relu(1 + (-1)) = relu(0) = 0

which is not equal to

    relu(x1) + relu(x2) = relu(1) + relu(-1) = 1 + 0 = 1

Likewise,

    relu(c * x1) = relu(-1 * 1) = relu(-1) = 0

which is not equal to

    c * relu(x1) = -1 * relu(1) = -1 * 1 = -1

So the relu function fails both tests, and failing either one alone would be enough; i.e., the relu function is not a linear function.
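
To make the counterexample concrete, here is a minimal Python sketch (standard library only) that evaluates both tests at the values above; the relu helper is just the max(0, x) definition from the problem statement:

    def relu(x):
        """Rectified linear unit: max(0, x)."""
        return max(0, x)

    x1, x2, c = 1, -1, -1

    # Additivity test: relu(x1 + x2) vs relu(x1) + relu(x2)
    print(relu(x1 + x2), relu(x1) + relu(x2))   # prints: 0 1  -> not equal

    # Homogeneity test: relu(c * x1) vs c * relu(x1)
    print(relu(c * x1), c * relu(x1))           # prints: 0 -1 -> not equal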
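
Circling back to Part 1, the parameter count can also be checked programmatically rather than read off model.summary(); this sketch assumes TensorFlow 2.x is installed and uses the standard Keras count_params() method:

    from tensorflow.keras import models, layers

    model = models.Sequential()
    model.add(layers.Dense(512, input_shape=(1024,), activation="relu"))

    # Weight matrix (1024 x 512) plus bias vector (512)
    expected = 1024 * 512 + 512   # 524,800
    assert model.count_params() == expected
    print(model.count_params())   # prints: 524800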