softmax
Applies softmax along the provided axis, rescaling the tensor so that every element lies in the range [0, 1] and every slice along that axis sums to 1.
Output shape is the same as the input shape.
Also see https://towardsdatascience.com/softmax-activation-function-how-it-actually-works-d292d335bd78
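Example
A minimal sketch of the behavior described above, written in plain Python rather than against this library's tensor type. It assumes the standard definition softmax(x)_i = exp(x_i) / sum_j exp(x_j). The helper names softmax_1d and softmax_2d are hypothetical and are not part of this API. Subtracting the maximum before exponentiating is a common numerical-stability choice, not something the description above specifies.

```python
import math


def softmax_1d(values):
    """Softmax of a flat list, shifted by the max for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_2d(rows, axis=1):
    """Softmax of a rectangular list of lists along axis 0 or axis 1.

    Hypothetical illustration only: axis=1 normalizes each row,
    axis=0 normalizes each column.
    """
    if axis == 1:
        return [softmax_1d(row) for row in rows]
    if axis == 0:
        # Normalize each column, then transpose back to the input layout.
        columns = [softmax_1d(list(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("axis must be 0 or 1")


x = [[1.0, 2.0, 3.0],
     [1.0, 1.0, 1.0]]

by_row = softmax_2d(x, axis=1)  # each row sums to 1
by_col = softmax_2d(x, axis=0)  # each column sums to 1

for row in by_row:
    print([round(v, 4) for v in row], "sum =", round(sum(row), 4))
```

In both cases the result has the same 2 x 3 shape as the input. Only the direction in which the values are normalized changes with the axis.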