Implement the softmax function, which converts a vector of raw scores (logits) into a probability distribution.
softmax(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}
The output values all lie in (0, 1) and sum to 1.
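A direct translation of the formula above might look like this (a minimal NumPy sketch; the function name `softmax_naive` is illustrative):

```python
import numpy as np

def softmax_naive(x):
    # Exponentiate each logit, then normalize by the total.
    exps = np.exp(np.asarray(x, dtype=np.float64))
    return exps / exps.sum()

probs = softmax_naive([1.0, 2.0, 3.0])
```

This works for modest inputs, but `np.exp` overflows once a logit exceeds roughly 709 in float64, which motivates the stability fix described next.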
Numerical stability: A naive implementation can overflow for large x_i. The standard trick is to subtract the maximum value before exponentiating:
softmax(x_i) = \frac{e^{x_i - \max(x)}}{\sum_j e^{x_j - \max(x)}}
This is mathematically equivalent, since the common factor e^{-\max(x)} cancels between numerator and denominator, but it avoids overflow to inf in floating point.
Input: [1.0, 2.0, 3.0]
Output: [0.0900, 0.2447, 0.6652]  # sums to 1.0
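A stable implementation following the max-subtraction trick could be sketched as (NumPy assumed):

```python
import numpy as np

def softmax(x):
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.max()        # subtract the max so the largest exponent is 0
    exps = np.exp(shifted)       # every entry is now <= 1, so no overflow
    return exps / exps.sum()

# Matches the example above.
probs = softmax([1.0, 2.0, 3.0])

# Logits this large would overflow a naive implementation, but remain finite here.
big = softmax([1000.0, 1001.0, 1002.0])
```

Because `x.max()` is subtracted from every entry, `softmax([1000.0, 1001.0, 1002.0])` returns the same distribution as `softmax([0.0, 1.0, 2.0])`.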