
6. Softmax

Medium · Activation Functions · Neural Networks · NumPy

Implement the softmax function, which converts a vector of raw scores (logits) into a probability distribution.

\text{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}

The output values all lie in (0, 1) and sum to exactly 1.

Numerical stability: A naive implementation can overflow for large x_i. The standard trick is to subtract the maximum value before exponentiating:

\text{softmax}(x_i) = \frac{e^{x_i - \max(\mathbf{x})}}{\sum_{j} e^{x_j - \max(\mathbf{x})}}

This is mathematically equivalent but avoids overflow to `inf` in floating point.
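To illustrate why the shift matters, here is a small comparison (my own demo, not part of the problem statement): the naive form produces `nan` for large logits, while the shifted form gives the correct answer.

```python
import numpy as np

x = np.array([1000.0, 1000.0])

# Naive: e^1000 overflows to inf, and inf / inf evaluates to nan
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.exp(x) / np.sum(np.exp(x))

# Stable: subtract the max so the largest exponent is e^0 = 1
shifted = np.exp(x - np.max(x))
stable = shifted / np.sum(shifted)

print(naive)   # [nan nan]
print(stable)  # [0.5 0.5]
```

Both inputs are equal, so the true softmax is [0.5, 0.5]; only the shifted version recovers it.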

Example

Input: [1.0, 2.0, 3.0]
Output: [0.0900, 0.2447, 0.6652]  # sums to 1.0

Constraints

  • Your implementation should be numerically stable.
  • Return a NumPy array of the same shape as the input.
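A possible solution sketch meeting both constraints (the stable subtract-the-max form; the exact signature expected by the grader is assumed to be a single-argument `softmax`):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax; returns an array of the same shape as x."""
    x = np.asarray(x, dtype=float)
    # Shift so the largest exponent is 0, preventing overflow to inf
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

print(softmax([1.0, 2.0, 3.0]))  # ≈ [0.0900, 0.2447, 0.6652], sums to 1.0
```

The subtraction cancels in the ratio, so this returns the same values as the unshifted formula whenever that formula doesn't overflow.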