from Hacker News

Swish: A Self-Gated Activation Function

by goberoi on 10/30/17, 5:40 PM with 1 comments

  • by goberoi on 10/30/17, 5:48 PM

    Why is this interesting? In short: a great new activation function that may challenge the dominance of ReLU.

    Longer story:

    Today, ReLU is the most popular activation function for deep networks (along with its variants like leaky ReLU or parametric ReLU).

    This paper from the Google Brain team is ~2 weeks old, and shows that SWISH, a new activation function, "improves top-1 classification accuracy on ImageNet by 0.9% for Mobile NASNetA and 0.6% for Inception-ResNet-v2" by simply replacing ReLU with SWISH.

    SWISH is equal to x * sigmoid(x), so it is not much harder to compute than ReLU either.
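
    The definition above can be sketched in a few lines of NumPy; the function names here are just illustrative, not from the paper:

```python
import numpy as np

def sigmoid(x):
    # Standard logistic function: 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    # Swish as defined in the comment: x * sigmoid(x)
    return x * sigmoid(x)

def relu(x):
    # ReLU for comparison: max(x, 0)
    return np.maximum(x, 0.0)
```

    Like ReLU, Swish is unbounded above and approaches zero for large negative inputs, but it is smooth and non-monotonic (it dips slightly below zero for small negative x).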