mishka comments on Work dumber not smarter

mishka Jun 2, 2023, 3:14 AM
1 point
0

ReLU activation is the stupidest ML idea I’ve ever heard; everyone knows sigmoid um somehow feels optimal you know it is a real function from like real math. (ReLU only survived because it got a ridiculous acronym word thing and sounds complicated so you feel smart.)

No, ReLU is great, because it induces semantically meaningful sparseness (for the same geometric reason which causes L1-regularization to induce sparseness)!
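A rough way to see the sparseness point, as a minimal sketch only: feed the same random pre-activations through ReLU and through a sigmoid, then count how many outputs are exactly zero. The Gaussian inputs, the sample size, and the helper name `zero_fraction` are assumptions made for illustration; this shows that ReLU produces exact zeros, not that those zeros are semantically meaningful.

```python
import math
import random


def relu(x):
    # Rectifier: passes positive inputs through, clamps the rest to zero.
    return max(0.0, x)


def sigmoid(x):
    # Logistic sigmoid: strictly between 0 and 1 for moderate inputs.
    return 1.0 / (1.0 + math.exp(-x))


def zero_fraction(values):
    # Hypothetical helper: share of activations that are exactly zero.
    return sum(1 for v in values if v == 0.0) / len(values)


random.seed(0)
# Assumed toy data: standard-normal pre-activations.
pre_activations = [random.gauss(0.0, 1.0) for _ in range(10_000)]

relu_sparsity = zero_fraction([relu(x) for x in pre_activations])
sigmoid_sparsity = zero_fraction([sigmoid(x) for x in pre_activations])

print(f"fraction of exact zeros after ReLU:    {relu_sparsity:.2f}")
print(f"fraction of exact zeros after sigmoid: {sigmoid_sparsity:.2f}")
```

With inputs centered on zero, about half of the ReLU outputs are switched off entirely, while the sigmoid outputs are all nonzero.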

It’s a nice compromise between the original perceptron step function (which is incompatible with gradient methods) and the sigmoids, which have tons of problems (they saturate unpleasantly at the ends and don’t want to move from there).
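The compromise can be made concrete by comparing derivatives, since gradient methods only see those. This is a small sketch under obvious simplifications: the function names are made up for illustration, and the step function’s derivative is taken as 0 everywhere (ignoring the jump at the origin).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def step_grad(x):
    # The step function is flat on both sides of the jump,
    # so gradient descent gets no signal from it.
    return 0.0


def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); peaks at 0.25 for x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    # ReLU has slope 1 on the positive side and 0 on the negative side.
    return 1.0 if x > 0 else 0.0


for x in (-10.0, -1.0, 1.0, 10.0):
    print(
        f"x={x:6.1f}  step'={step_grad(x):.1f}  "
        f"sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}"
    )
```

At x = ±10 the sigmoid derivative is already tiny (the saturation mentioned above), whereas the ReLU derivative stays at 1 for any positive input, however large.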

What’s dumb is that the goodness of ReLU was not discovered in the early 1970s. That would have been the natural timeline, given that ReLU was introduced in the late 1960s and is, in any case, very natural, being the integral of the step function. Instead, people discovered the sparseness-inducing properties of ReLU only in 2000 and published that in Nature, of all places, and it was still completely ignored for another decade. Only after three papers of a more applied flavor were published in 2009-2011 was it adopted, and by 2015 it had overtaken sigmoids as the most popular activation function in use, because it worked so much better. (See https://en.wikipedia.org/wiki/Rectifier_(neural_networks) for references.)
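To spell out the “integral of the step function” remark, writing $H$ for the Heaviside step function (the notation is an assumption here, not taken from the references):

$$
\mathrm{ReLU}(x) \;=\; \max(0, x) \;=\; \int_{-\infty}^{x} H(t)\,dt,
\qquad
H(t) =
\begin{cases}
1, & t > 0,\\
0, & t \le 0.
\end{cases}
$$

For $x \le 0$ the integrand is zero throughout, so the integral is $0$; for $x > 0$ it integrates the constant $1$ over $(0, x]$ and gives $x$.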

It’s quite likely that without ReLU, AlexNet would not have been able to improve the SOTA as spectacularly as it did, triggering the “first deep learning revolution”.

That being said, it is better to use them in pairs, (relu(x), relu(-x)); this way you always get a signal (e.g. TensorFlow has a crelu function, which is exactly this pair of ReLUs).
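A minimal pure-Python sketch of the pairing idea, working on a flat list of floats. This is an illustration rather than TensorFlow’s implementation; the list-based `crelu` below is a stand-in written for this example, and it simply appends the relu(-x) half after the relu(x) half.

```python
def relu(x):
    return max(0.0, x)


def crelu(values):
    # Concatenate relu(x) and relu(-x): the output is twice as long as the input,
    # and every nonzero input shows up in exactly one of the two halves.
    return [relu(v) for v in values] + [relu(-v) for v in values]


xs = [-2.0, 0.5, 3.0]
print(crelu(xs))  # [0.0, 0.5, 3.0, 2.0, 0.0, 0.0]
```

The negative input -2.0, which a plain ReLU would zero out, survives as 2.0 in the second half, so the next layer still gets something to work with. The cost is that the activation width doubles.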
- lemonhope Jun 2, 2023, 5:51 PM
  3 points
  1
  Of course ReLU is great!! I was trying to say that if I were a 2009 ANN researcher (unaware of prior ReLU uses, as most people probably were at the time) and someone (who had not otherwise demonstrated expertise) came in and asked why we use this particular woosh instead of a bent line or something, then I would’ve thoroughly explained the thought out of them. It’s possible that I would’ve realized how it works, but very unlikely IMO. But a dumbworker is more likely to say “Go do it. Now. Go. Do it now. Leave. Do it.”, as I see it.
