Gerald Monroe comments on AI #52: Oops

Gerald Monroe 23 Feb 2024 14:53 UTC
2 points
0
Given that humans on their own haven’t yet found these better architectures, humans + imitative AI doesn’t seem like it would find the problem trivial.
Humans on their own already did invent better RL algorithms for optimizing at the network architecture layer.
https://arxiv.org/pdf/1611.01578.pdf background
https://arxiv.org/pdf/1707.07012.pdf : page 6
With NAS, the RL model learns the relationship between [architecture] and [predicted performance]. I’m not sure how much transfer learning is done, but you must sample the problem space many times.
You get a plot of all your tries like the one above; the red dot is the absolute max for human-designed networks by the DL experts at Waymo in 2017.
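To make "sample the problem space many times" concrete, here is a minimal toy sketch of a policy-gradient (REINFORCE-style) controller picking discrete architecture choices. Everything in it is an assumption for illustration: the search space, the `toy_reward` function (a stand-in for actually training and scoring a child network), and the hyperparameters are made up and are not taken from the linked papers.

```python
import math
import random

# Hypothetical search space: one discrete choice per slot.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "kernel": [3, 5, 7],
}


def toy_reward(arch):
    """Stand-in for 'train the child network and measure validation accuracy'.

    The landscape is invented: it simply favors 4 layers, width 256, kernel 5.
    """
    score = 0.5
    score += 0.2 if arch["num_layers"] == 4 else 0.0
    score += 0.2 if arch["width"] == 256 else 0.0
    score += 0.1 if arch["kernel"] == 5 else 0.0
    return score


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_search(num_samples=2000, lr=0.5, seed=0):
    """Sample architectures, score them, and nudge the controller toward winners."""
    rng = random.Random(seed)
    logits = {slot: [0.0] * len(opts) for slot, opts in SEARCH_SPACE.items()}
    baseline = 0.0  # moving average of rewards, used to reduce variance
    history = []    # every (architecture, reward) pair tried: the "plot of all your tries"

    for _ in range(num_samples):
        # Sample one option index per slot from the current policy.
        picks = {}
        for slot, opts in SEARCH_SPACE.items():
            probs = softmax(logits[slot])
            picks[slot] = rng.choices(range(len(opts)), weights=probs)[0]
        arch = {slot: SEARCH_SPACE[slot][i] for slot, i in picks.items()}

        reward = toy_reward(arch)
        history.append((arch, reward))

        # REINFORCE update: raise the log-probability of choices that beat the baseline.
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for slot, chosen in picks.items():
            probs = softmax(logits[slot])
            for i, p in enumerate(probs):
                grad = (1.0 if i == chosen else 0.0) - p
                logits[slot][i] += lr * advantage * grad

    best_arch, best_reward = max(history, key=lambda item: item[1])
    return best_arch, best_reward, logits


if __name__ == "__main__":
    arch, reward, _ = run_search()
    print("best architecture found:", arch, "reward:", reward)
```

In a real search each call to the reward function is a full (or proxy) training run, which is why the number of samples you can afford is the binding constraint.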
Summary: I was speculating that a more advanced version of an existing technique might work.
Tweaking its own mutation rate and child count and other hyperparameters. But it’s not going to invent gradient based methods.
It can potentially output any element in its library of primitives. You get that library through the “mediocre answers” approach you criticized: re-implement every machine learning paper ever published without code, port the code from every paper that has it, and test all of them in a common environment. The IT staff you would otherwise need, and other roles, are also being filled in by AI.
Is a thing that happens. But it needs quite a lot of intelligence to start. Quite possibly more intelligence than needed to automate most of the economy.
No, it is an improved example of existing RL algorithms, trained on the results from testing network and cognitive architectures on a proxy task suite. The proxy task suite is a series of cheaper-to-run tests, on smaller networks, that predict the ability of a full-scale network on your full AGI/ASI gym.
This is subhuman intelligence, but the RL network (or later hybrid architectures) learns from a broader set of results than humans do.
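A rough sketch of how you might check whether a proxy suite is doing its job: compare how it ranks candidates against how the full-scale runs rank them, then use the proxy to shortlist what gets a full run. The architecture names and all scores below are invented for illustration, and the rank helper assumes no tied scores.

```python
def ranks(values):
    """Rank positions (0 = smallest). Assumes no ties, to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def shortlist(proxy_scores, k):
    """Names of the k candidates the proxy suite likes best."""
    return sorted(proxy_scores, key=proxy_scores.get, reverse=True)[:k]


# Made-up scores: cheap proxy suite on small networks vs. the full-scale gym.
proxy = {"arch_a": 0.61, "arch_b": 0.74, "arch_c": 0.58, "arch_d": 0.70, "arch_e": 0.66}
full = {"arch_a": 0.80, "arch_b": 0.88, "arch_c": 0.79, "arch_d": 0.90, "arch_e": 0.83}

names = sorted(proxy)
rho = spearman([proxy[n] for n in names], [full[n] for n in names])
top2 = shortlist(proxy, 2)

if __name__ == "__main__":
    print("rank correlation between proxy and full-scale scores:", rho)
    print("candidates promoted to a full-scale run:", top2)
```

In this invented data the proxy's favorite (`arch_b`) is not the true full-scale winner (`arch_d`), though the winner still makes the shortlist. That is the usual trade: the proxy is cheap enough to sample widely, but its ranking is only approximately right.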
This is kind of true. But by the time there are no big algorithmic wins left, we are in the crazy smart, post-singularity regime.
The above method won’t settle on infinity. Remember that you are still limited by compute, and that this is 5-10 years from today. You find a stronger AGI/ASI than before.
Since:
a. you sampled the possibility space only a finite number of times,
b. proxy tasks can’t have perfect correlation to full-scale scores,
c. by ‘only’ starting with ‘every technique ever tried by humans or other AIs’ (your primitives library), you are searching a subset of the search space,
d. the hardware architecture shrinks the space to algorithms the (training) hardware supports well,
e. the noisiness in available training data limits the effectiveness of any model or cognitive algorithm,
f. you can’t train an algorithm larger than you can run inference on, and
g. you want an algorithm that can run in real time on your hardware,
the resulting ASI will only be so powerful. Still, the breadth of training data, lack of hardware errors, and much larger working memory should go pretty far...
Summary: I went over a list of reasons why the described technique won’t find the global maximum, which would be the strongest superintelligence the underlying hardware can support. Strength is the harmonic mean of scores on your evaluation benchmark, which is ever growing.
Although I uh forgot a step: I’m describing the initial version of the intelligence search algorithm built by humans, and why it saturates. Later on you just add a test to your ASI gym for the ASI being tested to design a better intelligence, and give it all the data from the prior runs. Still, the ASI is going to be limited by (a, d, e, f, g).
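For the "strength is the harmonic mean of scores" definition in the summary, a small sketch of why the choice of mean matters. The task names and scores are invented, and it assumes scores are normalized to a positive range; the only library call is `statistics.harmonic_mean` from the Python standard library.

```python
from statistics import harmonic_mean, mean


def strength(scores):
    """Harmonic mean of per-task benchmark scores (scores assumed positive)."""
    return harmonic_mean(list(scores.values()))


# Two hypothetical candidates with the same arithmetic mean (0.8).
balanced = {"coding": 0.8, "planning": 0.8, "robotics": 0.8}
lopsided = {"coding": 1.0, "planning": 1.0, "robotics": 0.4}

if __name__ == "__main__":
    for name, scores in [("balanced", balanced), ("lopsided", lopsided)]:
        print(name, "arithmetic:", mean(scores.values()), "harmonic:", strength(scores))
```

The harmonic mean drags the lopsided candidate down to about 0.67 while the balanced one stays at 0.8, so a candidate can't buy a high strength score by acing a few tasks and failing the rest. That property matters more as tests keep getting added to the benchmark.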