You don’t think you would need to evaluate a large number of “ASI candidates” to find an architecture that scales to superintelligence? What I mean is that you can describe every architectural choice you make as a single string, or “search space coordinate”. You would use smaller models and proxy tasks, but you would still need to train and evaluate each of those smaller models.
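To make the “search space coordinate” idea concrete, here is a minimal sketch. The design axes and their options below are made up purely for illustration (nobody knows what the real axes would be); the point is only that each combination of choices maps to one string, and each string is a candidate that has to be trained and evaluated.

```python
from itertools import product

# Hypothetical design axes; the names and options are illustrative assumptions,
# not a claim about what a real architecture search would contain.
SEARCH_SPACE = {
    "attention": ["dense", "sparse", "linear"],
    "depth": ["shallow", "deep"],
    "memory": ["none", "external"],
}


def encode(choices):
    """Turn a dict of architecture choices into a single coordinate string."""
    return "|".join(f"{axis}={choices[axis]}" for axis in sorted(SEARCH_SPACE))


def decode(coordinate):
    """Recover the dict of choices from a coordinate string."""
    return dict(part.split("=", 1) for part in coordinate.split("|"))


def all_coordinates():
    """Yield every coordinate in the (tiny, illustrative) search space."""
    axes = sorted(SEARCH_SPACE)
    for combo in product(*(SEARCH_SPACE[axis] for axis in axes)):
        yield encode(dict(zip(axes, combo)))
```

Even this toy space has 3 × 2 × 2 = 12 candidates, and the count multiplies with every axis you add, which is why the number of candidates to evaluate worries me.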
All these failures might eat a lot of compute. How many failures do you think we would have? What if it were 10,000 failures, and we needed to reach GPT-4 scale to evaluate each candidate?
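A rough way to put numbers on this, measured in units of “one GPT-4-scale training run” so that no absolute FLOP figure has to be assumed. The function name and the `proxy_fraction` parameter are my own invention for illustration: `proxy_fraction` is the cost of evaluating one candidate relative to a full-scale run (1.0 means every candidate must be trained at full scale).

```python
def search_cost_in_full_runs(n_failures, proxy_fraction=1.0, n_successes=1):
    """Total search cost, in units of one full-scale training run.

    n_failures:     candidates that were trained and evaluated but did not work
    proxy_fraction: cost of one candidate evaluation relative to a full-scale
                    run (an assumption; 1.0 means no cheaper proxy exists)
    n_successes:    candidates that did work (still cost compute to evaluate)
    """
    if not 0 < proxy_fraction <= 1:
        raise ValueError("proxy_fraction must be in (0, 1]")
    return (n_failures + n_successes) * proxy_fraction


# Worst case from the question: every candidate needs a full-scale run.
worst_case = search_cost_in_full_runs(10_000)

# If a proxy at 1/1000 of the cost were predictive enough (a big "if").
with_proxy = search_cost_in_full_runs(10_000, proxy_fraction=0.001)
```

Under these assumptions the worst case is about 10,001 full-scale runs, while a 1/1000-cost proxy brings it down to roughly 10 full-scale runs. So the whole question hinges on whether small proxies actually predict behavior at scale.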
Also, would “idiosyncratic interconnect” limit which tasks the model is superintelligent at? It seems to imply a limit on how much information can be considered in one context. That might leave the model less than superintelligent at very complex, tightly coupled tasks like “keep this human patient alive”, while less coupled tasks like “design this IC from scratch” would still work. (The chip design task is less coupled because you can subdivide the design into modules separated by interfaces and use a separate ASI session for each module.)