I’m actually working on an AI progress timeline / alignment failure story where the big risk comes from BCI-enabled coordination tech (I’ve sent you the draft if you’re interested). I.e., instead of developing superintelligence, the timeline develops models that can manipulate mood/behavior through a BCI, initially as a cure for depression, then gradually spreading through society as a general mood booster / productivity enhancer, and finally being used to enhance coordination (e.g., make everyone super dedicated to improving company profits without destructive internal politics). The end result is that coordination models are trained via reinforcement learning to maximize profits or other simple metrics and gradually remove non-optimal behaviors in pursuit of those metrics.
This timeline makes the case that AI doesn’t need to be superhuman to pose a risk. The behavior-modifying models manipulate brains through BCIs with far fewer electrodes than the brain has neurons, and they are much less generally capable than human brains. We already have a proof of concept that a similar approach can cure depression, so I think more complex modifications like loyalty/motivation enhancement are possible in the not-too-distant future.
You may also find the section of my timeline addressing trends in AI progress interesting:
My rough mental model for AI capabilities is that they depend on three inputs:
Compute per dollar. This increases at a somewhat sub-exponential rate: the time between 10x increases is itself increasing. We were initially at a ~10x increase every four years, but this has recently slowed to ~10x every 10-16 years (source).
Algorithmic progress in AI. Each year, the compute required to reach a given performance level drops by a constant factor (so far, a factor of 2 every ~16 months) (source). I think improvements to training efficiency drive most of the current gains in AI capabilities, but these gains will eventually fall off as we exhaust the low-hanging fruit.
The money people are willing to invest in AI. This increases as the return on investment in AI increases. Money invested in AI once rose very fast, roughly exponentially, but it has pretty much flattened off since GPT-3. My guess is this quantity follows a sort of stutter-stop pattern: it spikes as people realize algorithmic/hardware improvements make higher investments in AI more worthwhile, then flattens once the new investments exhaust whatever new opportunities progress in hardware/algorithms allowed.
When you combine these somewhat sub-exponentially increasing inputs with the power-law scaling laws discovered so far (see here), you probably get something roughly linear, but with occasional jumps in capability as willingness to invest jumps.
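To make that combination concrete, here is a minimal toy sketch in Python. The growth rates are the rough figures quoted above; the scaling exponent, spending schedule, and baseline values are illustrative assumptions of mine, not fitted numbers:

```python
import numpy as np

# Toy model (illustrative assumptions, not a calibrated forecast): combine
# the three inputs above with a power-law scaling relation to see the rough
# shape of the capability curve.

years = np.arange(0, 31)  # years from today

# 1. Compute per dollar: ~10x every 12 years (the slower recent trend).
compute_per_dollar = 10 ** (years / 12)

# 2. Algorithmic efficiency: compute needed for fixed performance halves
#    every ~16 months, i.e. effective compute doubles every 16 months.
algorithmic_multiplier = 2 ** (years * 12 / 16)

# 3. Willingness to spend: flat for long stretches, with occasional jumps
#    when investors see new opportunities (modelled here as 3x step-ups).
spending = np.where(years < 10, 1.0, np.where(years < 20, 3.0, 9.0))

# Effective training compute is the product of the three inputs.
effective_compute = compute_per_dollar * algorithmic_multiplier * spending

# Power-law scaling law: loss falls as a small negative power of compute,
# so "capability" here grows roughly linearly in log-compute.
alpha = 0.05  # illustrative scaling exponent
capability = alpha * np.log10(effective_compute)

for y in (0, 10, 20, 30):
    print(f"year {y:2d}: capability ~ {capability[y]:.2f}")
```

On these assumptions the curve is roughly a straight line in time, with step changes wherever the spending term jumps, which is the qualitative picture I have in mind.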
I think there’s a reasonable case that AI progress will continue at approximately the same trajectory as it has over the last ~50 years.
What metric would you use to capture the trajectory of AI progress over the last 50 years? And would such a metric be able to bridge the transition from GOFAI to deep learning?
My preferred algorithmic metric would be the compute required to reach a certain performance level. This doesn’t really work for hand-crafted expert systems, but I don’t think those are very informative about future AI trajectories.
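As a sketch of what I mean by that metric, here’s how I’d project it forward, assuming the ~16-month halving time mentioned above (the baseline compute figure is made up for illustration):

```python
# Minimal sketch of the metric: training compute required to hit a fixed
# benchmark score, tracked over time. Assumes the ~16-month halving time
# cited above; the baseline compute figure is purely illustrative.

BASELINE_COMPUTE_FLOP = 1e23   # compute needed to reach the target today (made up)
HALVING_TIME_MONTHS = 16       # empirical halving time for fixed performance

def compute_to_reach_target(months_from_now: float) -> float:
    """Projected compute (FLOP) needed to reach the same performance level."""
    return BASELINE_COMPUTE_FLOP * 0.5 ** (months_from_now / HALVING_TIME_MONTHS)

# On this trend, the same score needs roughly 13x less compute after 5 years.
print(f"{BASELINE_COMPUTE_FLOP / compute_to_reach_target(60):.1f}x less compute after 5 years")
```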