If the pie is bigger, the only possible problem is bad politics. There is no technical AI challenge here. There might be a technical economic problem, but in any case it’s unrelated to the skill set of AI people. Bundling is not good, and this article bundles economic and political problems into AI alignment.
Edge AI is the only scenario where AI can self-replicate and be somewhat self-sufficient without a big institution, though? It’s bad for AI dominion risk, good for political centralization risk.
I’ve long taken to using GreaterWrong. Give it a try; it’s lighter and more featureful.
But the outside view that LLMs are hitting a wall and are “stochastic parrots” is true? GPT-4o has been weaker and cheaper than GPT-4T in my experience, and the same holds for GPT-4T vs. GPT-4. The two versions of GPT-4 seem about the same. Opus is a bit stronger than GPT-4, but not by much and not in every topic. Both Opus and GPT-4 exhibit the patterns of a stochastic autocompleter, not a logician. (Humans aren’t that much better, of course. People are terrible at even trivial math. Logic and creativity are difficult.) DALL-E etc. don’t really have an artistic sense, and still need prompt engineering to produce beautiful art. Gemini 1.5 Pro is even weaker than GPT-4, and I’ve heard Gemini Ultra has been retired from public access. All of these models get worse as their context grows, and their grasp of long-range dependencies is terrible.
The pace is of course still not too bad compared with other technologies, but there don’t seem to be any long-context “Q*” GPT-5s in store from any company.
PS: Does lmsys do anything to control for the speed effect? GPT-4o is very fast, and that alone should be responsible for many Elo points.
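To put “many Elo points” in perspective, here is a back-of-the-envelope sketch using the standard Elo formula; the 55-60% preference shift is a made-up illustration, not measured lmsys data.

```python
import math

def elo_gap(win_rate: float) -> float:
    """Elo rating gap implied by a head-to-head win rate, per the standard Elo model."""
    return 400 * math.log10(win_rate / (1 - win_rate))

# Hypothetical numbers: if speed alone nudged raters from 50% to 55-60% preference
# for the faster model, the implied rating gap would already be sizable.
for wr in (0.55, 0.60):
    print(f"win rate {wr:.0%} -> ~{elo_gap(wr):.0f} Elo")
# win rate 55% -> ~35 Elo
# win rate 60% -> ~70 Elo
```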
Persuasive AI voices might just make all voices less persuasive. Modern life is full of such fake superstimuli anyway.
Can you create a podcast of posts read aloud by AI? They’re difficult to consume otherwise.
I doubt this. Test-based admissions don’t benefit from tutoring (at the highest percentiles, compared to fewer hours of disciplined self-study) IMO. We Asians just like to optimize the hell out of tests, and most parents aren’t sure whether tutoring helps, so they register their children for many extra classes. Outside of the US, there aren’t that many alternative paths to success, and the prestige of scholarship is also higher.
Also, tests are somewhat robust to Goodharting, unlike most other measures. If the tests eat your childhood, you’ll at least learn a thing or two. I think this is because the Goodhartable parts are easy enough that all the high-g people learn them quickly in the first years of schooling, so the effort goes into actually learning the material by doing more advanced exercises. Solving multiple-choice math questions by “wrong” methods that only work for multiple-choice questions is also educational and can come in handy in real work.
AGI might increase the risk of totalitarianism. OTOH, a shift in the attack-defense balance could potentially boost the veto power of individuals, so it might also work as a deterrent or a force for anarchy.
This is not the crux of my argument, however. The current regulatory Overton window seems to heavily favor a selective pause of AGI, such that centralized powers will continue ahead, even if more slowly due to their inherent inefficiencies. Nuclear development provides further historical evidence for this. Closed AGI development will almost surely lead to a dystopic totalitarian regime. The track record of LessWrong is not rosy here; the “Pivotal Act” still seems to be in popular favor, and OpenAI has significantly accelerated closed AGI development while lobbying to close off open research and pioneering the new “AI Safety” that, as of 2024, has been nothing but censorship and double-think.
A core disagreement is over “more doomed.” Human extinction is preferable to a totalitarian stagnant state. I believe that people pushing for totalitarianism have never lived under it.
ChatGPT isn’t a substitute for a NYT subscription. It wouldn’t work at all without browsing, and with browsing enabled it would probably get blocked, both by NYT via its user agent and by OpenAI’s “alignment.” Even if it didn’t get blocked, it would be slower than skimming the article manually, and its output wouldn’t be trustworthy.
OTOH, NYT can spend pennies to put an AI TLDR at the top of each of their pages. They can even use their own models, as Semantic Scholar does. Anybody cost-conscious enough to prefer the much worse experience of ChatGPT would not have paid NYT in the first place; you can bypass the paywall trivially.
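For illustration, a minimal sketch of the kind of TLDR pipeline I mean, using the OpenAI Python SDK; the model name and prompt are assumptions, not anything NYT actually runs.

```python
# Minimal TLDR sketch, assuming the OpenAI Python SDK (openai>=1.0).
# The model and prompt are placeholders; a publisher could plug in its own model instead.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def tldr(article_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed cheap model; fractions of a cent per article
        messages=[
            {"role": "system", "content": "Summarize the article below in three short bullet points."},
            {"role": "user", "content": article_text},
        ],
    )
    return response.choices[0].message.content

# print(tldr(open("article.txt").read()))
```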
In fact, why don’t NYT authors write a TLDR themselves? Most of their articles are not worth reading. Isn’t the lack of a summary an anti-user feature to artificially inflate their offering’s volume?
NYT would, if anything, benefit from LLMs potentially degrading the average quality of the competing free alternatives.
The counterfactual version of GPT-4 that did not have NYT in its training data is extremely unlikely to have been a worse model. It’s like removing a few grains of sand from a mountain.
The whole case is an example of rent-seeking post-capitalism.
This is unrealistic. It assumes:
- Orders of magnitude more intelligence
- That such intelligence would actually be useful in a physical world with physical limits
The more worrying prospect is that the AI might not necessarily fear suicide. Suicidal actions are quite prevalent among humans, after all.
In estimated order of importance:
1. Just trying harder for years to build better habits (i.e., not giving up on boosting my productivity as a lost cause)
2. Time tracking
3. (Trying to) abandon social media
4. Exercising (running)
5. Having a better understanding of how to achieve my goals
6. Socializing with more productive people
7. Accepting real responsibilities that make me accountable to other people
8. Keeping a daily journal of what I have spent each day doing (high-level, as opposed to the low-level time tracking above)
The first two seem the fundamental ones, really. Some of the rest naturally follow from those two (for me).
This is not an “error” per se. It’s a baseline, outside-view argument presented in lay terms.
Is there an RSS feed for the podcast? Spotify is a bad actor in podcasting, trying to centralize and then monopolize the market.
This post has good arguments, but it mixes in a heavy dose of religious evangelism and narcissism, which detracts from its value.
The post would be less controversial and “culty” if it dropped its speculations about second-order effects and its value judgments, and just presented the case that other technical areas of safety research are underrepresented. Focusing on non-technical work needs to be a whole other post, as it’s completely unrelated to interp.
The prior is that dangerous AI will not happen in this decade. I have read a lot of arguments here for years, and I am not convinced that there is a good chance that the null hypothesis is wrong.
GPT-4 can already be said to be an AGI. But it’s weak, slow, and expensive; it has little agency; and it has already used up the high-quality data and tricks such as ensembling. Four years from now, I expect to see a GPT-5.5 whose gap with GPT-4 is about the gap between GPT-4 and GPT-3.5. I absolutely do not expect the context-window problem to get solved in this timeframe, or even this decade. (https://arxiv.org/abs/2307.03172)
Taboo “dignity.”
Another important problem is that while x-risk is speculative and relatively far off, rent-seeking and exploitation are rampant and ever-present. These regulations will make the current ailing politico-economic system much worse, to the detriment of almost everyone. Historically, paying tribute in exchange for safety has usually been a terrible idea.
If they were to exclude all documents with the canary, everyone would include the canary to avoid being scraped.
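For concreteness, the mechanism under discussion is a scrape-time filter that drops any document containing the canary string. A minimal sketch (the canary constant below is a placeholder, not the real GUID):

```python
# Sketch of a scrape-time filter that drops documents containing a canary string.
# CANARY is a placeholder; real canaries (e.g., BIG-bench's) are long, unique GUIDs.
CANARY = "EXAMPLE-CANARY-GUID-DO-NOT-TRAIN"

def keep_for_training(document: str) -> bool:
    return CANARY not in document

corpus = ["an ordinary web page", f"a benchmark page containing {CANARY}"]
training_set = [doc for doc in corpus if keep_for_training(doc)]
# If labs honored this filter, any site could opt out of scraping by pasting CANARY
# into its pages, which is exactly the incentive problem noted above.
```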