Yeah. I vibecoded a few games, and the new bottleneck became editing levels. Then I vibecoded a level editor, and the new bottleneck became coming up with good ideas for the levels.
Viliam
If we look at the details, how exactly is the mini-city supposed to teach the school subjects?
Grammar—will the kids check each other’s grammar, or will they ignore it, or even develop some kind of pidgin?
Literature—if we reduce it to Paw Patrol and maybe Harry Potter, why not? Forget Shakespeare though.
Mathematics—arithmetic, maybe, but explain to me how anything else is useful in the mini-city.
Geography—how is teaching about the world outside the mini-city relevant to life in the mini-city?
Chemistry—will we let the kids experiment freely, until some of them get hurt by poison or explosions?
It does not make much sense from this perspective.
I agree that the kids might learn reading, writing, and arithmetic, but is that all we hope they will learn?
it feels like the highest-leverage thing I do.
How specifically do you plan to leverage it? (Maybe could be a full post, especially if you actually do it and get some interesting results.)
Just some guesses:
Our AI can parse documents with xx% accuracy [...] the people receiving these claims (primarily investors, but also buyers/procurement teams) have no real means to reliably verify these claims
What they should do, in my opinion, is have a few actual documents prepared (maybe with actual names of people and companies replaced), and try them.
To make it easier for them, an AI could be used to evaluate the correctness of the answers. Make the AI under test answer a few questions related to the document. Then take its output, and use your current AI to check the answers—by giving it the output and the human-written answers from your local expert, and asking whether the answers match, and to summarize the nature of any disagreement (just in case a mismatch is caused by the tested AI providing a better or more nuanced answer).
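The checking loop above can be sketched in a few lines. This is only an illustration: the function names, prompt wording, and MATCH/MISMATCH labels are my own invention, and the actual LLM call is left out entirely.

```python
# Sketch of the answer-checking loop described above. Prompt wording and
# verdict labels are illustrative; the real LLM call is deliberately omitted.

def build_judge_prompt(question: str, expert_answer: str, model_answer: str) -> str:
    """Ask the vendor's AI to compare the tested AI's answer with the expert's."""
    return (
        f"Question about the document: {question}\n"
        f"Answer from our human expert: {expert_answer}\n"
        f"Answer from the AI under test: {model_answer}\n"
        "Do these answers match? Reply with MATCH or MISMATCH on the first line,\n"
        "then briefly describe the nature of any disagreement\n"
        "(the tested AI may have given a better or more nuanced answer)."
    )

def parse_verdict(judge_reply: str) -> tuple[bool, str]:
    """Split the judge's reply into (answers_match, explanation)."""
    first_line, _, rest = judge_reply.partition("\n")
    return first_line.strip().upper() == "MATCH", rest.strip()

# Example with a canned judge reply (no real model call):
ok, why = parse_verdict("MISMATCH\nThe tested AI quoted a different contract date.")
```

The point of the second step is that a human only needs to read the explanations for the mismatches, not every answer.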
However, I suspect that in the real world the actual mechanism that companies use to decide which products to buy is… much less related to the actual quality of the product, which makes this kind of testing irrelevant.
I know that calling some things “fraud” is exaggerated, but there should be some short word for: “your business strategy depends on your customers making some cognitive mistakes you are nudging them towards”.
And yeah, it’s not a yes-or-no question, but it would be interesting to estimate what percentage of income you would lose if your customers could compensate for those cognitive mistakes. (For example, if they had augmented reality glasses that would immediately calculate the actual usable volume, and then calculate the price per unit, etc.) There is something bad if the percentage is too high, in my opinion, even if it’s perfectly legal.
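The AR-glasses calculation is simple arithmetic; a minimal sketch with made-up numbers:

```python
def unit_price(shelf_price: float, advertised_volume: float, usable_volume: float):
    """Price per unit of volume, apparent vs. actual.

    Volumes are in the same units (e.g. liters); the numbers below
    are made up for illustration.
    """
    apparent = shelf_price / advertised_volume  # what the label suggests
    actual = shelf_price / usable_volume        # what you really pay per liter
    return apparent, actual

# A box advertised as 1.0 L that only holds 0.8 L of product:
apparent, actual = unit_price(4.00, 1.0, 0.8)
print(f"apparent: {apparent:.2f}/L, actual: {actual:.2f}/L")  # actual is 25% higher
```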
Christoph Heilig tests 18 different OpenAI models, including GPT-5.4, and finds that all of them rate some forms of pseudo-literary nonsense generations higher than coherent prose in every case; this is not getting better over time.
I couldn’t find examples of the highly rated poems. Is this analogous to those image generators that created dog-shaped monstrosities under the assumption that “more eyes = more dog”?
If AI makes everyone twice as productive, one possibility is that everyone works half as hard. Another possibility is that everyone works twice as hard, on top of being twice as productive, because half of the people are about to get fired, or because people see that you’ve completed your other work faster and suddenly give you more tasks.
This is the usual backward-bending labor supply curve, only typically we see it from the perspective of “what happens if we start paying people more”, but this time we see it from the opposite end.
Before the age of massive unemployment and starvation, there will be the age of massive voluntary overtime and slavery, as people try to avoid falling into the former. Say goodbye to 40-hour workweeks, but not in the sense that Keynes predicted.
The standard cope for “what happens if 1 person is able to do the work of 100 using an AI?” is “try to be that 1 person, and you can get super rich”. But it could be more like “what happens if 1 person is able to do the work of 100, but 3 out of the 100 people are able to do that?”, and suddenly you may find yourself competing not just on productivity but also on low cost and voluntary overtime.
Echo chambers are pretty obvious and have existed long before social media
Yeah, the internet gives you a wide choice of echo chambers, so you are not limited to the ones you were born into or the ones you could find around you.
Even better, you can visit multiple echo chambers pseudonymously, which would be very difficult to do in real life, as many echo chambers would punish it.
Yeah, when I think about “AI takeover”, I imagine a very strong and smart AI, the kind for which success is more plausible. But before we get strong AIs, we will have weak AIs, so the first takeover attempts will be made by them. Maybe even the first successful takeover attempt.
A very strong and smart AI would, however, do a thousand different things at the same time. Unlike the chess bot, which only plays on one chessboard, the AI could e.g. have separate plans for taking over each specific country. Many plans to take over one specific country would not interfere with the plans to take over another country. And if you take over one, you can start building secret underground datacenters and killbot factories there. Although there will still be the option to sacrifice one plan in order to help another; for example, you might help a stronger country destroy the datacenters in a weaker country, if doing so helps you infiltrate the stronger country and later build better datacenters there.
Even taking over a single country can involve a hundred plans running in parallel: infiltrating various groups and trying to take leadership of them, and only afterwards trying to give more power to those which were successfully infiltrated. Even within one group, each candidate for leader could have a different mysterious friend on the phone helping them defeat their competitors; where all the mysterious friends happen to be the same AI. With a sufficiently large capacity, the AI could try to subvert every individual human. While trying to hack all existing systems, etc.
When I vibecode, I also sometimes tell the AI to write documentation about how it solved things and why.
The AI can build the theory, without you having to read it.
Do you know a person who regularly tries doing new things on a computer, and isn’t somehow connected to the “TESCREAL” circle? (At least in the sense of “used to read sci-fi when young”?)
It is quite easy to underestimate what the LLMs can do, if you simply never use them, and only get your opinions from other people who never use them either.
AI can already reason about the application that it wrote for me yesterday, so I am already convinced that it is not merely looking up answers in a pre-existing database. (Even if many people had similar ideas before, they didn’t use the same names for the objects.)
AI can communicate fluently in Esperanto, but then, there are already hundreds of Esperanto books in digital form.
I have designed a puzzle game, and then AI successfully solved a few levels.
...so I don’t need any more evidence. But it could be useful for other people.
You don’t have to invent a new language; it would be enough to tell the AI to replace some words with emojis, etc. Let the person you want to convince choose which ones; that should virtually guarantee that those exact sentences are not in the training data.
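The substitution itself is trivial to set up; a minimal sketch (the word→emoji mapping is made up, and would in practice be chosen by the skeptic):

```python
import re

def emojify(text: str, mapping: dict[str, str]) -> str:
    """Replace whole words with emojis, case-insensitively."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in mapping) + r")\b",
        re.IGNORECASE,
    )
    # Look up each matched word (lowercased) in the mapping.
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)

mapping = {"dog": "🐕", "rain": "🌧️"}  # chosen by the person you want to convince
print(emojify("The dog hates rain.", mapping))  # → The 🐕 hates 🌧️.
```

You then ask the AI questions in this hybrid language; the exact token sequences are very unlikely to appear anywhere in its training data.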
No comment on version 5.4 specifically, but as they say, the future is already here, it’s just unevenly distributed. In this curious case, the thing that causes the uneven distribution isn’t money (speaking of people in developed countries, where most families can easily afford $20 a month), but… dunno, willingness to give it a try?
I don’t remember whether it was the same in the past, e.g. whether people used to insist that Google search was just a fad that would soon go away. But I remember how, 25 years ago, many people in their 40s insisted that it did not make sense for them to learn to use a computer, because by the time computers would be used in most workplaces, they would already be retired.
I am surprised by the relatively Luddite attitude even in, e.g., the Hacker News comment section.
Some people will wake up when they see robots walking in the streets, but not any moment sooner.
But I think most humans have more empathy than sadism.
People who end up in positions of power are not necessarily like most humans.
More people give a little to charity than spit on the homeless for fun.
In your WEIRD bubble, sure. In other times and places, people used to burn cats for fun. And empathy used to be limited to one’s peers.
I have secretly chosen a whole number between 1 and 100.
Here is where you should probe whether it was lying.
Sounds like you have enough material for another interesting article!
You might also ruin your reputation by accidentally guessing only the secret info that is not available to them.
Imagine that there are alien spaceships in both Area 51 and Area 52. Your friend only has security clearance to know about Area 52. You only figure out the information about Area 51. After telling your friend, they will update towards you being wrong.
Yeah, I also think that AGI might be only a skill away.
Next time you link a 40-minute video with an introduction that is unrelated to the point you are making, could you please add a starting time? I watched the first 10 minutes, then gave up.
So, not sure whether this is relevant, but to me “multi-lateralism” sounds like a dog whistle for making Russia great again. At least, whenever people mention something like that, in my experience it always implies that we should somehow help Russia become a world superpower. I mean, people talk about the world being unilateral or multilateral, but when you listen to them for a longer time, it becomes clear that they would consider a world with the USA and Russia as the only big players to be sufficiently multilateral, while a world with e.g. the USA, the EU, China, and India as big players and Russia as a small player would be insufficiently multilateral for them.
From my perspective… well, the world in the novel 1984 was technically multilateral, so it is not necessarily a good thing.
I mean, I am quite happily using StayFree despite them selling my data.
If you give a fully informed consent, that’s okay for me.
I disagree about the meta-evidence. (Though I agree that the person may actually believe it.)
If you were in a position X, and after receiving a lot of evidence you switched to Y, it could mean that the truth is way beyond Y… or it could mean that you have overreacted and the truth is somewhere between X and Y… or it could mean that you have updated precisely and the truth is approximately Y. All these options are plausible.
Even the fact that you received a lot of evidence for Y could mean that Y is correct… or that you have accidentally entered an echo chamber of Y. You could even have left the echo chamber for X and entered the echo chamber for Y.
My general point is that your current position is what you believe is the best position. But also counting the history of how you got there, that’s double-counting evidence. The history of how you got there is already a part of why you are there. (Aaaah, this does not explain my point well, but I can’t find good words right now.)