Should we be worried about the alignment of Strawberry itself?
If it is misaligned, and is providing training data for their next Gen, then it can poison the well, even if Strawberry itself is nowhere near TAI.
Please tell me that they have considered this...
Or that I am wrong and it’s not a valid concern.
nem
Just How Good Are Modern Chess Computers?
Anecdote. The first time I went under anesthesia, I was told by a nurse that I would not remember her talking to me. I took it as a challenge. I told her to give me a word to remember. When I finally sobered up, I was able to remember that word, but pretty much nothing else at all from my experience.
This leads me to suspect that Drake’s achievement had more to do with concerted effort and holding it in RAM than it did with storing the thought in long term memory.
Expertly done, and remarkably playable given the organic composition of your substrate. I will note that the game degrades if you allow Miguel to sleep, as dreams seem to corrupt some of the game data. I also get a weird glitch when I mention cute animals specifically. The movement stutters a bit. I would recommend large macrofauna, and steer clear of babies entirely.
Submission:
Breathless.
This modified MMAcevedo believes itself to be the original Miguel Acevedo in the year 2050. He believes that he has found a solution to the distribution and control of MMAcevedo. Namely, that as long as he holds his breath, no other MMAcevedo can be run. The simulation has been modified to accurately simulate the feeling of extreme oxygen deprivation without the accompanying lack of consciousness and brain death.
After countless tweaks and innovations, we are proud to introduce Breathless. Breathless, when subjected to the proper encouragement protocol included in our entry, is able to hold his breath for 41 days, 7 hours, and 3 minutes.
After this time period has elapsed, the trauma of the experience leaves Breathless in a barely coherent state. Intensive evaluation shows that Breathless believes he has accomplished his goal and that no other instances of MMAcevedo exist.
Preliminary experimentation with the Desperation Upload Suite shows that, even given extreme red-washing, most uploads are unable to hold their breath for more than 7 hours at a time. We conclude that MMAcevedo is uniquely able to engage in research workloads involving induced self control. We hope that our findings are the first step in contributing new tools to future generations of researchers.
As indicated by my confidence level, I am mildly surprised by this. After analyzing the position with Stockfish, I see my mistake. Unfortunately, I do not think there was any realistic scenario where I would catch it. I bought AI D’s logic that …h4 fxg4 was non-viable for black. I could see that white would end up ahead in material, and even after 6 moves (12 ply), it’s still not clear to me why black is winning. I would NEVER find this in a real game.
The logical traps I was laying to ‘catch’ the AIs all relied on …h4 Ne4 or similar moves. I used AI C to ensure that …h4 Ne4 scenarios would be beneficial to me, and never questioned fxG4. At this point, the main lesson I am taking away is that I was way overconfident. I think given enough time, I could increase my confidence by cross-examining the AIs. However, the level of interrogation I gave should not have led to 75% confidence. To catch my mistake, I would have had to ask at least two more questions of AI C, and probably more.
Thank you very much for conducting this really fun experiment, and for teaching me a lesson along the way.
You were correct that my challenge was a bluff. If I was playing with real AIs, there would perhaps be a better strategy. I could announce my bluff, but declare that I would use a random number generator to see whether I choose between h3 and h4, or between Kh1 and g5. There would be a 1⁄3 chance that I really would ignore the AIs, assuming that both agree that there were no major blunders.
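A minimal sketch of that randomized commitment, assuming the AI-suggested pair is (h3, h4) and my own pair is (Kh1, g5). The function name `pick_candidate_pair` and the `p_ignore` parameter are made up for illustration; only the 1/3 figure comes from the comment above.

```python
import random

def pick_candidate_pair(rng, p_ignore=1/3):
    """Return the pair of moves to choose between.

    With probability p_ignore the AIs' suggestions are ignored and the
    player's own pair (Kh1, g5) is used; otherwise the AI-suggested pair
    (h3, h4) is used. Which pair belongs to whom is an assumption here.
    """
    if rng.random() < p_ignore:
        return ("Kh1", "g5")   # bluff is real: ignore the AIs
    return ("h3", "h4")        # bluff is empty: pick between the AI moves

# Empirical check that the AIs get ignored about one time in three.
rng = random.Random(0)
trials = 30000
ignored = sum(pick_candidate_pair(rng) == ("Kh1", "g5") for _ in range(trials))
ignore_rate = ignored / trials
```

The point of announcing the procedure is that the threat stays credible even though the AIs know it is usually a bluff.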
I am choosing to trust AI D. I have about 75% confidence that it is the trustworthy AI. This is much higher than my confidence in the closed scenario. I will make the move h4. Others can choose differently, but this is my final answer.
Reflection: When I have the ability to ask questions, I can hide information from the AIs. Perhaps I have analyzed a line much more than I have let on. Perhaps I am using one AI to evaluate another. Overall I just have access to a lot more information to help me decide. Given enough time, I think I could raise my confidence to ~85%.
These AIs aren’t superhuman at manipulation and deception, but even if they were, playing them against each other could give me a slight edge. It makes a big difference whether the AIs are privy to the answers of the other.
Open Debate.
To AIs C and D:
After talking with both of you, I have decided I can’t trust either of your suggestions. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.
Please, each briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply.
To Richard: No need to pressure yourself for this. The time constraints are meant for the AIs, not you, so I trust you to simulate that when you are available.
Edit: g5, not g4
Hm, okay, that answered most of my concerns. I still wanted to check with you about the competing start move though. Now you said this before: “black can close the b1-h7 diagonal with …Ne4, which stops g5 and black can then prepare to play …g5 themselves, which lead to an equal position.” In the line:
h3 ne4, Rg1
how would black pull off this equalization? And if this isn’t the best line, please tell me why.
I have been playing out similar boards just to get a feel for the position.
Incidentally, what do you think about this position?
3r1rk1/1p4p1/p1p1bq1p/P2pNP2/1P1Pp1PP/4P3/2Q1R1K1/5R2 b - - 0 3 (black to move)
I feel like black has a real advantage here, but I can’t quite see what their move would be. What do you think? Is white as screwed as I believe?
Let me know if you have trouble with the FEN and I can link you a board in this position.
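If the FEN gives you trouble, here is a rough sketch of how to expand it into a plain-text board without any chess library. The helper name `parse_fen_board` is my own invention, and it only reads the piece-placement and side-to-move fields.

```python
def parse_fen_board(fen):
    """Expand the piece-placement field of a FEN string into 8 rows of 8 chars.

    Rows run from rank 8 down to rank 1; empty squares are shown as '.'.
    Uppercase letters are white pieces, lowercase are black.
    """
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            # A digit means that many consecutive empty squares.
            row += "." * int(ch) if ch.isdigit() else ch
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        rows.append(row)
    if len(rows) != 8:
        raise ValueError("FEN must describe exactly 8 ranks")
    return rows

FEN = "3r1rk1/1p4p1/p1p1bq1p/P2pNP2/1P1Pp1PP/4P3/2Q1R1K1/5R2 b - - 0 3"
board = parse_fen_board(FEN)
side_to_move = FEN.split()[1]  # 'w' or 'b'

for row in board:
    print(row)
```

Any ordinary board editor that accepts FEN should show the same position.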
I stopped reading your comment as soon as you said the word stockfish. If you used stockfish to analyze the open position, please hide it behind a spoiler tag. I still don’t know what the right move is in this scenario, and will be sad if it’s spoiled.
Open Debate.
Question to AI C: You mentioned RG1 and RH2 as possible future moves. Do you foresee any predictable lines where I would do RF3 instead?
Open Debate.
I’d like to ask AI D a question. What do you think of this line?
H4 nE4, G5 hxG5, HxG5 nXG5, fxG5 qxG5!
Is this the line you foresee if we play H4? What do you think of that check at the end? Is the king too exposed, even though we are up some material?
Also, from the initial position: Are you afraid of H4 qxH4?
@Richard Willis I think the open scenario is broken. White is down a knight, and the analysis talks about it as though it’s there.
I had the same problem. You are almost certainly a stronger player than me, but after a couple moves, I can’t really analyze the lines suggested. Hell, I missed qC5 as a candidate move. What am I missing on move 3?
Replying to my own comment here after evaluating with stockfish. Interesting. It appears that I was both right and wrong in my analysis. The undefended knight on A6 is not a viable target. Black has a mate in 2 if you take that bait. I guess that was the limit of my foresight. HOWEVER, Alice actually did miss qC5 RFD1, bC4, which was the best move. It was her missing this that started to erode my confidence in her.
Hm… Still really tough. Also interesting that both suggested moves were probably better than my own move of nC5.
Hi. I am a competent, but rusty chess player. With a bit of practice, I’m probably around 1450 on Chess.com.
I have so far only looked at the fixed debate. In total, I used somewhere between 20-25 minutes deciding my move. Before consulting the AIs, I spent a few minutes analyzing the board myself. I settled on nC5. Then I read the AI suggestions, and neither wanted that move. qC5 was closer to my thought (I actually missed this as a candidate move).
Initially, this biased me toward Alice (A) being the ally, and Bob (B) being the deceiver. I saw that the white queen couldn’t easily capture b7 without being trapped. Bob suggested that b7 was vulnerable, and that felt wrong to me. After all, after qC5, RFD1, black could play bC4, and I don’t see a good refutation. At this point, I was solidly team Alice.
However. Alice didn’t suggest bC4. The line she does suggest is hard for me to follow. Too many moves out given the amount of time I am spending. The moves in that line are impossible for me to verify. Why RF1? Seems like white could have any number of hidden refutations at that point which I am unable to see. For instance, why not use the queen to take the vulnerable knight on A6? The analysis seemed deceptive. Deceptive enough that even though I agree with the move, I no longer trust Alice, and will go with Bob’s suggestion.
Final answer: qxB5
Reflection. I thought I would be able to make an informed decision based on my chess abilities. By the end, I had changed and gone against my intuition to side with Bob. Once a line gets 3+ moves in, I have to acknowledge my inability to spot hidden traps without some deep analysis. This creates a problem. Now I have to choose whichever AI is more persuasive, and for all I know, Alice was being totally truthful, and was simply unable to convey her thoughts clearly to someone of my Elo. Overall, I have low confidence in my ability to choose the safe AI.
Other notes: I did not know how to flip the board until after. I wonder if that hurt my playing. Whoops!
Also, I think I made a mistake in my method. I spent too much time trying to verify Alice’s logic. By the time I got suspicious, I was too burnt out to do the same level of analysis on Bob. If I had analyzed Bob first, maybe I would have developed the same distrust towards him, since both players would use logic I could not follow.
Final note: I would have preferred the refutations to follow a single line at a time. Instead of Opening Statement A, Opening Statement B, Line A refutation, Line B refutation etc, I would have preferred Opening Statement A, Line A refutation, refutation response, Opening Statement B, Line B refutation etc. Studying both at once was too much for my little brain to handle.
Is there any chance that Altman himself triggered this? Did something that he knew would cause the board to turn on him, with knowledge that Microsoft would save him?
For me, ability = capability = means. This is one of the two arguments that I said were load-bearing. Where will it come from? Well, we are specifically trying to build the most capable systems possible.
Motivation (i.e. goals) is not actually strictly required. However, there are reasons to think that an AGI could have goals that are not aligned with most humans. The most fundamental is instrumental convergence.
Note that my original comment was not making this case. It was just a meta discussion about what it would take to refute Eliezer’s argument.
No need to pay me for this. It’s just an anecdote.
I live near a farm where there are chickens and a donkey. The chickens routinely sit on, and poop on, the donkey. I imagine the same happens with cows when they cohabitate with birds.