Johannes C. Mayer
My mind derives pleasure from deep philosophical and technical discussions.
In my model, flirting is about showing that you are paying attention. You say things that you could only pick up if you pay close attention to me and what I say. It’s like a cryptographic proof certificate, showing that you think I am important enough to pay attention to continuously. Usually this is coupled with an optimization process of using that knowledge to make me feel good, e.g. giving a compliment that actually tracks reality in a way I care about.
It’s more general than just showing sexual interest I think.
I don’t use it to write code, or really anything. Rather, I find it useful to converse with it. My experience is also that half of what it says is wrong and that it makes many dumb mistakes. But having the conversation is still extremely valuable, because GPT often makes me aware of existing ideas that I don’t know about. Also, like you say, it can get many things right and then later get them wrong. That getting-right part is what’s useful to me. The part where I tell it to write all my code is just not a thing I do. Usually I just have it write snippets, and it seems pretty good at that.
Overall I am like: “Look, there are so many useful things that GPT tells me and helps me think about, simply by having a conversation.” Then somebody else says: “But look, it gets so many things wrong. Even quite basic things.” And I am like: “Yes, but the useful things are still so useful that overall it’s totally worth it.”
Maybe for your use case try codex.
Why don’t you run the test yourself? It seems very easy.
Yes, it does catch me quite often when I am saying wrong things. It also quite often says things that are not correct and I correct it, and if I am right it usually agrees immediately.
Their romantic partner offering lots of value in other ways. I’m skeptical of this one because female partners are typically notoriously high maintenance in money, attention, and emotional labor. Sure, she might be great in a lot of ways, but it’s hard for that to add up enough to outweigh the usual costs.
Imagine a woman is in a romantic relationship with somebody else. Are they still so great a person that you would still enjoy hanging out with them as a friend? If not, that woman should not be your girlfriend. Friendship first. At least in my model, romantic stuff should be stacked on top of platonic love.
Don’t spend all your time compressing knowledge that’s not that useful to begin with, if there are higher value things to be learned.
An extreme test of how much you trust a person’s intentions is to consider whether you would upload them, if you could. They would then presumably be the only (speed) superintelligence.
Maybe a better name: Let me help debug your math via programming
If you’ve tried this earnestly 3 times, after the 3rd time, I think it’s fine to switch to just trying to solve the level however you want (i.e. moving your character around the screen, experimenting).
After you have failed 3 times, wouldn’t it be a better exercise to just play around in the level until you get a new piece of information that you predict will allow you to formulate better plans, and then step back into planning mode again?
Another one: We manage to solve alignment to a significant extent. The AI, which is much smarter than a human, thinks that it is aligned, and takes aligned actions. The AI even predicts that it will never become unaligned with humans. However, at some point in the future, as the AI naturally unrolls into a reflectively stable equilibrium, it becomes unaligned.
Why not AI? Is it that AI alignment is too hard? Or do you think it’s likely one would fall into the “try a bunch of random stuff” paradigm popular in AI, which wouldn’t help much in getting better at solving hard problems?
What do you think about the strategy where, instead of learning from a textbook, e.g. on information theory or compilers, you try to write the textbook yourself and only look at existing material if you are really stuck? That’s my primary learning strategy.
It’s very slow and I probably do it too much, but it allows me to train on solving hard problems that aren’t super hard. If you read all the textbooks, all the remaining practice problems are very hard.
How about we meet, you do research, and I observe, and then try to subtly steer you, ideally such that you learn faster how to do it well. Basically do this, but without it being an interview.
What are some concrete examples of the research that MIRI insufficiently engaged with? Are there general categories of prior research that you think are most underutilized by alignment researchers?
… and Carol’s thoughts run into a blank wall. In the first few seconds, she sees no toeholds, not even a starting point. And so she reflexively flinches away from that problem, and turns back to some easier problems.
I spend ~10 hours trying to teach people how to think. I sometimes try to intentionally cause this to happen. Usually you can recognize it by them going quiet (I usually give the instruction that they should do all their thinking out loud). And this seems to be when actual cognitive labor is happening, instead of them saying things that they already knew. Though usually they by default fail earlier than “realizing the hard parts of ELK”.
Usually I need to tell them that actually they are doing great by thinking about the blank wall more, and shouldn’t switch the topic now.
In fact, it seems to be a good general idea-generation strategy to just write down all the easy ideas first, until you hit this wall, so that you can start to actually think.
Why Physicists are competent
Here is my current model, after thinking about this for 30 minutes, of why physicists are good at solving hard problems (without ever having studied physics extensively myself).
The job description of a physicist is basically “understand the world”, meaning make models that have predictive power over the real world.
This is very different from math. In some sense a lot harder. In math you know everything. There is no uncertainty. And you have a very good method to verify that you are correct. If you have generated a proof, it’s correct. It’s also different from computer science for similar reasons.
But of course physicists need to be very skilled at math, because if you are not skilled at math you can’t make good models that have predictive power. Similarly, physicists need to be good at computer science to implement physical simulations, which often involve complex algorithms. And to be able to actually implement these algorithms such that they are fast enough, and run at all, they need to also be decent at software engineering.
Also, understanding the scientific method is a lot more important when you are a physicist. It’s sort of not required to understand science for doing math and theoretical CS.
Another thing is that physicists need to actually do things that work. You can do some random math that’s not useful at all. It seems harder to make a random model of reality that predicts some aspect of reality that you couldn’t predict before, without having figured out anything important. As a physicist you are actually measured against how reality is. You can’t go “hmm, maybe this just doesn’t work” like in math. Obviously it works somehow, because it’s reality; you just haven’t figured out how to properly capture how reality is in your model.
Perhaps this trains physicists to not give up on problems, because the default assumption is that clearly there must be some way to model some part of reality, because reality is in some sense already a model of itself.
I think this is the most important cognitive skill: not giving up. I think this is much more important than any particular piece of technical knowledge. Having technical knowledge is of course required, but it seems that if you were to not give up on thinking about how to solve a problem (that is hard but important), you would end up learning whatever is required.
And in some sense it is this simple. When I see people run into a wall, and then have them keep staring at that wall, they often have ideas that I like so much that I feel the need to write them down.
I watched this video, and I semi-trust this guy (more than anybody else) not to get it completely wrong. So you can eat too much soy. But eating a bit is actually healthy, is my current model.
Here is also a calculation I did showing that it is possible to get all amino acids from soy without eating too much of it.
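For reference, here is a minimal sketch of how such a calculation could be set up. The numbers below are placeholders I made up for illustration, not nutritional data; you would fill in per-100 g amino acid contents for soy and your own daily requirements from a nutrition database.

```python
# Sketch: how many grams of soy per day would cover the daily requirement
# of each amino acid? All numbers are MADE-UP PLACEHOLDERS, not real data.

daily_requirement_g = {      # grams per day (placeholders)
    "lysine": 1.0,
    "methionine": 1.0,
    "leucine": 1.0,
}

soy_content_g_per_100g = {   # grams per 100 g of soy (placeholders)
    "lysine": 2.0,
    "methionine": 2.0,
    "leucine": 2.0,
}

def soy_needed_g(amino_acid: str) -> float:
    """Grams of soy needed per day to cover one amino acid's requirement."""
    content_per_gram = soy_content_g_per_100g[amino_acid] / 100.0
    return daily_requirement_g[amino_acid] / content_per_gram

# The binding constraint is whichever amino acid needs the most soy.
worst = max(daily_requirement_g, key=soy_needed_g)
print(f"Limiting amino acid: {worst}, soy needed: {soy_needed_g(worst):.0f} g/day")
```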
Haven’t thought about that, nor experimented with it. If you think clams would be OK to eat, you could perform the experiment yourself.
At the 2024 LessWrong Community Weekend I met somebody who I have been working with for perhaps 50 hours so far. They are better at certain programming-related tasks than me, in a way that has provided utility. Before meeting them they were not even considering working on AI alignment related things. The conversation went something like this:
Johannes: What are you working on?
Other Person: Web development. What are you working on?
Johannes: I am trying to understand intelligence such that we can build a system that is capable enough to prevent other misaligned AIs from being built, and that we understand well enough to be sure it wouldn’t kill us. [...] Why are you not working on it?
Other Person: (I forgot what he said.)
Johannes: Oh then now is the perfect time to start working on it.
Other Person: So what are you actually doing?
Johannes: (Describes some methodologies.)
Other Person: (Questions whether these methodologies are actually good, and thinks about how they could be better.)
[...] Actually, this all happened after the event, when traveling from the venue to the train station.
It doesn’t happen that often that I get something really good out of a random meeting. Most of them are bad. However, I think the most important thing I do to get something out of them is to just immediately talk about the things that I am interested in. This efficiently filters out people, either because they are not interested, or because they can’t talk about it.
You can overdo this. Starting a conversation with “AI seems very powerful, I think it will likely destroy the world” can make other people feel awkward (I know from experience). However, the above formula of “what do you do” and then “and I do this” gets to the point very quickly without inducing awkwardness.
Basically you can think of this as making random encounters (like walking back to the train station with randomly sampled people) non-random by always trying to steer any encounter such that it becomes useful.
I probably did it badly. I would eat whole-grain bread pretty regularly, but not consistently. Sometimes I might not eat it for a week in a row. That was before I knew that amino acids are important.
It was ferritin. However, the levels were actually barely within acceptable limits. Because I had started to eat steamed blood every day for perhaps 2 weeks prior, and blood contains a lot of heme iron, I hypothesize that I was deficient before.
Mathematical Notation as Learnable Language
To utilize mathematical notation fully you need to interpret it. To read it fluently, you must map symbols to concrete lenses, e.g. computational, visual, algebraic, or descriptive.
Example: Bilinear Map
Let $f : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ be defined by
$$f((x_1, x_2), (y_1, y_2)) = x_1 y_1 + 2 x_2 y_2.$$
Interpretations:
Computational
Substitute specific vectors and check results. If $v = (3, 4)$, then
$$f((x_1, x_2), v) = 3 x_1 + 8 x_2.$$
Through this symbolic computation we can see how the expression depends on $x$. Perform such computations until you get a feel for the “shape” of the function’s behavior.
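As a minimal sketch of this computational lens (the function name and test points are just my illustration, not from the original):

```python
def f(x, y):
    """The bilinear map f((x1, x2), (y1, y2)) = x1*y1 + 2*x2*y2."""
    return x[0] * y[0] + 2 * x[1] * y[1]

v = (3, 4)

# Plugging in v: f((x1, x2), v) = 3*x1 + 8*x2, so we can probe how the
# value depends on x by evaluating it at a few concrete points.
print(f((1, 0), v))  # 3   -> coefficient of x1
print(f((0, 1), v))  # 8   -> coefficient of x2 (the factor 2 doubles v's 4)
print(f((2, 5), v))  # 46  = 3*2 + 8*5
```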
Visual
For each fixed $v$, the function $u \mapsto f(u, v)$ is represented by a plane through the origin in $\mathbb{R}^2 \times \mathbb{R}$. We can imagine walking on that plane: any straight path in the input keeps us on a straight line on the plane, which is what linearity looks like here.
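A quick way to look at this plane, as a sketch (the plotting setup and the choice $v = (3, 4)$ are mine, carried over from the computational example):

```python
import numpy as np
import matplotlib.pyplot as plt

# For fixed v = (3, 4), u -> f(u, v) = 3*u1 + 8*u2 is a plane through the origin.
u1, u2 = np.meshgrid(np.linspace(-2, 2, 50), np.linspace(-2, 2, 50))
z = 3 * u1 + 8 * u2

ax = plt.figure().add_subplot(projection="3d")
ax.plot_surface(u1, u2, z, alpha=0.7)
ax.set_xlabel("u1"); ax.set_ylabel("u2"); ax.set_zlabel("f(u, v)")
plt.show()
```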
Symbolic manipulation
Verify algebraically:
$$f(a u + b w, v) = (a u_1 + b w_1) v_1 + 2 (a u_2 + b w_2) v_2 = a f(u, v) + b f(w, v).$$
This establishes linearity in the first argument by direct algebraic manipulation. You understand what properties hold by showing them algebraically.
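The same check can be done mechanically with a computer algebra system; here is a minimal sketch using SymPy (my choice of tool, not mentioned in the original):

```python
import sympy as sp

a, b, u1, u2, w1, w2, v1, v2 = sp.symbols("a b u1 u2 w1 w2 v1 v2")

def f(x, y):
    """f((x1, x2), (y1, y2)) = x1*y1 + 2*x2*y2."""
    return x[0] * y[0] + 2 * x[1] * y[1]

u, w, v = (u1, u2), (w1, w2), (v1, v2)
au_plus_bw = (a * u1 + b * w1, a * u2 + b * w2)

# Linearity in the first argument: f(a*u + b*w, v) - (a*f(u,v) + b*f(w,v)) == 0
diff = sp.expand(f(au_plus_bw, v) - (a * f(u, v) + b * f(w, v)))
print(diff)  # 0
```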
Descriptive
What it means to be a bilinear map is that if you hold the second argument fixed and vary the first, you have a linear function. Same if you hold the first fixed and vary the second.
You want to capture the intuition in natural language.
Mathematical language is a language that you need to learn like any other. Often people get stuck by trying to use symbolic manipulation too much. Because mathematical language is so precise, it is easy to interpret it in many different ways while still being able to check whether your interpretation captures the core.