Johannes C. Mayer
Detangle Communicative Writing and Research
One reason why I never finish any blog post is probably that I immediately start writing it. I think it is better to first build a very good understanding of whatever I'm trying to understand. Only when I'm sure I have understood it do I start to create a very narrowly scoped writeup.
Doing this has two advantages. First, it speeds up the research process, because writing down all your thoughts is slow.
Second, it speeds up the writing of the final document. You are not confused about the thing, and you can focus on what is the best way to communicate it. This reduces how much editing you need to do. You also scope yourself by only writing up the most important things.
See also here for more concrete steps on how to do this.
In principle, you could use Whisper or any other ASR system with high accuracy to enforce something like this during a live conversation.
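For example, a minimal sketch of the transcription part, assuming the open-source whisper package and a pre-recorded audio file (the file name and model choice are just placeholders):

import whisper

model = whisper.load_model("base")             # small, fast model
result = model.transcribe("conversation.wav")  # placeholder recording of the conversation
print(result["text"])                          # the full transcript as one string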
Be Confident in your Processes
I thought a lot about what kinds of things make sense for me to do to solve AI alignment. That did not make me confident that any particular narrow idea that I have will eventually lead to something important.
Rather, I’m confident that executing my research process will over time lead to something good. The research process is:
Take some vague intuitions
Iteratively unroll them into something concrete
Update my models based on new observations I make during this overall process.
I think being confident, i.e. not feeling hopeless in doing anything, is important. The important takeaway here is that you don’t need to be confident in any particular idea that you come up with. Instead, you can be confident in the broader picture of what you are doing, i.e. your processes.
The best way to become confident in this way is to just work a bunch and then reflect back. It is very likely that you will be able to see how you improved. And probably you will have had a few successes.
Ok, I was confused before. I think homoiconicity is really several things. Here are some examples:
In basically any programming language L, you can have a program A that writes a file containing valid L source code, which is then run by A.
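A minimal sketch with L = Python: program A writes a file containing valid Python source code and then runs it (the file name is arbitrary):

import subprocess
import sys

source = 'print("hello from generated code")\n'
with open("generated.py", "w") as f:
    f.write(source)                                           # A writes valid Python source

subprocess.run([sys.executable, "generated.py"], check=True)  # A runs it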
In some sense, Python is homoiconic, because you can have a string and then exec it. Before you exec (or in between execs) you can manipulate the string with normal string operations.
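A small sketch of that:

code = "x = 1 + 2\nprint(x)"
exec(code)                       # prints 3
code = code.replace("+", "*")    # manipulate the "program" as a plain string
exec(code)                       # prints 2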
In R you have the quote operator, which allows you to take in code and return an object that represents this code and can be manipulated.
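A rough Python analogue of this, for illustration only (using the standard ast module rather than R):

import ast

tree = ast.parse("1 + 2", mode="eval")       # an object representing the code
tree.body.op = ast.Mult()                    # rewrite 1 + 2 into 1 * 2
ast.fix_missing_locations(tree)
print(eval(compile(tree, "<ast>", "eval")))  # 2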
In Lisp when you write an S-expression, the same S-expression can be interpreted as a program or a list. It is actually always a (possibly nested) list. If we interpret the list as a program, we say that the first element in the list is the symbol of the function, and the remaining entries in the list are the arguments to the function.
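A minimal sketch of that idea, written in Python for illustration: the program is a nested list whose first element names the function, and the same list can be evaluated as code or edited as data.

import operator

ENV = {"+": operator.add, "*": operator.mul}

def evaluate(expr):
    if isinstance(expr, (int, float)):
        return expr                             # numbers evaluate to themselves
    fn = ENV[expr[0]]                           # first element: the function symbol
    args = [evaluate(arg) for arg in expr[1:]]  # the rest: its arguments
    return fn(*args)

program = ["+", 1, ["*", 2, 3]]
print(evaluate(program))  # 7
program[0] = "*"          # manipulate the program as an ordinary list
print(evaluate(program))  # 6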
Although I can't put my finger on it exactly, it feels to me like the degree of homoiconicity increases as you go further down the list.
The basic idea though seems to always be that we have a program that can manipulate the representation of another program. This is actually more general than homoiconicity, as we could have a Python program manipulating Haskell code for example. It seems that the further we go down the list, the easier it gets to do this kind of program manipulation.
I am also not sure how useful it is, but I would be very careful with saying that R programmers not using it is strong evidence that it is not that useful. That was partly the point I wanted to make with the original comment. Homoiconicity might be hard to learn and use compared to learning a for loop in Python. That might be the reason people don't learn it: they don't understand how it could be useful. Probably most R users have not even heard of homoiconicity, and if they had, they would ask "Well, I don't know how this is useful." But again, that does not mean that it is not useful.
Probably many people at least vaguely know the concept of a pure function. But probably most don’t actually use it in situations where it would be advantageous to use pure functions because they can’t identify these situations.
Probably they don't even know the basic arguments for why one would care about making functions pure, because they've never heard them. With your line of argument, we would now be able to conclude that pure functions are clearly not very useful in practice, which I think is, at minimum, an overstatement. Clearly, they can be useful. My current model says that they are actually very useful.
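As a small illustration of the distinction being argued for here (the function names are made up for this example):

total = 0

def add_impure(x):
    # Impure: reads and mutates global state, so the result depends on the
    # call history and every call has a side effect.
    global total
    total += x
    return total

def add_pure(acc, x):
    # Pure: the result depends only on the arguments and nothing outside the
    # function is touched, which makes it easy to test and reason about.
    return acc + x

print(add_impure(5), add_impure(5))    # 5 10  -- same input, different outputs
print(add_pure(0, 5), add_pure(0, 5))  # 5 5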
[Edit:] Also, R is not homoiconic lol. At least not in a strong sense like Lisp. At least that is what this guy on GitHub says, and I would guess it is correct from remembering how R looks and from looking at a few code samples now. In Lisp your program is a bunch of lists; in R it is not. What is the data structure instance that is equivalent to this expression:
`%sumx2y2%` <- function(e1, e2) { e1^2 + e2^2 }
?
The Science Algorithm—AISC 2024 Final Presentation
Adopted.
[Concept Dependency] Edge Regular Lattice Graph
[Concept Dependency] Concept Dependency Posts
A few adjacent thoughts:
Haskell is powerful in the sense that when your program compiles, you get the program that you actually want with a much higher probability than in most other languages. Many stupid mistakes that are runtime errors in other languages become compile-time errors. Why is almost nobody using Haskell?
Why is there basically no widely used homoiconic language, i.e. a language in which you can use the language itself directly to manipulate programs written in that language?
Here we have some technologies that are basically ready to use (Haskell or Clojure), but people mostly decide not to use them. And by people, I mean professional programmers and companies who make software.
Why did nobody invent Rust earlier, by which I mean a system-level programming language that prevents you from making really dumb mistakes by having the computer check whether you made them?
Why did it take something like 40 years to get a LaTeX replacement, even though LaTeX is terrible in very obvious ways?
These things have in common that there is a big engineering challenge. It feels like maybe this explains it, together with the fact that the people who would benefit from these technologies were in a position where the cost of creating them would have exceeded the benefit they expected to get from them.
For Haskell and Clojure we can also consider this point. Certainly, these two technologies have their flaws and could be improved. But then again we would have a massive engineering challenge.
Research Writing Workflow: First figure stuff out
Do research and first figure stuff out, until you feel like you are not confused anymore.
Explain it to a person, or a camera, or ideally to a person and a camera.
If there are any hiccups, expand your understanding.
Ideally, as the last step, explain it to somebody whom you have not ever explained it to.
Only once you have given a presentation without hiccups are you ready to write the post.
If you have a recording, it is useful as a starting point.
The point is that you are just given some graph. This graph is expected to have subgraphs that are lattice graphs, but you don't know where they are. And the graph is so big that you can't iterate over the entire graph to find these lattices. Therefore you need a way to embed the graph without traversing it fully.
—The realization that I have a systematic distortion in my mental evaluation of plans, making actions seem less promising than they are. When I’m deciding whether to do stuff, I can apply a conscious correction to this, to arrive at a properly calibrated judgment.
—The realization that, in general, my thinking can have systematic distortions, and that I shouldn't believe everything I think. This is basic LessWrong-style rationalism, but it took years to work through all of its actual consequences for me.
This is useful. Now that I think about it, I do this. Specifically, I set extremely unrealistic expectations for how much I can do, such that they are impossible to meet. And then I feel bad for not accomplishing the thing.
I haven't tried to be mindful of that. The problem is that this is, I think, mainly subconscious. I don't think things like "I am dumb" or "I am a failure" basically at all, at least not in explicit language. I might have accidentally suppressed these and thought I had succeeded in not being harsh to myself. But maybe I only moved it to the subconscious level, where it is harder to debug.
I might not understand exactly what you are saying. Are you saying that the problem is easy when you have a function that gives you the coordinates of an arbitrary node? Isn’t that exactly the embedding function? So are you not therefore assuming that you have an embedding function?
I agree that once you have such a function the problem is easy, but I am confused about how you are getting that function in the first place. If you are not given it, then I don’t think it is super easy to get.
In the OP I was assuming that I have that function, but I was saying that this is not a valid assumption in general. You can imagine you are just given a set of vertices and edges. Now you want to compute the embedding such that you can do the vector planning described in the article.
I agree that you can probably do better than that. I don't understand how your proposal helps, though.
Yes, right, good point. There are plans that go zig-zag through the graph, which would be longer. I edited that.
Yes, abstraction is the right thing to think about. That is the context in which I was considering this computation. In this post I describe a sort of planning abstraction that you can do if you have an extremely regular environment. It does not yet talk about how to store this environment, but you are right that this can of course also be done similarly efficiently.
In this post, I describe a toy setup where I have a graph, and I would like to compute, for any two vertices A and B, how to get from A to B, i.e. compute a path from A to B.
The point is that if we have a very special graph structure, we can do this very efficiently: O(n), where n is the plan length.
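A minimal sketch of this idea, assuming we already have an embedding that maps each vertex to integer lattice coordinates (the function name is made up):

def plan(coord_a, coord_b):
    # Read the plan off the difference vector, one axis at a time.
    # Total work is O(n), where n is the number of steps in the plan.
    steps = []
    for axis, (a, b) in enumerate(zip(coord_a, coord_b)):
        direction = 1 if b >= a else -1
        for _ in range(abs(b - a)):
            steps.append((axis, direction))  # "move one unit along this axis"
    return steps

print(plan((0, 0), (2, 3)))
# [(0, 1), (0, 1), (1, 1), (1, 1), (1, 1)]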
Vector Planning in a Lattice Graph
Can you iterate through 10^100 objects?
If you have a 1 GHz CPU you can do 1,000,000,000 operations per second. Let's assume that iterating through one object takes only one operation.
In a year you can do about 10^16 operations. That means it would take about 10^84 years to iterate through 10^100 vertices.
The big bang was 1.4*10^10 years ago.
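A quick back-of-the-envelope check of these numbers:

ops_per_second = 1e9                     # 1 GHz, one operation per object
seconds_per_year = 60 * 60 * 24 * 365
ops_per_year = ops_per_second * seconds_per_year
print(f"{ops_per_year:.1e}")             # ~3.2e16 operations per year

years_needed = 1e100 / ops_per_year
print(f"{years_needed:.1e}")             # ~3.2e83 years
print(f"{years_needed / 1.4e10:.1e}")    # ~2.3e73 times the age of the universe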
Having this in a video call seems pretty good to me. The main reason why I don't immediately try this out is that I would need to write a program to do it.