You might like this video (from 9:18) by Isaac Arthur, which talks about exactly that.
To the extent that rationality has a purpose, I would argue that it is to do whatever it takes to achieve our goals; if that includes creating “propaganda”, so be it. And the rules explicitly ask for submissions not to be deceptive, so if we use them to convince people, it will be a pure epistemic gain.
Edit: If you are going to downvote this, at least argue why. I think that if this works like they expect, it truly is a net positive.
To executives and researchers:
However far you think we are from AGI, do you think that aligning it with human values will be any easier? For intelligence we at least have formalisms (like AIXI) that tell us in principle how to achieve goals in arbitrary environments. For human values, on the other hand, we have no such thing. If we don’t seriously start working on that now (and we can, with current systems or theoretical models), there will be no chance of solving the problem in time once we near AGI, and the default outcome will be very bad, to say the least.
Edit: typos
I have a few questions:
Do we need to study at a US university in order to participate? I’m in Europe.
Who should be the target audience for the posts? CS students? The average LW reader? People somewhat interested in AI alignment? The average Joe? How much do we need to dumb it down?
Can we publish the posts before the contest ends?
Will you necessarily post the winners’ names? Can we go by a pseudonym instead?
How close to the source material should we stay? I might write a post about what value learning is, why it seems like the most promising approach and why it might be solvable, which would involve explaining a few of John Wentworth’s posts. But I don’t think my reasoning is exactly the same as his.
Also, is there any post that whoever is reading this comment tried and failed to understand? Or better yet, tried hard to understand but found completely impenetrable? If so, what part did you find confusing? If I choose to participate and try to explain that post, would you volunteer to read a draft to check that I’m explaining it clearly?
How do you propose to have many votes every day? Voting that often (on laws or representatives) would probably lower the quality of the votes significantly. It would also force most people to delegate to others who do it full-time; people who don’t have enough time to vote and don’t want to delegate won’t be happy with that. And what infrastructure would you use? The internet? That doesn’t seem safe.
I mean, just in case I wasn’t clear enough, you want a program that takes in a representation of some system and outputs something a human can understand, right? But even if you could automatically divide a system into a tree of submodules such that a human could in principle describe how any one works in terms of short descriptions of the function of its submodules, there is no obvious way of automatically computing those descriptions. So if you gave a circuit diagram of a CPU as the input to that universal translator, what do you want it to output?
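To make the shape of the problem concrete, here is a minimal sketch (in Python, with invented names and example descriptions) of the kind of annotated module tree such a translator would have to produce. Computing the tree structure might be mechanizable; the description strings are the part with no obvious algorithm:

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    """A node in the hypothetical module tree: a short human-readable summary
    plus the submodules in terms of whose functions the summary is phrased."""
    name: str
    description: str  # short natural-language summary of what this part does
    submodules: list["Module"] = field(default_factory=list)

# Illustrative only: a human wrote these strings, and that is the point.
cpu = Module("CPU", "fetches, decodes and executes instructions", [
    Module("ALU", "performs arithmetic and bitwise operations"),
    Module("register file", "holds the values currently being operated on"),
    Module("control unit", "sequences the other parts according to the decoded instruction"),
])
```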
You replied to the wrong comment.
Thank you!
My mom wants to see what this would look like: “An ant on a scooter on an interplanetary flight”. Let’s see how many concepts it can combine at once.
What would such a representation look like for a computer? There might exist some method for computing how the circuits are divided into modules and submodules, but how would you represent what they do? You don’t expect it to be annotated in natural language, do you?
It seems they edited the paper: where there once was a super saiyan sentient bag of potato chips, there is now a close-up of the palm of a hand with leaves growing from it:
We shouldn’t be surprised by the quality of the images, but since this will become a commercial product, and art is something that is stereotypically hard for computers, I wonder if for the general public (including world governments) this will be what finally makes them realize that AGI is coming. OpenAI could at least have refrained from publishing the paper; it wouldn’t have made any difference, but it would have been a nice symbolic gesture.
DALL·E 2 by OpenAI
ELK itself seems like a potentially important problem to solve; the part that didn’t make much sense to me was what they plan to do with the solution: their idea based on recursive delegation.
I will probably spend four days (from the 14th to the 17th; I’m somewhat busy until then) thinking about alignment, to see whether there is any chance I might be able to make progress. I have read what is recommended as a starting point on the Alignment Forum, and can read the AGI Safety Fundamentals course’s curriculum on my own. I will probably start by thinking about how to formalize (and compute) something similar to what we call human values, since that seems to be the core of the problem, and then turn that into something that can be evaluated over possible trajectories of the AI’s world model (or over something like reasoning chains, or whatever; I don’t know). I hadn’t considered that as a career; I live in Europe, and we don’t have those kinds of organizations here, so it will probably just be a hobby.
Neural style transfer algorithms that rely on optimizing an image by gradient descent are painfully slow. You can make them faster by training a network to map the original image to a stylized one that minimizes the loss. I made them faster still by concatenating a few Perlin-noise channels to the input, along with the output of a very fast hand-written edge-detection algorithm, and then performing only local processing. Unfortunately this gives very poor results and is only tolerably fast for low-resolution inputs.
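A rough sketch of that input pipeline, assuming 1×1 convolutions as the “local processing” (the real noise generator, edge detector and network were hand-tuned; everything here is a placeholder):

```python
import numpy as np
import torch
import torch.nn as nn

def perlin_channels(h, w, n=4, seed=0):
    # Placeholder: upsampled coarse random fields standing in for proper Perlin noise.
    rng = np.random.default_rng(seed)
    chans = []
    for i in range(n):
        s = 2 ** (i + 2)
        coarse = rng.standard_normal((h // s + 1, w // s + 1))
        chans.append(np.kron(coarse, np.ones((s, s)))[:h, :w])
    return np.stack(chans).astype(np.float32)

def edges(img):
    # Crude finite-difference edge map standing in for the hand-written detector.
    gray = img.mean(0)
    gx = np.abs(np.diff(gray, axis=1, prepend=0.0))
    gy = np.abs(np.diff(gray, axis=0, prepend=0.0))
    return (gx + gy)[None].astype(np.float32)

# Purely local processing: 1x1 convolutions, so each output pixel depends
# only on its own stack of input channels (RGB + noise + edges).
stylizer = nn.Sequential(
    nn.Conv2d(3 + 4 + 1, 32, kernel_size=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=1), nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=1), nn.Sigmoid(),
)

img = np.random.rand(3, 256, 256).astype(np.float32)  # stand-in input image
x = np.concatenate([img, perlin_channels(256, 256), edges(img)])
out = stylizer(torch.from_numpy(x)[None])              # (1, 3, 256, 256)
```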
So I gave up on that and started reading about non-ML style transfer and texture synthesis algorithms; it’s a fascinating topic. In the end I managed to create an algorithm that performs real-time (if poor) high-resolution style transfer, by using Numba to compile Python code. Sadly, when I tried to turn that into an Android app, despite trying really hard, I found that there was no way to run Numba on Android, and using NumPy directly isn’t fast enough, so I gave up on that too.
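Not the actual algorithm, but a sketch of the Numba pattern that makes this fast: an explicit per-pixel loop, hopeless in pure Python, compiles to near-C speed with @njit. The per-pixel work here is a toy palette snap; the real per-pixel work was more involved:

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True, cache=True)
def stylize(img, palette):
    # Toy stand-in for the real per-pixel computation: snap each pixel to
    # the nearest colour in a style-derived palette. The point is the
    # pattern: an explicit pixel loop that Numba compiles and parallelizes.
    h, w, _ = img.shape
    out = np.empty_like(img)
    for y in prange(h):
        for x in range(w):
            best, best_d = 0, 1e18
            for k in range(palette.shape[0]):
                d = 0.0
                for c in range(3):
                    diff = img[y, x, c] - palette[k, c]
                    d += diff * diff
                if d < best_d:
                    best, best_d = k, d
            for c in range(3):
                out[y, x, c] = palette[best, c]
    return out

img = np.random.rand(1080, 1920, 3).astype(np.float32)
palette = np.random.rand(16, 3).astype(np.float32)
styled = stylize(img, palette)  # fast after the first (compiling) call
```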
I think going to Congress would be counterproductive and would just convince them to create an AGI before their enemies do.
For whatever it is worth, this post, along with reading the unworkable alignment strategy in the ELK report, has made me realize that we actually have no idea what to do, and has finally convinced me to try to solve alignment; I encourage everyone else to do the same. For some people, knowing that the world is doomed by default and that we can’t just expect the experts to save it is motivating. If that was his goal, he achieved it.
Oh, and another bad-faith strategy consists of writing many OK comments instead of just a few good ones, since the Good Heart Tokens a user receives aren’t actually proportional to the total value added. Please keep the score of this one at 1.
Task: Computing the relevance of a paper to solving a problem
Context: A researcher is looking at existing literature while trying to solve some problem. However, it is not obvious what to search for: there might exist research whose relevance can only be determined once they read it and think about it. For this task, a simple keyword-matching search engine isn’t useful.
Input type: A problem statement and a URL to a website or PDF. Optionally, text with reasoning about the problem and what has been tried before. The URLs in the training set can be papers that have been found useful in the past, but ideally they would be found by interacting with the system: when queried, it evaluates as many pages in its crawling corpus as possible, or perhaps only the top k (for a very big k) results given by a regular search engine.
Output type: An explanation (maybe optional, considering most pages are irrelevant) of why the document might or might not be useful for solving the problem, in addition to a numerical score. The score can be given on some scale of usefulness following consistent guidelines, but ideally it would be the percentile of usefulness among the n pages the researcher can look at while interacting with the system.
Info constraints: If training with documents that have been useful in the past, the program can’t be aware of any work that resulted from them.
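To pin down the input/output types above, a sketch of the interface as code (the names are mine; nothing here refers to an existing system):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Query:
    problem_statement: str
    url: str                               # website or PDF to evaluate
    prior_reasoning: Optional[str] = None  # optional notes on the problem and past attempts

@dataclass
class Relevance:
    reasoning: Optional[str]      # may be omitted for clearly irrelevant pages
    usefulness_percentile: float  # percentile among the n pages the researcher can read

def evaluate(query: Query) -> Relevance:
    """The capability being asked for: estimate how useful the page at
    query.url would be for solving the stated problem."""
    raise NotImplementedError
```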
Instances: (these should be collected by interacting with the system; the usefulness percentiles below are made up)
Instance 1:
Input:
Problem: Ontological translation. Given two different ontologies about the same domain and some representation in one ontology, compute the corresponding representation (if any) in the other ontology.
URL: Ontological Crisis in Artificial Agents’ Value Systems
Output:
Reasoning: This paper deals with translating between an AI’s old, wrong ontology, represented as a POMDP-like system, and a new, bigger and better one. It presents a simple heuristic algorithm for doing so based on bisimulation. It has shortcomings, such as using monolithic states and being plausibly susceptible to steganography, but it seems like a promising starting point.
Predicted usefulness percentile: 80%
Instance 2:
Input:
Problem: Ontological translation.
URL: An ontological approach to translation and untranslatability
Output:
Reasoning: This is a paper about linguistics that deals with language translation. It contains no insights about general ontological translation.
Predicted usefulness percentile: <1%
Instance 3:
Input:
Problem: Ontological translation.
URL: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Output:
Reasoning: This paper describes a method for automatically computing an image in some distribution corresponding to a given image in another distribution, say photos of apples to photos of oranges, or zebras to horses. Cycle consistency might be useful in general. However, the loss (see below):
- Doesn’t guarantee that we will actually end up with the corresponding element, only with something that seems to be from that distribution. While it works in this case, that is probably because the correct mapping happens to be simple; there is no guarantee that it will generalize to more complex situations.
- Seems unusually susceptible to steganography.
- Has the usual problems associated with GANs.
Predicted usefulness percentile: 63%
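For reference, the cycle-consistency loss the reasoning above refers to, as defined in the CycleGAN paper (G and F are the two mappings between the domains); it only constrains round trips, which is why it can’t guarantee that G(x) is the corresponding element rather than merely a plausible one:

$$\mathcal{L}_{\text{cyc}}(G,F)=\mathbb{E}_{x\sim p_{\text{data}}(x)}\big[\lVert F(G(x))-x\rVert_1\big]+\mathbb{E}_{y\sim p_{\text{data}}(y)}\big[\lVert G(F(y))-y\rVert_1\big]$$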
Instance 4:
Input:
Problem: Ontological translation.
URL: Unsupervised Machine Translation Using Monolingual Corpora Only
Output:
Reasoning: This paper presents an algorithm for translating sentences between two languages in an unsupervised fashion. It is based on encoding sentences of both languages in the same latent space, forcing the two distributions to be indistinguishable using an adversarial loss, and making the encodings useful for performing two different tasks. While it works for this use case, it relies on text-translation-specific tricks, such as initializing the translator to a word-by-word translation obtained using another method, or adding “noise” by dropping and swapping words. It also doesn’t seem like it would scale well, especially if the sizes of the ontologies are very different.
Predicted usefulness percentile: 45%
Simply searching for “ontological translation” on Google Scholar gives terrible results.
I might add more and better instances later.