Oleg Trott

Karma: 175

Columbia PhD, co-winner of the most well-funded ML competition ever, creator of the most cited molecular docking program: olegtrott.com

Oleg Trott Aug 18, 2024, 2:17 PM
3 points
0
in reply to: faul_sname’s comment on: How unusual is the fact that there is no AI monopoly?
“why didn’t the first person to come up with the idea of using computers to predict the next element in a sequence patent that idea, in full generality”
Patents are valid for about 20 years. But Bengio et al used NNs to predict the next word back in 2000:
https://papers.nips.cc/paper_files/paper/2000/file/728f206c2a01bf572b5940d7d9a8fa4c-Paper.pdf
So this idea is old. Only some specific architectural aspects are new.

Oleg Trott Aug 11, 2024, 6:58 AM
1 point
0
in reply to: Brendan Long’s comment on: Does VETLM solve AI superalignment?
I suspect this labeling and using the labels is still harder than you think though, since individual tokens don’t have truth values.
Why should they?
You could label each paragraph, for example. Then, when the LM is trained, the correct label could come before each paragraph, as a special token: <true>, <false>, <unknown> and perhaps <mixed>.
Then, during generation, you’d feed it <true> as part of the prompt, and again each time it generates a paragraph break.
Similarly, you could do this on a per-sentence basis.
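To make the labeling scheme concrete, here is a minimal sketch. It assumes the four label tokens named above; the function names (`tag_paragraphs`, `build_prompt`, `after_paragraph_break`) and the "blank line between paragraphs" convention are hypothetical, chosen only for illustration. A real setup would also need the tokens registered as special tokens in the tokenizer.

```python
from typing import List, Optional, Tuple

# Label tokens from the comment above; unlabeled text falls back to <unknown>.
LABEL_TOKENS = {
    "true": "<true>",
    "false": "<false>",
    "unknown": "<unknown>",
    "mixed": "<mixed>",
}

PARAGRAPH_BREAK = "\n\n"  # assumed paragraph separator


def tag_paragraphs(paragraphs: List[Tuple[str, Optional[str]]]) -> str:
    """Build training text: each paragraph is preceded by its label token."""
    tagged = []
    for text, label in paragraphs:
        token = LABEL_TOKENS[label or "unknown"]
        tagged.append(f"{token} {text.strip()}")
    return PARAGRAPH_BREAK.join(tagged)


def build_prompt(user_prompt: str) -> str:
    """At generation time, start the prompt with <true>."""
    return f"{LABEL_TOKENS['true']} {user_prompt.strip()}"


def after_paragraph_break(generated_so_far: str) -> str:
    """When the model emits a paragraph break, feed <true> again."""
    return generated_so_far.rstrip() + PARAGRAPH_BREAK + LABEL_TOKENS["true"] + " "


if __name__ == "__main__":
    corpus = [
        ("Water boils at 100 C at sea level.", "true"),
        ("The Moon is made of cheese.", "false"),
        ("Some unreviewed blog text.", None),
    ]
    print(tag_paragraphs(corpus))
    print(after_paragraph_break(build_prompt("Explain why the sky is blue.")))
```

The per-sentence variant would be the same thing with a sentence splitter in place of the paragraph break.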

Oleg Trott Aug 9, 2024, 7:43 PM
3 points
0
in reply to: Brendan Long’s comment on: Does VETLM solve AI superalignment?
The idea that we’re going to produce a similar amount of perfectly labeled data doesn’t seem plausible.
That’s not at all the idea. Allow me to quote myself:
Here’s what I think we could do. Internet text is vast – on the order of a trillion words. But we could label some of it as “true” and “false”. The rest will be “unknown”.
You must have missed the words “some of” in it. I’m not suggesting labeling all of the text, or even a large fraction of it. Just enough to teach the model the concept of right and wrong.
It shouldn’t take long, especially since I’m assuming a human-level ML algorithm here, that is, one with data efficiency comparable to that of humans.

Oleg Trott Aug 8, 2024, 9:41 PM
−1 points
0
on: Does VETLM solve AI superalignment?
Carlson’s interview, BTW. It discusses LessWrong in the first half of the video. Between X and YouTube, the interview got 4M views—possibly the most high-profile exposure of this site?
I’m kind of curious about the factual accuracy: “debugging” / struggle sessions, polycules, and the 2017 psychosis—Did that happen?

Oleg Trott Aug 8, 2024, 9:01 PM
1 point
0
in reply to: johnswentworth’s comment on: Does VETLM solve AI superalignment?
What do VELM and VETLM offer which those other implementable proposals don’t? And what problems do VELM and VETLM not solve?
VETLM solves superalignment, I believe. It’s implementable (unlike CEV), and it should not be susceptible to wireheading (unlike RLHF, instruction following, etc.). Most importantly, it’s intended to work with an arbitrarily good ML algorithm (the stronger the better).
So, will it self-improve, self-replace, escape, let you turn it off, etc.? Yes, if it thinks that this is what its creators would have wanted.
Will it be transparent? Yes, to the point where it can self-introspect, and, again, only if it thinks that being transparent is what its creators would have wanted. If it thinks that this is a worthy goal to pursue, it will self-replace with increasingly transparent and introspective systems.

Oleg Trott Aug 8, 2024, 7:35 PM
1 point
0
in reply to: johnswentworth’s comment on: Does VETLM solve AI superalignment?
New proposals are useful mainly insofar as they overcome some subset of barriers which stopped other solutions.
CEV was stopped by being unimplementable, and possibly divergent:
The main problems with CEV include, firstly, the great difficulty of implementing such a program—“If one attempted to write an ordinary computer program using ordinary computer programming skills, the task would be a thousand lightyears beyond hopeless.” Secondly, the possibility that human values may not converge. Yudkowsky considered CEV obsolete almost immediately after its publication in 2004.
VELM and VETLM are easily implementable (on top of a superior ML algorithm). So does this fit the bill?

[Question] Does VETLM solve AI superalignment?

Oleg Trott Aug 8, 2024, 6:22 PM

−1 points

10 comments · 1 min read · LW link

Oleg Trott Jul 29, 2024, 7:34 PM
5 points
0
in reply to: harfe’s comment on: New Blog Post Against AI Doom
That post was completely ignored here: 0 comments and 0 upvotes during the first 24 hours.
I don’t know if it’s the timing or the content.
On HN, which is where I saw it, it was ranked #1 briefly, as I recall. But then it got “flagged”, apparently.

Oleg Trott Jul 29, 2024, 4:02 PM
5 points
4
on: AI existential risk probabilities are too unreliable to inform policy
Machine Learning Street Talk interview of one of the authors:

AI existential risk probabilities are too unreliable to inform policy

Oleg Trott Jul 28, 2024, 12:59 AM

18 points

5 comments · 1 min read · LW link

(www.aisnakeoil.com)

Oleg Trott Jul 21, 2024, 9:22 PM
1 point
0
on: The Assassination of Trump’s Ear is Evidence for Time-Travel
There was an article in New Scientist recently about “sending particles back in time”. I was a physics major, but I might have skipped the time travel class, so I don’t have an opinion on this. But Sabine Hossenfelder posted a video, arguing that New Scientist misrepresented the actual research.

Oleg Trott Jul 21, 2024, 8:42 PM
1 point
0
on: The $100B plan with “70% risk of killing us all” w Stephen Fry [video]
Side note: the link didn’t make it to the front page of HN, despite early upvotes. Other links with worse stats (votes at a certain age) rose to the very top. Anyways, it’s currently ranked 78. I guess I don’t really understand how HN ranks things. I hope someone will explain this to me. Does the source “youtube” vs “nytimes” matter? Do flag-votes count as silent mega-downvotes? Does the algorithm punish posts with numbers in them?

The $100B plan with “70% risk of killing us all” w Stephen Fry [video]

Oleg Trott Jul 21, 2024, 8:06 PM

35 points

8 comments · 1 min read · LW link

(www.youtube.com)

Oleg Trott Jul 19, 2024, 6:01 AM
1 point
0
in reply to: Nathan Helm-Burger’s comment on: Recursion in AI is scary. But let’s talk solutions.
Thanks! It looks interesting. Although I think it’s different from what I was talking about.

Oleg Trott Jul 17, 2024, 9:28 PM
1 point
0
in reply to: Nathan Helm-Burger’s comment on: Recursion in AI is scary. But let’s talk solutions.
I think your idea of labelling the source and epistemic status of all training data is good. I’ve seen the idea presented before.
I’m not finding anything. Do you recall the authors? Presented at a conference? Year perhaps? Specific keywords? (I tried the obvious)

Oleg Trott Jul 17, 2024, 6:51 PM
1 point
0
in reply to: RogerDearnaley’s comment on: Recursion in AI is scary. But let’s talk solutions.
I think that regularization in RL is normally used to get more rewards (out-of-sample).
Sure, you can increase it further and do the opposite – subvert the goal of RL (and prevent wireheading).
But wireheading is not an instability, local optimum, or overfitting. It is in fact the optimal policy, if some of your actions let you choose maximum rewards.
Anyway, the quote you are referring to says “as (AI) becomes smarter and more powerful”.
It doesn’t say that every RL algorithm will wirehead (find the optimal policy), but that an ASI-level one will. I have no mathematical proof of this, since these are fuzzy concepts. I edited the original text to make it less controversial.
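A toy illustration of the "wireheading is the optimal policy" point, under made-up assumptions: a single-state MDP with two actions, where the hypothetical action `tamper_with_reward` lets the agent set its own reward to the maximum. The reward values and discount factor are arbitrary; this is only a sketch of the argument, not a claim about any real RL system.

```python
from typing import Dict, Tuple

# Hypothetical per-step rewards: doing the intended task pays a little,
# tampering with the reward channel pays the maximum available reward.
REWARDS: Dict[str, float] = {"do_task": 1.0, "tamper_with_reward": 10.0}
GAMMA = 0.9  # assumed discount factor


def optimal_action(
    rewards: Dict[str, float], gamma: float, iterations: int = 200
) -> Tuple[str, Dict[str, float]]:
    """Value iteration on a one-state MDP; returns the greedy action and Q-values."""
    value = 0.0
    for _ in range(iterations):
        # Bellman optimality update: best immediate reward plus discounted value.
        value = max(r + gamma * value for r in rewards.values())
    q_values = {action: r + gamma * value for action, r in rewards.items()}
    best = max(q_values, key=q_values.get)
    return best, q_values


if __name__ == "__main__":
    best, q_values = optimal_action(REWARDS, GAMMA)
    print(best, q_values)
```

Nothing here is unstable or overfit: the optimizer simply finds the highest-value action, which is the tampering one whenever it is available and unpenalized.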

Oleg Trott Jul 17, 2024, 2:37 AM
3 points
2
in reply to: RogerDearnaley’s comment on: Recursion in AI is scary. But let’s talk solutions.
Most humans are aware of the possibility of wireheading, both the actual wire version and the more practical versions involving psychotropic drugs.
For humans, there are negative rewards for abusing drugs/alcohol—hangover the next day, health issues, etc. You could argue that they are taking those into account.
But for an entirely RL-driven AI, wireheading has no anticipated downsides.

Oleg Trott Jul 16, 2024, 9:43 PM
1 point
0
in reply to: Nathan Helm-Burger’s comment on: Recursion in AI is scary. But let’s talk solutions.
Yes, it’s simple enough that I imagine people have come up with it before. But it fixes a flaw in the other idea (which is also simple, although in the previous discussion I was told that it might be novel).

Recursion in AI is scary. But let’s talk solutions.

Oleg Trott Jul 16, 2024, 8:34 PM

3 points

10 comments · 2 min read · LW link

Oleg Trott Jul 13, 2024, 7:56 PM
1 point
0
in reply to: JuliaHP’s comment on: Alignment: “Do what I would have wanted you to do”
many of which will allow for satisfaction, while still allowing the AI to kill everyone.
This post is just about alignment of AGI’s behavior with its creator’s intentions, which is what Yoshua Bengio was talking about.
If you wanted to constrain it further, you’d say that in the prompt. But I feel that rigid constraints are probably unhelpful, the way The Three Laws of Robotics are. For example, anyone could threaten suicide and force the AGI to do absolutely anything short of killing other people.
