On the idea that “we can’t just choose not to build AGI”: much of the concern here seems predicated on the idea that so many actors are not taking safety seriously that someone will inevitably build AGI once the technology has advanced sufficiently.
I wonder whether struggles with AIs that are strong enough to cause a disaster but not strong enough to win instantly might change this perception. In a hard takeoff there might be very little gap, if any, between those two kinds of AI, but it seems quite possible to me that we would spend some time at that stage. A small or moderate disaster caused by a less powerful AI might get all the relevant players to recognize the danger. After all, humans have done reasonably well at not doing things that seem very likely to destroy the world immediately (e.g. nuclear war).
Though we’ve been less good at putting safeguards in place to prevent it from happening. And even if every group that could create AI agreed to stop, eventually someone will think they know how to do it. And we still only get the one chance.
All that is to say, I don’t think it’s implausible that we’ll be able to coordinate well enough to buy more time, though it’s unclear whether that would do much to avoid eventual doom.
Would the ability to deceive humans when specifically prompted to do so be considered an example? I would expect large LMs to get better at devising false stories about the real world that people cannot distinguish from true ones.