CEO of Convergence, an x-risk research and impact organization.
David_Kristoffersson
Looks promising to me. Technological development isn’t by default good.
Though I agree with the other commenters that this could fail in various ways. For one thing, if a policy like this is introduced without guidance on how to analyze the societal implications, people will interpret it in wildly different ways. ML researchers aren’t by default going to have the training to analyze societal consequences. (Well, who does? We should develop better tools here.)
Information hazards: Why you should care and what you can do
Or, at least, include a paragraph or two summarizing it!
Mapping downside risks and information hazards
State Space of X-Risk Trajectories
Some quick musings on alternatives for the “self-affecting” info hazard type:
Personal hazard
Self info hazard
Self hazard
Self-harming hazard
I wrote this comment on an earlier version of Justin’s article:
It seems to me that most of the ‘philosophical’ problems are going to get solved as a matter of solving practical problems in building useful AI. You could call the ML systems and AI being developed now ‘empirical’. The people building current systems likely don’t consider what they’re doing to be solving philosophical problems. Symbol grounding problem? Well, an image classifier built on a convolutional neural network gets quite proficient at grounding classes like ‘cars’ and ‘dogs’ (symbols) in real physical scenes.
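To make that concrete, here is a minimal sketch of that kind of ‘empirical’ grounding: a pretrained CNN mapping raw pixels to a class label. It assumes a recent torchvision with its pretrained ResNet-50 weights and a local image file dog.jpg, both chosen purely for illustration.

```python
# Minimal sketch: a pretrained CNN maps raw pixels from a physical scene to a
# class label -- "grounding" a symbol like 'dog' without any explicit philosophy.
# Assumes torchvision >= 0.13 and a local file 'dog.jpg' (illustrative only).
import torch
from torchvision import models
from torchvision.models import ResNet50_Weights
from PIL import Image

weights = ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()

preprocess = weights.transforms()          # resizing, cropping, normalization
image = Image.open("dog.jpg")              # a real physical scene
batch = preprocess(image).unsqueeze(0)     # shape: (1, 3, 224, 224)

with torch.no_grad():
    logits = model(batch)
class_id = logits.softmax(dim=1).argmax(dim=1).item()
print(weights.meta["categories"][class_id])  # e.g. 'golden retriever'
```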
So, the observation I want to make is that the philosophical problems we can think of that might trip up a system are likely to turn out to look like technical/research/practical problems that need to be solved by default, for practical reasons, in order to make useful systems.
The image classification problem wasn’t solved in one day, but it was solved using technical skills, engineering skills, more powerful hardware, and more data. People didn’t spend decades discussing philosophy: the problem was solved through advances in how neural networks are built and through more powerful computers.
Of course, image classification doesn’t solve the symbol grounding problem in full. But other aspects of symbol grounding that people might find mystifying are getting solved piecewise, as researchers and engineers solve practical AI problems.
Let’s look at a classic problem formulation from MIRI, ‘Ontology Identification’:
Technical problem (Ontology Identification). Given goals specified in some ontology and a world model, how can the ontology of the goals be identified in the world model? What types of world models are amenable to ontology identification? For a discussion, see Soares (2015).
When you create a system that performs any function in the real world, you are in some sense giving it goals. Reinforcement-learning-trained systems pursue ‘goals’. An autonomous car takes you from a chosen point A to a chosen point B; it has the overall goal of transporting people. The ontology identification problem is getting solved piecewise as a practical matter. Perhaps the MIRI-style theory could give us a deeper understanding that helps us avoid some pitfalls, but it’s not clear why these wouldn’t be caught as practical problems.
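As a toy illustration of what ‘goals’ look like in practice, here is a bare-bones reinforcement learning loop. It assumes the gymnasium package and the CartPole environment, with a random policy standing in for a trained one: the reward signal is the goal the system is optimized toward, and the observation space is the ontology its designers chose.

```python
# Minimal sketch of "goals in practice": the reward is the goal, the
# observation space is the designer-chosen ontology. Assumes gymnasium;
# the random policy is only a placeholder for a trained one.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()     # placeholder for a trained policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward                 # the "goal" being optimized
    if terminated or truncated:
        break
print(f"episode return: {total_reward}")
```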
What would a real philosophical landmine look like? A real philosophical landmine would be a class of philosophical problems that doesn’t get solved as a practical matter and that poses a risk of harm to humanity.
I expect the event to have no particular downside risks, and to give interesting input and spark ideas in experts and novices alike. Mileage will vary, of course. Unconferences foster dynamic discussion and a living agenda. If it’s risky to host this event, then I expect AI strategy and forecasting meetups and discussions at EAG to be risky as well, and they should then also not be hosted.
I and other attendees of AIXSU pay careful attention to potential downside risks. I also think it’s important that we don’t strangle open intellectual advancement. We need to figure out what we should talk about, not conclude that we shouldn’t talk at all.
AISC: To clarify, AI Safety Camp is different and places greater trust in the judgement of novices, since teams are generally run entirely by novices. The person who proposed running a strategy-focused AISC found the reactions from experts to be mixed. He also reckoned the event would overlap with the existing AI safety camps, since they already include strategy teams.
The potential negative side effects of strategy work are a very important topic. I hope to discuss them with attendees at the unconference!
AIXSU—AI and X-risk Strategy Unconference
We can subdivide the security story based on the ease of fixing a flaw if we’re able to detect it in advance. For example, vulnerability #1 on the OWASP Top 10 is injection, which is typically easy to patch once it’s discovered. Insecure systems are often right next to secure systems in program space.
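As a concrete illustration of how close insecure and secure programs sit in program space, here is a minimal sketch of an injection flaw and its one-line patch, using Python’s built-in sqlite3 module; the table and inputs are made up for illustration.

```python
# Sketch: the classic injection flaw and its one-line patch (illustrative only).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

user_input = "alice' OR '1'='1"

# Vulnerable: user input concatenated straight into the query.
insecure = f"SELECT * FROM users WHERE name = '{user_input}'"
print(conn.execute(insecure).fetchall())   # the injected OR clause matches every row

# Patched: a parameterized query -- one token away in "program space".
secure = "SELECT * FROM users WHERE name = ?"
print(conn.execute(secure, (user_input,)).fetchall())  # no user has that literal name
```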
Insecure systems are right next to secure systems, and many flaws are found. Yet the larger systems (the company running the software, the economy, etc.) manage to correct for this somehow. That’s because there are mechanisms in the larger systems poised to patch the software when flaws are discovered. Perhaps we could adapt and optimize this flaw-exploit-patch loop from security as a technique for AI alignment.
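To show only the shape of that loop, here is a fully self-contained toy: a crude keyword spam filter stands in for the system being patched, and a fixed attack corpus stands in for the red team. It is purely illustrative of the loop structure, not a real alignment technique.

```python
# Toy flaw-exploit-patch loop: probe the current system for failures,
# patch it, and repeat until no known flaws remain. Everything here is a
# made-up stand-in for illustration.

def make_filter(blocklist):
    return lambda text: any(bad in text.lower() for bad in blocklist)

def find_exploits(is_spam, attack_corpus):
    # "Red team": look for spam the current filter misses.
    return [msg for msg in attack_corpus if not is_spam(msg)]

blocklist = {"free money"}
attack_corpus = ["FREE MONEY now!!", "cheap pills here", "w1n a prize today"]

for round_ in range(5):
    spam_filter = make_filter(blocklist)
    exploits = find_exploits(spam_filter, attack_corpus)
    if not exploits:
        break
    # "Patch": extend the blocklist with a crude signature of each exploit.
    blocklist |= {msg.lower()[:10] for msg in exploits}
    print(f"round {round_}: patched {len(exploits)} flaw(s)")
```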
If the security story is what we are worried about, it could be wise to try & develop the AI equivalent of OWASP’s Cheat Sheet Series, to make it easier for people to find security problems with AI systems. Of course, many items on the cheat sheet would be speculative, since AGI doesn’t actually exist yet. But it could still serve as a useful starting point for brainstorming.
This sounds like a great idea to me. Software security has a very well developed knowledge base at this point and since AI is software, there should be many good insights to port.
What possibilities aren’t covered by the taxonomy provided?
Here’s one that occurred to me quickly: Drastic technological progress (presumably involving AI) destabilizes society and causes strife. In this more hostile environment, safety procedures are neglected and UFAI is produced.
This seems like a valuable research question to me. I have a project proposal in a drawer of mine that is strongly related: “Entanglement of AI capability with AI safety”.
My guess is that the ideal is to have semi-independent teams doing research: independence in order to better explore the space of questions, and some degree of plugging into each other in order to learn from one another and to coordinate.
Are there serious info hazards, and if so can we avoid them while still having a public discussion about the non-hazardous parts of strategy?
There are info hazards. But I think if we can discuss Superintelligence publicly, then yes, we can have a public discussion about the non-hazardous parts of strategy.
Are there enough people and funding to sustain a parallel public strategy research effort and discussion?
I think you could get a pretty lively discussion even with just 10 people, if they were active enough. I think you’d need a core of active posters and commenters, and there needs to be enough reason for them to assemble.
Nice work, Wei Dai! I hope to read more of your posts soon.
However I haven’t gotten much engagement from people who work on strategy professionally. I’m not sure if they just aren’t following LW/AF, or don’t feel comfortable discussing strategically relevant issues in public.
A bit of both, presumably. I would guess a lot of it comes down to incentives, perceived gain, and habits. There’s no particular pressure to discuss on LessWrong or the EA Forum. LessWrong isn’t perceived as your main peer group. And if you’re at FHI or OpenAI, you already have plenty of contact with people who can provide quick feedback.
I’m very confused why you think that such research should be done publicly, and why you seem to think it’s not being done privately.
I don’t think the article implies this:
Research should be done publicly
The article states: “We especially encourage researchers to share their strategic insights and considerations in write ups and blog posts, unless they pose information hazards.”
Which means: share more, but don’t share if you think doing so could have negative consequences.
Though I guess you could mean that it’s very hard to tell what might lead to negative outcomes. This is a good point. It’s why we (Convergence) are prioritizing research on information hazard handling and research-shaping considerations.
it’s not being done privately
The article isn’t saying strategy research isn’t being done privately. What it is saying is that we need more strategy research and should increase investment in it.
Given the first sentence, I’m confused as to why you think that “strategy research” (writ large) is going to be valuable, given our fundamental lack of predictive ability in most of the domains where existential risk is a concern.
We’d argue that to get better predictive ability, we need to do strategy research. Maybe you’re saying the article makes it look like we are recommending any research that looks like strategy research? That isn’t our intention.
Yes, the plan is to have these on an ongoing basis. I’m writing this just after the deadline passed for the one planned for April.
Here’s the web site: https://aisafetycamp.com/
The Facebook group is also a good place to keep tabs on it: https://www.facebook.com/groups/348759885529601/
Your relationship with other people is a macrocosm of your relationship with yourself.
I think there’s something to that, but it’s not that general. For example, some people can be very kind to others but harsh with themselves. Some people can be cruel to others but lenient to themselves.
If you can’t get something nice, you can at least get something predictable
The desire for the predictable is what Autism Spectrum Disorder is all about, I hear.
Here’s the Less Wrong post for the AI Safety Camp!
AI Safety Research Camp—Project Proposal
It’s bleen, without a moment’s doubt.
Excellent comment, thank you! Don’t let the perfect be the enemy of the good if you’re running from an exponential growth curve.