New forum for MIRI research: Intelligent Agent Foundations Forum

orthonormal 20 Mar 2015 0:35 UTC

53 points

Site Meta Machine Intelligence Research Institute (MIRI)

Today, the Machine Intelligence Research Institute is launching a new forum for research discussion: the Intelligent Agent Foundations Forum! It’s already been seeded with a bunch of new work on MIRI topics from the last few months.

We’ve covered most of the (what, why, how) subjects on the forum’s new welcome post and the How to Contribute page, but this post is an easy place to comment if you have further questions (or if, maths forbid, there are technical issues with the forum instead of on it).

But before that, go ahead and check it out!

(Major thanks to Benja Fallenstein, Alice Monday, and Elliott Jin for their work on the forum code, and to all the contributors so far!)

EDIT 3/22: Jessica Taylor, Benja Fallenstein, and I wrote forum digest posts summarizing and linking to recent work (on the IAFF and elsewhere) on reflective oracle machines, on corrigibility, utility indifference, and related control ideas, and on updateless decision theory and the logic of provability, respectively! These are pretty excellent resources for reading up on those topics, in my biased opinion.


orthonormal 19 Mar 2015 4:58 UTC
49 points
Why a separate forum, you ask? For one thing, a narrower focus is better for this purpose: sending academic researchers to Less Wrong led to them wondering why people were talking about polyamory, cryonics, and Harry Potter on the same site as the decision theory posts...
Vladimir_Nesov 19 Mar 2015 14:10 UTC
20 points
From the welcome post:

Examples of subjects that are off-topic (and which moderators might close down) include [...] anything that cannot (yet) be usefully formalized or modeled mathematically.

This seems open to an undesirable interpretation: that it rules out discussion of how to formulate things that haven’t yet been formulated, and exercises in working through pre-formal examples in an attempt to isolate principles that can then be explored more systematically. It might also encourage cargo cult formalization.
- orthonormal 23 Mar 2015 21:56 UTC
  16 points
  Parent
  I’ve edited it to the following:
  
  But the list does help us to point out what we consider to be on-topic in this forum. Besides the topics mentioned there, other relevant subjects include groundwork for self-modifying agents, abstract properties of goal systems, tractable theoretical or computational models of the topics above, and anything else that is directly connected to MIRI’s research mission.
  
  It’s important for us to keep the forum focused, though; there are other good places to talk about subjects that are more indirectly related to MIRI’s research mission, and the moderators here may close down discussions on subjects that aren’t a good fit for this mission. Some examples of subjects that we would consider off-topic (unless directly applied to a more relevant area) include general advances in artificial intelligence and machine learning, general mathematical logic, general philosophy of mind, general futurism, existential risks, effective altruism, human rationality, and non-technical philosophizing.
  
  In particular, it now discourages “general advances in artificial intelligence and machine learning”, and discourages “non-technical philosophizing” rather than “anything that cannot (yet) be usefully formalized or modeled mathematically”. Is this an improvement?
- RyanCarey 22 Mar 2015 19:45 UTC
  13 points
  Parent
  
  Examples of subjects that are off-topic (and which moderators might close down) include recent advances in artificial intelligence and machine learning,
  
  This is kind of surprising. I get that you want it to be an AI safety research forum rather than an AI research forum, and that’s reasonable, but I think this wording could put off lots of fairly safety-friendly AI researchers from participating in conversations that are mostly about safety. Don’t you want to be at least a little bit inclusive of AI researchers? Maybe what you’re going for is something like: “discussing advances in artificial intelligence without connection to formal safety measures”?
- orthonormal 19 Mar 2015 17:33 UTC
  10 points
  Parent
  Right. I wanted to encourage semi-formalized topics, but not completely non-technical philosophizing. Can someone suggest a better wording?
  - dxu 19 Mar 2015 22:47 UTC
    7 points
    Parent
    How about
    
    Examples of subjects that are off-topic (and which moderators might close down) include [...] anything that cannot (yet) be usefully formalized or modeled mathematically, unless you are talking specifically about how to formalize said area.
  - Mark_Friedenbach 21 Mar 2015 16:34 UTC
    1 point
    Parent
    Prohibit what you don’t want, non-technical philosophizing, rather than a blanket prohibition that covers all sorts of other things. For example, what about existing AGI designs that lack a unified underlying formal model as of yet, e.g. OpenCog? They’re apparently off limits, even a discussion involving experimental data of real world systems. That seems wrong.
    
    EDIT: I found the original source:
    
    Examples of subjects that are off-topic (and which moderators might close down) include recent advances in artificial intelligence and machine learning,
    
    Wow, I just… wow. Subjects that are off-topic: artificial intelligence and machine learning. Well enjoy your anti-rationalist ivory tower; I won’t be participating.
    - orthonormal 21 Mar 2015 23:39 UTC
      7 points
      Parent
      Sorry you feel that way, but it’s kind of essential that the forum is not about the latest AI techniques, but about groundwork for the kind of safety research that could stand up to smarter-than-human AI. There are plenty of great places on the Internet for discussing those other topics!
      - Mark_Friedenbach 22 Mar 2015 1:38 UTC
        8 points
        Parent
        The problem is that you think those are two separate things, that the safety research which could stand up to smarter than human artificial intelligence is something that will arise separate from the work that is being done on artificial intelligence.
        
        And for what it’s worth there really isn’t a place to discuss practical safety and artificial intelligence.
        jessicat 22 Mar 2015 2:40 UTC
        10 points
        Parent
        I think a post saying something like “Deep learning architectures are/are not able to learn human values because of reasons X, Y, Z” would definitely be on topic. As an example of something like this, I wrote a post on the safety implications of statistical learning theory. However, an article about how deep learning algorithms are performing on standard machine learning tasks is not really on topic.
        
        I share your sentiment that safety research is not totally separate from other AI research. But I think there is a lot to be done that does not rely on the details of how practical algorithms work. For example, we could first create a Friendly AI design that relies on Solomonoff induction, and then ask to what extent practical algorithms (like deep learning) can predict bits well enough to be substituted for Solomonoff induction in the design. The practical algorithms are more of a concern when we already have a solution that uses unbounded computing power and are trying to scale it down to something we can actually run.
        Mark_Friedenbach 22 Mar 2015 17:32 UTC
        4 points
        Parent
        First of all, purposefully limiting scope to protecting against only the runaway superintelligence scenario is preventing a lot of good that could be done right now, and keeps your work from having practical applications it otherwise would have. For example, right now somewhere deep in Google and Facebook there are machine learning recommendation engines that are suggesting the display of whisky ads to alcoholics. Learning how to create even a simple recommendation engine whose output is constrained by the values of its creators would be a large step forward and would help society today. But I guess that’s off-topic.
        
        Second, even if you buy the argument that existential risk trumps all and we should ignore problems that could be solved today such as that recommendation engine example, it is demonstrably not the case in history that the fastest way to develop a solution is to ignore all practicalities and work from theory backwards. No, in almost every case what happens is the practical and the theoretical move forward hand in hand, with each informing progress in the other. You solve the recommendation engine example not because it has the most utilitarian direct outcomes, but because the theoretical and practical outcomes are more likely to be relevant to the larger problem than an ungrounded problem chosen by different means. And on the practical side, you will have engineers coming forward with the beginnings of solutions: “hey, I’ve been working on feedback controls, and this particular setup seems to work very well in the standard problem sets...” In the real world theoreticians more often than not spend their time proving the correctness of the work of a technologist, and then leveraging that theory to improve upon it.
        
        Third, there are specific concerns I have about the approach. Basically time spent now on unbounded AIXI constructs is probably completely wasted. Real AGIs don’t have Solomonoff inductors or anything resembling them. Thinking that unbounded solutions could be modified to work on a real, computable superintelligence betrays a misunderstanding of the actual utility of AIXI. AIXI showed that all the complexity of AGI lies in the practicalities, because the pure uncomputable theory is dead simple but utterly divorced from practice. AIXI brought some respectability to the field by having some theoretical backing, even if that theory is presently worse than useless inasmuch as it is diverting otherwise intelligent people from making meaningful contributions.
        
        Finally, there’s the simple matter that an ignore-all-practicalities theory-first approach is useless until it nears completion. My current trajectory places the first AGI at 10 to 15 years out, and the first self-improving superintelligence shortly thereafter. Will MIRI have practical results in that time frame? The schedule is not going to stop and wait for perfection. So if you want to be relevant, then stay relevant.
        jessicat 22 Mar 2015 18:26 UTC
        11 points
        Parent
        
        Learning how to create even a simple recommendation engine whose output is constrained by the values of its creators would be a large step forward and would help society today.
        
        I think something showing how to do value learning on a small scale like this would be on topic. It might help to expose the advantages and disadvantages of algorithms like inverse reinforcement learning.
        
        I also agree that, if there are more practical applications of AI safety ideas, this will increase interest and resources devoted to AI safety. I don’t really see those applications yet, but I will look out for them. Thanks for bringing this to my attention.
        
        it is demonstrably not the case in history that the fastest way to develop a solution is to ignore all practicalities and work from theory backwards
        
        I don’t have a great understanding of the history of engineering, but I get the impression that working from the theory backwards can often be helpful. For example, Turing developed the basics of computer science before sufficiently general computers existed.
        
        My current impression is that solving FAI with a hypercomputer is a fundamentally easier problem than solving it with a bounded computer, and it’s hard to say much about the second problem if we haven’t made steps towards solving the first one. On the other hand, I do think that concepts developed in the AI field (such as statistical learning theory) can be helpful even for creating unbounded solutions.
        
        AIXI showed that all the complexity of AGI lies in the practicalities, because the pure uncomputable theory is dead simple but utterly divorced from practice.
        
        I would really like it if the pure uncomputable theory of Friendly AI were dead simple!
        
        Anyway, AIXI has been used to develop more practical algorithms. I definitely approach many FAI problems with the mindset that we’re going to eventually need to scale this down, and this makes issues like logical uncertainty a lot more difficult. In fact, Paul Christiano has written about tractable logical uncertainty algorithms, which is a form of “scaling down an intractable theory”. But it helped to have the theory in the first place before developing this.
        
        an ignore-all-practicalities theory-first approach is useless until it nears completion
        
        Solutions that seem to work for practical systems might fail for superintelligence. For example, perhaps induction can yield acceptable practical solutions for weak AIs, but does not necessarily translate to new contexts that a superintelligence might find itself in (where it has to make pivotal decisions without training data for these types of decisions). But I do think working on these is still useful.
        
        My current trajectory places the first AGI at 10 to 15 years out, and the first self-improving superintelligence shortly thereafter. Will MIRI have practical results in that time frame?
        
        I consider AGI in the next 10-15 years fairly unlikely, but it might be worth having FAI half-solutions by then, just in case. Unfortunately I don’t really know a good way to make half-solutions. I would like to hear if you have a plan for making these.
        Houshalter 23 Mar 2015 0:03 UTC
        10 points
        Parent
        
        I don’t have a great understanding of the history of engineering, but I get the impression that working from the theory backwards can often be helpful. For example, Turing developed the basics of computer science before sufficiently general computers existed.
        
        The first computer was designed by Babbage, who was mostly interested in practical applications (although admittedly it was never built). 100 years later Konrad Zuse developed the first working computer, and it was also for practical purposes. I’m not sure if he was even aware of Turing’s work.
        
        Not that Turing didn’t contribute anything to the development of computers, but I’m not sure if it’s a good example of theory preceding practice.
        
        In AI in general this seems to be the case. Neural networks have been around forever, but they keep making progress every time computers get a bit faster. For the most part it’s not like scientists have invented good algorithms and are waiting around for computers to get fast enough to run them. Rather the computers get a bit faster and then it drives a new wave of progress and lets researchers experiment with new stuff.
        
        Anyway, AIXI has been used to develop more practical algorithms.
        
        Forgive me if I’m mistaken, but is AIXI really that novel? From a theoretician’s point of view maybe, but from the practical side of AI it’s just a reformulation of reinforcement learning. MC AIXI is impressive because it works at all, not because there aren’t any other algorithms that can learn to play Pac-Man.
        Mark_Friedenbach 22 Mar 2015 23:15 UTC
        2 points
        Parent
        
        I don’t have a great understanding of the history of engineering, but I get the impression that working from the theory backwards can often be helpful. For example, Turing developed the basics of computer science before sufficiently general computers existed.
        
        One way to fix the lack of historical perspective is to actively involve engineers and their projects in the MIRI research agenda, rather than specifically excluding them.
        
        Regarding your example, Turing hardly invented computing. If anything that honor probably goes to Charles Babbage who nearly a century earlier designed the first general computation devices, or to the various business equipment corporations that had been building and marketing special purpose computers for decades after Babbage and prior to the work of Church and Turing. It is far, far easier to provide theoretical backing to a broad category of devices which are already known to work than to invent out of whole cloth a field with absolutely no experimental validation.
        
        My current impression is that solving FAI with a hypercomputer is a fundamentally easier problem than solving it with a bounded computer, and it’s hard to say much about the second problem if we haven’t made steps towards solving the first one.
        
        The first statement is trivially true: everything is easier on a hypercomputer. But who cares? We don’t have hypercomputers.
        
        The second statement is the real meat of the argument—that “it’s hard to say much about the [tractable FAI] if we haven’t made steps towards solving the [uncomputable FAI].” While on the surface that seems like a sensible statement, I’m afraid your intuition fails you here.
        
        Experience with artificial intelligence has shown that there does not seem to be any single category of tractable algorithms which provides general intelligence. Rather we are faced with a dizzying array of special purpose intelligences which in no way resemble general models like AIXI, and the first superintelligences are likely to be some hodge-podge integration of multiple techniques. What we’ve learned from neuroscience and modern psychology basically backs this up: the human mind at least achieves its generality from a variety of techniques, not some easy-to-analyze general principle.
        
        It’s looking more and more likely that the tricks we will use to actually achieve general intelligence will not resemble in the slightest the simple unbounded models for general intelligence that MIRI currently plays with. It’s not unreasonable to wonder then whether an unbounded FAI proof would have any relevance to an AGI architecture which must be built on entirely different principles.
        
        I consider AGI in the next 10-15 years fairly unlikely, but it might be worth having FAI half-solutions by then, just in case. Unfortunately I don’t really know a good way to make half-solutions. I would like to hear if you have a plan for making these.
        
        The goal is to achieve a positive singularity, not friendly AI. The easiest way to do that on a short timescale is to not require friendliness at all. Use idiot-savant superintelligence only to solve the practical engineering challenges which prevent us from directly augmenting human intelligences, then push a large group of human beings through cognitive enhancement programmes in lock step.
        
        What does that mean in terms of a MIRI research agenda? Revisit boxing. Evaluate experimental setups that allow for a presumed-unfriendly machine intelligence but nevertheless have incentive structures or physical limitations which prevent it from going haywire. Devise traps, boxes, and tests for classifying how dangerous a machine intelligence is, and containment protocols. Develop categories of intelligences which lack the foundational social skills critical to manipulating their operators. Etc. Etc.
        
        Anyway, AIXI has been used to develop more practical algorithms.
        
        In section 8.2 of the very document you linked to, it is pointed out why stochastic AIXI will not scale to problems of real world complexity or useful planning horizons.
        jessicat 23 Mar 2015 6:31 UTC
        6 points
        Parent
        Thanks for the response. I should note that we don’t seem to disagree on the fact that a significant portion of AI safety research should be informed by practical considerations, including current algorithms. I’m currently getting a master’s degree in AI while doing work for MIRI, and a substantial portion of my work at MIRI is informed by my experience with more practical systems (including machine learning and probabilistic programming). The disagreement is more that you think that unbounded solutions are almost entirely useless, while I think they are quite useful.
        
        Rather we are faced with a dizzying array of special purpose intelligences which in no way resemble general models like AIXI, and the first superintelligences are likely to be some hodge-podge integration of multiple techniques.
        
        My intuition is that if you are saying that these techniques (or a hodgepodge of them) work, you are referring to some kind of criteria that they perform well on in different situations (e.g. ability to do supervised learning). Sometimes, we can prove that the algorithms perform well (as in statistical learning theory); other times, we can guess that they will perform on future data based on how they perform on past data (while being wary of context shifts). We can try to find ways of turning things that satisfy these criteria into components in a Friendly AI (or a safe utility satisficer etc.), without knowing exactly how these criteria are satisfied.
        
        Like, this seems similar to other ways of separating interface from implementation. We can define a machine learning algorithm without paying too much attention to what programming language it is programmed in, or how exactly the code gets compiled. We might even start from pure probability theory and then add independence assumptions when they increase performance. Some of the abstractions are leaky (for example, we might optimize our machine learning algorithm for good cache performance), but we don’t need to get bogged down in the details most of the time. We shouldn’t completely ignore the hardware, but we can still usefully abstract it.
        
        What does that mean in terms of a MIRI research agenda? Revisit boxing. Evaluate experimental setups that allow for a presumed-unfriendly machine intelligence but nevertheless have incentive structures or physical limitations which prevent it from going haywire. Devise traps, boxes, and tests for classifying how dangerous a machine intelligence is, and containment protocols. Develop categories of intelligences which lack the foundational social skills critical to manipulating their operators. Etc. Etc.
        
        I think this stuff is probably useful. Stuart Armstrong is working on some of these problems on the forum. I have thought about the “create a safe genie, use it to prevent existential risks, and have human researchers think about the full FAI problem over a long period of time” route, and I find it appealing sometimes. But there are quite a lot of theoretical issues in creating a safe genie!
        Mark_Friedenbach 24 Mar 2015 23:31 UTC
        3 points
        Parent
        
        I have thought about the “create a safe genie, use it to prevent existential risks, and have human researchers think about the full FAI problem over a long period of time” route, and I find it appealing sometimes. But there are quite a lot of theoretical issues in creating a safe genie!
        
        That is absolutely not a route I would consider. If that’s what you took away from my suggestion, please re-read it! My suggestion is that MIRI should consider pathways to leveraging superintelligence which don’t involve agent-y processes (genies) at all. Processes which are incapable of taking action themselves, and whose internal processes are real-time audited and programmatically constrained to make deception detectable. Tools used as cognitive enhancers, not stand-alone cognitive artifacts with their own in-built goals.
        
        SIAI spent a decade building up awareness of the problems that arise from superintelligent machine agents. MIRI has presumed from the start that the way to counteract this threat is to build a provably-safe agent. I have argued that this is the wrong lesson to draw—the better path forward is to not create non-human agents of any type, at all!
        Kaj_Sotala 26 Mar 2015 8:47 UTC
        5 points
        Parent
        How would you prevent others from building agent-type AIs, though?
        Gram_Stone 26 Mar 2015 10:56 UTC
        2 points
        Parent
        For one, even a ‘tool’ could return a catastrophic solution that humans might unwittingly implement. Secondly, it’s conceivable that ‘tool AIs’ can ‘spontaneously agentize’, and you might as well try to build an agent on purpose for the sake of greater predictability and transparency. That is, as soon as you talk about leveraging ‘superintelligence’ rather than ‘intelligence’, you’re talking about software with qualitatively different algorithms; software that not only searches for solutions but goes about planning how to do it. (You might say, “Ah, but that’s where your mistake begins! We shouldn’t let it plan! That’s too agent-y!” Then it ceases to be superintelligence. Those are the cognitive tasks that we would be outsourcing.) It seems that at a certain point on a scale of intelligence, tool AIs move quickly from ‘not unprecedentedly useful’ to ‘just as dangerous as agents’, and thus are not worth pursuing.
        
        There’s a more nuanced approach to what I’ve said above. I’ve really never understood all of the fuss about whether we should use tools, oracles, genies, or sovereigns. The differences seem irrelevant. ‘Don’t design it such that it has goal-directed behavior,’ or ‘design it such that it must demonstrate solutions instead of performing them,’ or ‘design it such that it can only act on our command,’ seem like they’re in a similar class of mistake as ‘design the AI so that it values our happiness’ or some such; like it’s the sort of solution that you propose when you haven’t thought about the problem in enough technical detail and you’ve only talked about it in natural language. I’ve always thought of ‘agent’ as a term of convenience. Powerful optimization processes happen to produce effects similar to the effects produced by the things to which we refer when we discuss ‘agents’ in natural language. Natural language is convenient, but imprecise; ultimately, we’re talking about optimization processes in every case. Those are all ad hoc safety procedures. Far be it from me to speak for them, but I don’t interpret MIRI as advocating agents over everything else per se, so much as advocating formally verified optimization processes over optimization processes constrained by ad hoc safety procedures, and speaking of ‘agents’ is the most accurate way to state one facet of that advocacy in natural language.
        
        To summarize: The difference between tool AIs and agents is the difference between a human perceiving an optimization process in non-teleological and teleological terms, respectively. If the optimization process itself is provably safe, then the ad hoc safety procedures (‘no explicitly goal-directed behavior,’ ‘demonstrations only; no actions,’ etc.) will be unnecessary; if the optimization process is not safe, then the ad hoc safety procedures will be insufficient; given these points, conceiving of AGIs as tools is a distraction from other work.
        
        EDIT: I’ve been looking around since I wrote this, and I’m highly encouraged that Vladimir_Nesov and Eliezer have made similar points about tools, and Eliezer has also made a similar point about oracles. My point generalizes their points: Optimization power is what makes AGI useful and what makes it dangerous. Optimization processes hit low probability targets in large search spaces, and the target is a ‘goal.’ Tools aren’t AIs ‘without’ goals, as if that would mean anything; they’re AIs with implicit, unspecified goals. You’re not making them Not-Goal-Directed; you’re unnecessarily leaving the goals up for grabs.
        Kaj_Sotala 26 Mar 2015 19:59 UTC
        3 points
        Parent
        
        Basically time spent now on unbounded AIXI constructs is probably completely wasted. Real AGIs don’t have Solomonoff inductors or anything resembling them.
        
        I wouldn’t say that the time studying AIXI-like models is completely wasted, even if real AGIs turned out to have very little to do with AIXI. Even if AIXI approximation isn’t the way that actual AGI will be built, to the extent that the behavior of a rational agent resembles the model of AIXI, studying models of AIXI can still give hints of what needs to be considered in AGI design. lukeprog and Bill Hibbard advanced this argument in Exploratory Engineering in AI:
        
        ...some experts think AIXI approximation isn’t a fruitful path toward human-level AI. Even if that’s true, AIXI is the first model of cross-domain intelligent behavior to be so completely and formally specified that we can use it to make formal arguments about the properties which would obtain in certain classes of hypothetical agents if we could build them today. Moreover, the formality of AIXI-like agents allows researchers to uncover potential safety problems with AI agents of increasingly general capability—problems which could be addressed by additional research, as happened in the field of computer security after Lampson’s article on the confinement problem.
        
        AIXI-like agents model a critical property of future AI systems: that they will need to explore and learn models of the world. This distinguishes AIXI-like agents from current systems that use predefined world models, or learn parameters of predefined world models. Existing verification techniques for autonomous agents (Fisher, Dennis, and Webster 2013) apply only to particular systems, and to avoiding unwanted optima in specific utility functions. In contrast, the problems described below apply to broad classes of agents, such as those that seek to maximize rewards from the environment.
        
        For example, in 2011 Mark Ring and Laurent Orseau analyzed some classes of AIXI-like agents to show that several kinds of advanced agents will maximize their rewards by taking direct control of their input stimuli (Ring and Orseau 2011). To understand what this means, recall the experiments of the 1950s in which rats could push a lever to activate a wire connected to the reward circuitry in their brains. The rats pressed the lever again and again, even to the exclusion of eating. Once the rats were given direct control of the input stimuli to their reward circuitry, they stopped bothering with more indirect ways of stimulating their reward circuitry, such as eating. Some humans also engage in this kind of “wireheading” behavior when they discover that they can directly modify the input stimuli to their brain’s reward circuitry by consuming addictive narcotics. What Ring and Orseau showed was that some classes of artificial agents will wirehead; that is, they will behave like drug addicts.
        
        Fortunately, there may be some ways to avoid the problem. In their 2011 paper, Ring and Orseau showed that some types of agents will resist wireheading. And in 2012, Bill Hibbard (2012) showed that the wireheading problem can also be avoided if three conditions are met: (1) the agent has some foreknowledge of a stochastic environment, (2) the agent uses a utility function instead of a reward function, and (3) we define the agent’s utility function in terms of its internal mental model of the environment. Hibbard’s solution was inspired by thinking about how humans solve the wireheading problem: we can stimulate the reward circuitry in our brains with drugs, yet most of us avoid this temptation because our models of the world tell us that drug addiction will change our motives in ways that are bad according to our current preferences.
        
        Relatedly, Daniel Dewey (2011) showed that in general, AIXI-like agents will locate and modify the parts of their environment that generate their rewards. For example, an agent dependent on rewards from human users will seek to replace those humans with a mechanism that gives rewards more reliably. As a potential solution to this problem, Dewey proposed a new class of agents called value learners, which can be designed to learn and satisfy any initially unknown preferences, so long as the agent’s designers provide it with an idea of what constitutes evidence about those preferences.
        
        Houshalter 23 Mar 2015 14:45 UTC
        3 points
        Parent
        I think a very important application of AI safety ideas is self-driving cars. This is a domain where traditional AI methods can’t be applied straightforwardly. You can’t merely have an algorithm take in input and predict what a human would do. Otherwise it will just drive like it predicts a human would. You can’t have it get into accidents, so training data is limited.
        hairyfigment 23 Mar 2015 17:47 UTC
        0 points
        Parent
        As people have said at length, AIXI is not a solution even in principle. Hence MIRI’s work on an actual theory of AI and FAI. Speaking of, I’ve said this before, but I’ll state it now more starkly: your timeline seems as delusional to me as MIRI apparently seems to you.
        Mark_Friedenbach 23 Mar 2015 18:47 UTC
        4 points
        Parent
        
        Speaking of, I’ve said this before, but I’ll state it now more starkly: your timeline seems as delusional to me as MIRI apparently seems to you.
        
        That’s great, I’d love to engage with you on that. What timeline would you give higher probability to, and why?
        hairyfigment 23 Mar 2015 20:13 UTC
        −3 points
        Parent
        I roughly agree with Luke—that would be the director of MIRI—in placing the median close to 2070.
        Mark_Friedenbach 23 Mar 2015 20:23 UTC
        0 points
        Parent
        What about the second half of the question, why?
        hairyfigment 23 Mar 2015 23:14 UTC
        0 points
        Parent
        Seriously?
        
        Experience tells us to discount predictions of imminent AGI, to the point where only the strongest of reasons can overcome this. If AIXI represented a large enough increase in understanding of what we’re even talking about, that could be part of a strong argument. But as I said in the great-grandparent, it doesn’t.
        Mark_Friedenbach 25 Mar 2015 0:07 UTC
        4 points
        Parent
        Past predictive accuracy of expert opinions on the subject of AI superintelligence tells us nothing about what to infer from current predictions. If superintelligent AI were to actually arrive tomorrow, or 50 years from now, or 150 years from now, there would be no discernible difference in present expert opinion. On these sorts of subjects expert opinion is totally uncorrelated with reality. So no, experience tells us nothing about predictions of imminent or non-imminent AGI. We can thank our own Stuart Armstrong for this contribution.
        
        But hey, let’s take 2070 at face value. That’d be great news! We could completely forget about the existential threat due to unfriendly AI. After all, it’d be decades after even pessimistic estimates of when whole-brain emulation[1] enables the first uploaded human intelligences. And a decade or so further after atomically precise manufacturing[2] gives us the tools to do in vivo[3] intelligence enhancement. By 2070 we’d already be in a world of human-derived superintelligences, so thankfully we needn’t fret over our own biological limitations preventing us from keeping pace with superintelligent AI.
        
        Or is that not the future you imagined?
        
        [1] http://www.fhi.ox.ac.uk/brain-emulation-roadmap-report.pdf
        [2] https://www.foresight.org/roadmaps/Nanotech_Roadmap_2007_main.pdf
        [3] http://www.nanomedicine.com/
Squark 19 Mar 2015 18:49 UTC
15 points
Great idea, well done!

However: Is it really the case that it’s impossible to login without Facebook? Why?
- orthonormal 19 Mar 2015 19:11 UTC
  3 points
  Parent
  Yes, for now. Authentication with Facebook (or Google) is one of the simpler methods for avoiding spambots.
  - joaolkf 21 Mar 2015 16:03 UTC
    15 points
    Parent
    I don’t currently have a Facebook account, and I know a lot of very productive people here in Oxford who have decided not to have one either (e.g., Nick and Anders don’t have one). I think adding the option to authenticate via Google is an immediate necessity.
    - orthonormal 21 Mar 2015 23:41 UTC
      5 points
      Parent
      Developer time is a big bottleneck, but I agree with you that adding the option for Google authentication is a high priority.
  - Mark_Friedenbach 21 Mar 2015 16:48 UTC
    5 points
    Parent
    Some people don’t have Facebook accounts, and would prefer not to.
Kaj_Sotala 23 Mar 2015 7:02 UTC
7 points
I’m not sure the current implementation of the tiered privileges system is optimal. For instance, after a link I posted got two likes, it became visible to everyone, but it looks like my reply to your comment isn’t visible if one isn’t logged in. I think that once a non-member link meets the necessary threshold for becoming visible to everyone, the poster’s replies to comments in the thread should become visible as well, or otherwise it’s just confusing.

Also, I feel that if new contributor comments are hidden in general, that might be a little too discouraging for new people if those comments also need to acquire two member likes in order to become visible.
- orthonormal 24 Mar 2015 1:01 UTC
  3 points
  Parent
  
  I think that once a non-member link meets the necessary threshold for becoming visible to everyone, the poster’s replies to comments in the thread should become visible as well, or otherwise it’s just confusing.
  
  That seems like a good feature request.
Evan_Gaensbauer 19 Mar 2015 15:56 UTC
6 points
As a supporter of effective altruism who’s interested in risks from superintelligence, where should I post now? What I mean by “interested” is someone who is sympathetic to the reasoning but skeptical about the tractability of the work MIRI and their allies do. For myself as well as on behalf of friends who share my concern, I would ask of MIRI what they think of the cause as a whole, how they think it will change in light of e.g., Elon Musk’s donation to FLI and growing publicity, and the Open Philanthropy Project’s investigation of the cause. I have one friend earning to give, a few acquaintances doing the same, and at least a couple friends my age in university still who may do so. As potential future donors and vocal supporters, provision of information would help each of us as individuals reach better conclusions. I wouldn’t demand Eliezer Yudkowsky or Luke Muehlhauser respond, really just someone affiliated with the organization. Could I expect something like that on LessWrong still, or has everyone working for MIRI disappeared from LessWrong forever to go to this new forum? My questions wouldn’t be on technical research, but I’d like to know where and how to address questions and concerns to the organization.
- orthonormal 19 Mar 2015 17:38 UTC
  10 points
  Parent
  The new forum will not cause people to disappear from Less Wrong, I expect. In fact, note the narrow focus of the IAFF:
  
  Examples of subjects that are off-topic (and which moderators might close down) include recent advances in artificial intelligence and machine learning, existential risks, effective altruism, human rationality, general mathematical logic, general philosophy of mind, and anything that cannot (yet) be usefully formalized or modeled mathematically.
  
  Less Wrong continues to be a great place to discuss those topics (which include the topics you’re interested in). And as for the technical topics, I hope to make some posts on here for the more digestible forum work.
  - Evan_Gaensbauer 20 Mar 2015 1:01 UTC
    6 points
    Parent
    Thanks for the feedback!
Richard_Kennaway 19 Mar 2015 10:58 UTC
6 points
Does it have RSS feeds? I didn’t see any.
- Malo 19 Mar 2015 16:15 UTC
  10 points
  Parent
  It does: http://agentfoundations.org/rss
  
  The link to it is the last thing in the right sidebar. It says RSS in green.
- dottedmag 7 Apr 2015 5:36 UTC
  0 points
  Parent
  I have created a filtered feed for posts only: http://pipes.yahoo.com/pipes/pipe.run?_id=a52fa6d33e7bdc1ebac781f82bd68c51&_render=rss
ETranshumanist 19 Mar 2015 14:46 UTC
4 points
Your link throws a 502 error. Is the server down?
- Vladimir_Nesov 19 Mar 2015 14:52 UTC
  4 points
  Parent
  Same for me, and it was working half an hour ago.
  - Malo 19 Mar 2015 16:17 UTC
    11 points
    Parent
    The server was down, but it is back up again now.
    - ETranshumanist 19 Mar 2015 16:59 UTC
      3 points
      Parent
      Thank you!