V_V comments on LINK: AI Researcher Yann LeCun on AI function

V_V 12 Dec 2013 18:40 UTC
0 points

(a) Hidden Markov models and POMDPs are probabilistic models, not necessarily Bayesian.

According to Wikipedia:

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. A HMM can be considered the simplest dynamic Bayesian network.

(b) I am using the standard definition of a causal model, first due to Neyman, popularized by Rubin. Everyone except some folks in the UK use this definition now. I am sorry if you are unfamiliar with it.

I suppose you mean this.

It seems to be a framework for the estimation of probability distributions from experimental data, under some independence assumptions.

(c) Statistical models cannot solve causal problems. The number of times you repeat the opposite, while adding the word “clearly” will not affect this fact.

You still didn’t define “causal problem” and what you mean by “solve” in this context.
- IlyaShpitser 12 Dec 2013 20:07 UTC
  2 points
  Parent
  A “Bayesian network” is not necessarily a Bayesian model. Bayesian networks can be used with frequentist methods, and frequently are (see: the PC algorithm). I believe Pearl called the networks “Bayesian” to honor Bayes, and because of the way Bayes’ theorem is used when you shuffle probabilities around. The model does not necessitate Bayesian methods at all.
  
  I don’t mean to be rude, but are we operating at the level of string pattern matching, and google searches here?
  
  You still didn’t define “causal problem” and what you mean by “solve” in this context.
  
  Sociological definition: “a causal problem” is a problem that people who do causal inference study. Estimating causal effects. Learning cause-effect relationships from data. Mediation analysis. Interference analysis. Decision theory problems. To “solve” means to get the right answer and thereby avoid going to jail for malpractice.
  
  This is a bizarre conversation. Causal problems aren’t something esoteric. Imagine if you kept insisting I define what an algebra problem is. There are all sorts of things you could read on this standard topic.
  - Lumifer 12 Dec 2013 20:37 UTC
    1 point
    Parent
    
    This is a bizarre conversation.
    
    Looks like a perfectly normal conversation where people insist on using different terminology sets :-/
    - IlyaShpitser 12 Dec 2013 20:52 UTC
      0 points
      Parent
      One of these people has a good reason for preferring his terminology (e.g. it’s standard, it’s what everyone in the field actually uses, etc.). “Scott, can you define what a qubit is?”, etc.
      - Lumifer 12 Dec 2013 21:16 UTC
        0 points
        Parent
        
        it’s what everyone in the field actually uses
        
        Yes, but you are talking to people outside of the field.
        
        For example you tend to use the expression “prediction model” as an antonym to “causal model”. That may be standard in your field, but that’s not what it means outside of it.
        - IlyaShpitser 12 Dec 2013 21:49 UTC
          0 points
          Parent
          
          For example you tend to use the expression “prediction model” as an antonym to “causal model”.
          
          Not an antonym, just a different thing that should not be confused. A qubit is a very different thing from a bit, with different properties.
          
          That may be standard in your field, but that’s not what it means outside of it.
          
          “Sure, this definition of truth may be standard in your field, Prof. Tarski, but that’s not what we mean!” I guess we are done, then! Thanks for your time.
  - V_V 12 Dec 2013 22:36 UTC
    0 points
    Parent
    
    A “Bayesian network” is not necessarily a Bayesian model. Bayesian networks can be used with frequentist methods, and frequently are (see: the PC algorithm).
    
    You can use frequentist methods to learn Bayesian networks from data, as with any other Bayesian model.
    
    And you can also use Bayesian networks without priors to do things like maximum likelihood estimation, which isn’t Bayesian sensu stricto, but I don’t think this is relevant to this conversation, is it?
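    
    To make the distinction concrete, here is a minimal sketch, assuming a made-up two-node network Rain -> Wet with invented observations. The function names and the Beta(1, 1) prior are illustrative assumptions, not anything from the thread. It contrasts the maximum likelihood estimate of one conditional probability table entry with a Bayesian posterior mean for the same entry:

```python
from collections import Counter

# Toy data for a two-node network Rain -> Wet.
# Each pair is (rain, wet); the observations are made up for illustration.
data = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (1, 1)]

counts = Counter(data)


def mle_wet_given_rain(rain):
    """Maximum likelihood estimate of P(wet=1 | rain): a relative frequency.

    No prior is involved, so this is not Bayesian in the strict sense,
    even though the model is called a "Bayesian network".
    """
    n = counts[(rain, 0)] + counts[(rain, 1)]
    return counts[(rain, 1)] / n


def posterior_mean_wet_given_rain(rain, a=1.0, b=1.0):
    """Posterior mean of P(wet=1 | rain) under an assumed Beta(a, b) prior."""
    n = counts[(rain, 0)] + counts[(rain, 1)]
    return (counts[(rain, 1)] + a) / (n + a + b)


for rain in (0, 1):
    print(rain, mle_wet_given_rain(rain), posterior_mean_wet_given_rain(rain))
```

    Both estimators use the same graph and the same factorization; only the treatment of the parameters differs.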
    
    I don’t mean to be rude, but are we operating at the level of string pattern matching, and google searches here?
    
    No, we are operating at the level of trying to make sense of your claims.
    
    Sociological definition: “a causal problem” is a problem that people who do causal inference study. Estimating causal effects. Learning cause-effect relationships from data. Mediation analysis. Interference analysis. Decision theory problems. To “solve” means to get the right answer and thereby avoid going to jail for malpractice.
    
    Please try to reformulate without using the word “cause/causal”.
    The term has multiple meanings. You may be using one of them, assuming that everybody shares it, but that’s not obvious.
    - IlyaShpitser 12 Dec 2013 23:00 UTC
      1 point
      Parent
      I operate within the interventionist school of causality, whereby a causal effect has something to do with how interventions affect outcome variables. This is of course not the only formalization of causality; there are many, many others. However, this particular one has been very influential, almost universally adopted among the empirical sciences, corresponds very closely to people’s causal intuitions in many important respects (and has the mathematical machinery to move far beyond when intuitions fail), and has a number of other nice advantages I don’t have the space to get into here (for example it helped to completely crack open the “what’s the dimension of a hidden variable DAG” problem).
      
      One consequence of the conceptual success of the interventionist school is that there is now a long list of properties we think a formalization of causality has to satisfy (that were first figured out within the interventionist framework). So we can now rule out bad formalizations of causality fairly easily.
      
      I think getting into the interventionist school is too long for even a top level post, let alone a response post buried many levels deep in a thread. If you are interested, you can read a book about it (Pearl’s book for example), or some papers.
      
      Prediction algorithms, as used in ML today, completely fail on interventionist causal problems, which correspond, loosely speaking, to trying to figure out the effect of a randomized trial from observational data. I am not trying to give them a hard time about it, because that’s not what the emphasis in ML is, which is perfectly fine!
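      
      A minimal sketch of that gap, assuming a made-up data-generating process with a single binary confounder `z` that drives both the treatment `x` and the outcome `y`. All names and probabilities are illustrative assumptions. The true effect of setting `x` is 0.1 by construction, yet the purely predictive contrast E[y | x=1] - E[y | x=0] comes out near 0.5:

```python
import random


def simulate(n=100_000, seed=0, intervene=None):
    """Draw n samples (z, x, y) from a toy model with z -> x, z -> y, x -> y.

    If intervene is 0 or 1, x is set by fiat (one arm of a randomized
    trial) instead of being driven by the confounder z.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5
        if intervene is None:
            x = rng.random() < (0.9 if z else 0.1)  # observational regime
        else:
            x = bool(intervene)                     # interventional regime
        y = rng.random() < 0.2 + 0.1 * x + 0.5 * z  # true effect of x is 0.1
        rows.append((z, x, y))
    return rows


def mean_y(rows, x=None, z=None):
    """Average outcome among rows matching the given x and/or z values."""
    sel = [r[2] for r in rows
           if (x is None or r[1] == x) and (z is None or r[0] == z)]
    return sum(sel) / len(sel)


obs = simulate()

# Predictive contrast: conditions on x, which also carries information about z.
naive = mean_y(obs, x=True) - mean_y(obs, x=False)

# Adjustment for z, valid only because we assume z is the sole confounder.
p_z1 = sum(r[0] for r in obs) / len(obs)
adjusted = sum(
    w * (mean_y(obs, x=True, z=zv) - mean_y(obs, x=False, z=zv))
    for zv, w in ((True, p_z1), (False, 1 - p_z1))
)

# Simulated randomized trial for comparison.
rct = mean_y(simulate(seed=1, intervene=1)) - mean_y(simulate(seed=2, intervene=0))

print(f"naive={naive:.3f} adjusted={adjusted:.3f} rct={rct:.3f}")
```

      The adjusted estimate recovers roughly 0.1 from observational data alone, but only because the causal structure was assumed up front; the data by themselves do not say that `z` is the variable to adjust for.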
      
      You can think of this problem as just another type of “prediction problem,” but this word usage simply does not conform to what people in ML mean by “prediction.” There is an entirely different theory, etc.