sen comments on Which areas of rationality are underexplored? - Discussion Thread

sen 3 Dec 2016 7:50 UTC
13 points
Non-Bayesian reasoning. Seriously, pretty much everything here is about experimentation, conditional probabilities, and logical fallacies, and all of the above are derived from Bayesian reasoning. Yes, these things are important, but there’s more to science and modeling than learning to deal with uncertainty.

Take a look at the Wikipedia page on the Standard Model of particle physics, and count the number of times uncertainty and Bayesian reasoning are mentioned. If your number is greater than zero, then they must have changed the page recently. Bayesian reasoning tells you what to expect given an existing set of beliefs. It doesn’t tell you how to develop those underlying beliefs in the first place. For much of physics, that’s pretty much squarely in the domain of group theory / symmetry. It’s ironic that a group so heavily based on the sciences doesn’t mention this at all.

Rationality is about more than empirical studies. It’s about developing sensible models of the world. It’s about conveying sensible models to people in ways that they’ll understand them. It’s about convincing people that your model is better than theirs, sometimes without having to do an experiment.

It’s not like these things aren’t well-studied. It’s called math, and it’s been studied for thousands of years. Everything on this site focuses on one tiny branch, and there’s so much more out there.

Apologies for the rant. This has been bugging me for a while now. I tried to create a thread on this a little while ago and met with the karma limitation. I didn’t want to deal with it at the time, and now it’s all coming back to me, rage and all.

Also, this discussion topic is suboptimal if your aim is to explore new areas of rationality, as it presumes that all unexplored areas will arise from direct discussion. It should have been paired with the question “How do we discover underexplored areas of rationality?” My answer to that is to encourage non-rational discussion where people believe, intuitively or otherwise, that it should be possible to make the discussion rational. You’re not going to discover the boundaries of rationality by always staying within them. You need to look both outside and inside to see where the boundary might lie, and you need to understand non-rationality if you ever want any hope of expanding the boundaries of rationality.

End rant.
- Johannes Treutlein 10 Jan 2017 13:22 UTC
  4 points
  Parent
  
  Rationality is about more than empirical studies. It’s about developing sensible models of the world. It’s about conveying sensible models to people in ways that they’ll understand them. It’s about convincing people that your model is better than theirs, sometimes without having to do an experiment.
  
  Hmm, I’m not sure I understand what you mean. Maybe I’m missing something? Isn’t this exactly what Bayesianism is about? Bayesianism is just using laws of probability theory to build an understanding of the world, given all the evidence that we encounter. Of course that’s at the core just plain math. E.g., when Albert Einstein thought of relativity, that was an insight without having done any experiment, but it is perfectly in accordance with Bayesianism.
  
  Bayesian probability theory seems to be all we need to find out truths about the universe. In this framework, we can explain stuff like “Occam’s Razor” in a formal way, and we can even include Popperian reasoning as a special case (a hypothesis has to condense probability mass on some of the outcomes in order to be useful. If you then receive evidence that would have been very unlikely given the hypothesis, we shift down the hypothesis’ probability a lot (=falsification). If we receive confirming evidence that could have been explained just as well by other theories, this only slightly upshifts our probability; see EY’s introduction.) But maybe this is not the point that you were trying to make?
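
To make the falsification point above concrete, here is a minimal sketch of a single Bayesian update for a hypothesis H against its alternatives. The prior and the likelihoods are made-up illustrative numbers, not taken from any real experiment, and `posterior` is just a name chosen for this example.

```python
def posterior(prior, likelihood_h, likelihood_not_h):
    """Return P(H | E) from P(H), P(E | H), and P(E | not H) via Bayes' theorem."""
    evidence = prior * likelihood_h + (1.0 - prior) * likelihood_not_h
    return prior * likelihood_h / evidence


PRIOR = 0.5  # illustrative starting credence in hypothesis H (an assumption)

# "Falsification": E was very unlikely under H but unremarkable otherwise.
after_surprise = posterior(PRIOR, likelihood_h=0.01, likelihood_not_h=0.5)

# Weak confirmation: E was likely under H, but almost as likely under rivals.
after_weak_support = posterior(PRIOR, likelihood_h=0.9, likelihood_not_h=0.8)

print(f"after surprising evidence:        {after_surprise:.3f}")
print(f"after weakly confirming evidence: {after_weak_support:.3f}")
```

With these numbers the surprising observation pushes the credence in H far below the prior, while the weakly confirming observation moves it only slightly above 0.5.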
  
  I also think that EY is not Bayesian sometimes. He often assigns something 100 per cent probability without any empirical evidence, just because of the simplicity and beauty of the theory. For example, that MWI is the correct interpretation of QM. But if you put 0 probability on something (other interpretations), it can’t be updated by any evidence.
  
  Hmm, I’m quite confident (not 100%) that he’s just assigning a very high probability to it, since it seems to be the way more parsimonious and computationally “shorter” explanation, but of course not 100% :) (see Occam’s razor link above for why Bayesians give shorter explanations more a priori credence.)
  
  Regarding Kuhnianism: Maybe it’s a good theory of how the social progress of science works, but how does it help me with having more accurate beliefs about the world? I don’t know much about it, so would be curious about relevant information! :)
- btrettel 8 Dec 2016 5:22 UTC
  2 points
  Parent
  Is there a single book or resource you would recommend for learning how group theory/symmetry can be used to develop theories and models?
  
  I work in fluid dynamics, and I’ve mainly seen group theory/symmetry mentioned when forming simplifying coordinate transformations. Fluid dynamicists call these “dimensionless parameters” or “similarity variables”. I am certain other fields use different terminology.
  - sen 9 Dec 2016 10:44 UTC
    2 points
    Parent
    See my response below to WhySpace on getting started with group theory through category theory. For any space-oriented field, I also recommend looking at the topological definition of a space. Also, for any calculus-heavy field, I recommend meditating on the Method of Lagrange Multipliers if you don’t already have a visual grasp of it.
    
    I don’t know of any resource that tackles the problem of developing models via group theory. Developing models is a problem of stating and applying analogies, which is a problem in category theory. If you want to understand that better, you can look through the various classifications of functors since the notion of a functor translates pretty accurately to “analogy”.
    
    I have no background in fluid dynamics, so please filter everything I say here through your own understanding, and please correct me if I’m wrong somewhere.
    
    I don’t think there’s any inherent relationship between dimensionless parameters and group theory. The reason is that dimensionless quantities can refer to too many things (i.e., they’re not really dimensionless, and different dimensionlessnesses have different properties… or rather they may be dimensionless, but they’re not typeless). Consider that the !∘sqrt∘ln of a dimensionless quantity is also technically a dimensionless quantity while also being almost-certainly useless and uninterpretable. I suppose if you can rewrite an equation in terms of dimensionless quantities whose relationships are restricted to have certain properties, then you can treat them like other well-known objects, and you can throw way more math at them.
    
    For example, suppose your “dimensionless” quantity is a scaling parameter such that scale * scale → scale (the product of two scaling operations is equivalent to a single scaling operation). By converting your values to scales, you’ve gained a new operation to work with due to not having to re-translate your quantities on each successive multiplication: element-wise exponentiation. I’d personally see that as a gateway to applying generating series (because who doesn’t love generating series?), but I guess a more mechanics-y application of that would be solving differential equations, which often require exponentiating things.
    
    Any time you have a set of X quantities that can be applied to one another to get another of the X quantities, you have a group of some sort (with some exceptions). That’s what’s going on with the scaling example (x * x → x), and that’s what’s not going on with the !∘sqrt∘ln example. The scaling example just happens to be a particularly simple example of a group. You get less trivial examples when you have multiple “dimensionless” quantities that can interact with one another in standard ways. For example, if vector addition, scaling, and dot products are sensible, your vectors can form a Hilbert space, and you can use wonderful things like angles and vector calculus to meaningful effect.
    
    I can probably give a better answer if I know more precisely what you’re referring to. Do you have examples of fluid dynamicists simplifying equations and citing group theory as the justification?
    - btrettel 10 Dec 2016 1:26 UTC
      5 points
      Parent
      Thanks for the detailed reply, sen. I don’t follow everything you said, but I’ll take a look at your recommendations and see after that.
      
      I can probably give a better answer if I know more precisely what you’re referring to. Do you have examples of fluid dynamicists simplifying equations and citing group theory as the justification?
      
      Unfortunately, the subject is rather disjoint. Most fluid dynamicists would have no idea that group theory is relevant. My impression is that some mathematicians have interpreted what fluid dynamicists have done for a long time in terms of group theory, and extended their methods. Fluid dynamicists call the approach “dimensional analysis” if you reduce the number of input parameters or “similarity analysis” if you reduce the number of independent variables of a differential equation (more on the latter later).
      
      The goal generally is dimension reduction. For example, if you are to perform a simple factorial experiment with 3 variables and you want to sample 8 different values of each variable, you have 8^3 = 512 samples to make, and that’s not even considering doing multiple trials. But, if you can determine a coordinate transformation which reduces those 3 variables to 1, then you only have 8 samples to make.
      
      The Buckingham Pi theorem allows you to determine how many dimensionless variables are needed to fully specify the problem if you start with dimensional quantities. (If everything is dimensionless to begin with, there’s no benefit from this technique, but other techniques might have benefit.)
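
As a rough sketch of the counting step described above: the number of dimensionless groups is the number of dimensional variables minus the rank of the dimensional matrix. The example problem (drag force F on a sphere as a function of density rho, viscosity mu, speed U, and diameter D) and the choice of 8 levels per variable are assumptions picked for illustration, and `matrix_rank` is a small helper written for this example rather than a library call.

```python
from fractions import Fraction

# Assumed example: drag on a sphere. Columns are F, rho, mu, U, D.
# Rows hold the exponents of mass (M), length (L), and time (T).
VARIABLES = ["F", "rho", "mu", "U", "D"]
DIMENSION_ROWS = [
    [1, 1, 1, 0, 0],     # M
    [1, -3, -1, 1, 1],   # L
    [-2, 0, -1, -1, 0],  # T
]


def matrix_rank(rows):
    """Rank of a small matrix by Gauss-Jordan elimination in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


rank = matrix_rank(DIMENSION_ROWS)
n_pi_groups = len(VARIABLES) - rank

LEVELS = 8  # assumed number of sampled values per variable
inputs_dimensional = len(VARIABLES) - 1   # everything except the measured F
inputs_dimensionless = n_pi_groups - 1    # everything except the drag group

print("dimensionless groups needed:", n_pi_groups)
print("factorial samples, dimensional inputs:   ", LEVELS ** inputs_dimensional)
print("factorial samples, dimensionless inputs: ", LEVELS ** inputs_dimensionless)
```

For this assumed problem the two groups can be taken as a drag coefficient and the Reynolds number, so a factorial sweep only needs to cover one dimensionless input instead of four dimensional ones.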
      
      For a long list of examples of the dimensionless quantities, see Wikipedia. The Reynolds number is the most well known of these. (Also, contrary to common understanding, the Reynolds number doesn’t really say anything about “how turbulent” a flow is, rather, it would be better thought of as a way to characterize instability of a flow. There are multiple ways to measure “how turbulent” a flow is.)
      
      For a “similarity variable”, I’m not sure what the best place to point them out would be. Here’s one example, though: If you take the 1D unbounded heat equation and change coordinates to \eta = x / \sqrt{\alpha t} (\alpha is the thermal diffusivity), you’ll find the PDE is reduced to an ODE, and the solution should be much easier now. The derivation of the reduction to an ODE is not on Wikipedia, but it is very straightforward.
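
Here is a small numerical check of that reduction, as a sketch rather than a derivation. Writing u(x, t) = f(\eta) and substituting into u_t = \alpha u_xx gives the ODE f'' + (\eta / 2) f' = 0, and one solution is f(\eta) = erfc(\eta / 2). The boundary conditions implied by that choice (f = 1 at \eta = 0, f → 0 as \eta → ∞), the diffusivity value, and the function names are all assumptions made for this example.

```python
import math

ALPHA = 1.0e-4  # thermal diffusivity in m^2/s (illustrative value)


def eta(x, t, alpha=ALPHA):
    """Similarity variable eta = x / sqrt(alpha * t)."""
    return x / math.sqrt(alpha * t)


def f(eta_value):
    """A solution of the reduced ODE f'' + (eta / 2) f' = 0."""
    return math.erfc(eta_value / 2.0)


def u(x, t, alpha=ALPHA):
    """Temperature field expressed through the similarity variable only."""
    return f(eta(x, t, alpha))


def pde_residual(x, t, alpha=ALPHA, hx=1.0e-4, ht=1.0e-3):
    """Central-difference estimate of u_t - alpha * u_xx at (x, t)."""
    u_t = (u(x, t + ht, alpha) - u(x, t - ht, alpha)) / (2.0 * ht)
    u_xx = (u(x + hx, t, alpha) - 2.0 * u(x, t, alpha) + u(x - hx, t, alpha)) / hx**2
    return u_t - alpha * u_xx


# Two different (x, t) points that share the same eta give the same temperature.
same_eta_gap = abs(u(0.02, 1.0) - u(0.04, 4.0))
residual = pde_residual(0.02, 1.0)

print("difference between points with equal eta:", same_eta_gap)
print("finite-difference PDE residual:", residual)
```

Both printed numbers should be close to zero: the first because the solution depends on x and t only through \eta, the second because the ODE solution also satisfies the original PDE up to finite-difference error.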
      
      Dimensional analysis is really only taught to engineers working on fluid mechanics and heat transfer. I am continually surprised by how few people are aware of it. It should be part of the undergraduate curriculum for any degree in physics. Statisticians, particularly those who work in experimental design, also should know it. Here’s an interesting video of a talk with an application of dimensional analysis to experimental design. As I recall, one of the questions asked after the talk related the approach to Lie groups.
      
      For an engineering viewpoint, I’d recommend Langhaar’s book. This book does not discuss similarity variables, however. For something bridging the more mathematical and engineering viewpoints I have one recommendation. I haven’t looked at this book, but it’s one of the few I could find which discusses both the Buckingham Pi theorem and Lie groups. For something purely on the group theory side, see Olver’s book.
      
      Anyhow, I asked about this because I get the impression from some physicists that there’s more to applications of group theory to building models than what I’ve seen.
      
      Consider that the !∘sqrt∘ln of a dimensionless quantity is also technically a dimensionless quantity while also being almost-certainly useless and uninterpretable.
      
      This is an important realization. The Buckingham Pi theorem doesn’t tell you which dimensionless variables are “valid” or “useful”, just the number of them needed to fully specify the problem. Whether or not a dimensionless number is “valid” or “useful” depends on what you are interested in.
      
      Edit: Fixed some typos.
      - sen 10 Dec 2016 13:02 UTC
        3 points
        Parent
        Regarding the Buckingham Pi Theorem (BPT), I think I can double my recommendation that you try to understand the Method of Lagrange Multipliers (MLM) visually. I’ll try to explain in the following paragraph knowing that it won’t make much sense on first reading.
        
        For the Method of Lagrange Multipliers, suppose you have some number of equations in n variables. Consider the n-dimensional space containing the set of all solutions to those equations. The set of solutions describes a k-dimensional manifold (meaning the surface of the manifold forms a k-dimensional space), where k depends on the number of independent equations you have. The set of all points perpendicular to this manifold (the null space, or the space of points that, projected onto the manifold, give the zero vector) can be described by an (n-k)-dimensional space. Any (n-k)-dimensional space can be generated (by vector scaling and vector addition) from (n-k) independent vectors. For the Buckingham Pi Theorem, replace each vector with a matrix/group, vector scaling with exponentiation, and vector addition with multiplication. Your Buckingham Pi exponents are Lagrange multipliers, and your Pi groups are Lagrange perpendicular vectors (the gradient/normal vectors of your constraints/dimensions).
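
One way to see the "scaling becomes exponentiation, addition becomes multiplication" part of the paragraph above is to store each Pi group as its vector of exponents. The sketch below only illustrates that bookkeeping, not the Lagrange multiplier analogy itself. The variable set (F, rho, mu, U, D for drag on a sphere) and the helper names are assumptions made for this example.

```python
from fractions import Fraction

# Assumed example variables: F, rho, mu, U, D (drag on a sphere).
# Rows hold the exponents of mass, length, and time for each variable.
DIMENSION_ROWS = [
    [1, 1, 1, 0, 0],     # M
    [1, -3, -1, 1, 1],   # L
    [-2, 0, -1, -1, 0],  # T
]


def net_dimensions(exponents):
    """Net M, L, T exponents of the product of variables raised to `exponents`."""
    return [sum(d * e for d, e in zip(row, exponents)) for row in DIMENSION_ROWS]


def is_dimensionless(exponents):
    """True when the exponent vector lies in the null space of the matrix."""
    return all(total == 0 for total in net_dimensions(exponents))


def multiply(group_a, group_b):
    """Multiplying two groups adds their exponent vectors."""
    return [a + b for a, b in zip(group_a, group_b)]


def power(group, exponent):
    """Raising a group to a power scales its exponent vector."""
    return [exponent * e for e in group]


REYNOLDS = [0, 1, -1, 1, 1]   # rho * U * D / mu
DRAG = [1, -1, 0, -2, -2]     # F / (rho * U^2 * D^2)

# Any product of powers of the two basis groups is again dimensionless.
combined = multiply(power(REYNOLDS, Fraction(1, 2)), power(DRAG, -3))

print("Reynolds dimensionless:", is_dimensionless(REYNOLDS))
print("drag group dimensionless:", is_dimensionless(DRAG))
print("combined group dimensionless:", is_dimensionless(combined))
print("velocity alone dimensionless:", is_dimensionless([0, 0, 0, 1, 0]))
```

The two basis groups are not unique: any two independent vectors in the same null space would serve, which matches the point that the theorem fixes only how many groups are needed.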
        
        I guess in that sense, I can see why people would make the jump to Lie groups. The Pi Groups / basis vectors generate any other vector in that dimensionless space, and they’re obviously invertible. Honestly, I haven’t spent much time with Lie Groups and Lie Algebra, so I can’t tell you why they’re useful. If my earlier explanation of dimensionless quantities holds (which, after seeing the Buckingham Pi Theorem, I’m even more convinced that it does), then it has something to do with symmetry with respect to scale. The reason I say “scale” as opposed to any other x * x → x quantity is that the scale kind of dimensionlessness seems to pop up in a lot of dimensionless quantities specific to fluid dynamics, including the Reynolds number.
        
        Sorry, I know that didn’t make much sense. I’m pretty sure it will though once you go through the recommendations in my earlier reply.
        
        Regarding the Reynolds number, I suspect you’re not going to see the difference between the dimensional and the dimensionless quantities until you try solving that differential equation at the bottom of the page. Try it both with and without converting to dimensionless quantities, and make sure to keep track of the semantics of each term as you go through the process. Here’s one that’s worked out for the dimensionless case. If you try solving it for the non-dimensionless case, you should see the problem.
        
        It’s getting really late. I’ll go through your comments on similarity variables in a later reply.
        
        Thanks for the references and your comments. I’ve learned a lot from this discussion.
        btrettel 10 Dec 2016 21:49 UTC
        0 points
        Parent
        Glad to help. I’ll go through your recommendations later this month when I have more time.
        [deleted] 12 Dec 2016 13:25 UTC
        1 point
        Parent
        Could you guys cooperate or something and write an intro Discussion or Main post on this for landlubbers? Pretty please?
        
        I have glanced at a very brief introductory article on dim.an. in regards to the Reynolds number when I wondered whether I could model dissemination of fern spores within a ribbon-shaped population, or just simply read about such a model, but it all seemed like so much trouble. And even worse, I had a weird feeling like ‘oh this has to be so noisy, how do they even know how the errors are combined in these new parameters? Surely they don’t just sum.’
        
        (Um, a datapoint from a non-mathy person, I think I’m not alone in this.)
        btrettel 12 Dec 2016 21:16 UTC
        2 points
        Parent
        Sure, I’d be interested in writing an article on dimensional analysis and scaling in general. I might have time over my winter break. It’s also worth noting that I posted on dimensional analysis before. Dimensional analysis is not as popular as principal components analysis, despite being much easier, and I think this is unfortunate.
        
        I don’t know what a “ribbon-shaped population” is, but I imagine that fern spores are blown off by wind and then dispersed by a combination of wind and turbulence. Turbulent dispersion of particles is essentially an entire field by itself. I have some experience in it from modeling water droplet trajectories for fire suppression, so I might be able to help you more, assuming I understand your problem correctly. Feel free to send me a message on here if you’d like help.
        
        And even worse, I had a weird feeling like ‘oh this has to be so noisy, how do they even know how the errors are combined in these new parameters? Surely they don’t just sum.’
        
        Could you explain this a little more? I’m not exactly following.
        
        Because dimensional homogeneity is a requirement for physical models, any series of independent dimensionless variables you construct should be “correct” in a strict sense, but they are not unique, and consequently you might not naively pick “useful” variables. If this doesn’t make sense, then I could explain in more detail or differently.
        [deleted] 14 Dec 2016 16:02 UTC
        1 point
        Parent
        Yes, I remember that post. It was ‘almost interesting’ to me, because it is beyond my actual knowledge. So, if you could just maybe make it less scary, we landlubbers would love you to bits. If you’d like.
        
        I agree about the wind and the turbulence, which is somewhat “dampered” by the prolonged period of spore dissemination and the possibility (I don’t know how real) of re-dissemination of the ones that “didn’t stick” the first time. The thing I am (was) most interested in—how fertilization occurs in the new organisms growing from the spores—is further complicated by the motility of sperm and the relatively big window of opportunity (probably several seasons)… so I am not sure if modeling the dissemination has any value, but still. This part is at least above-ground. It’s really an example of looking for your keys under a lamplight.
        
        re: errors. I mean that it seemed to me (probably wrongly) that if you measure a bunch of variables, and try to make a model from them, then realise you only want a few and the others can be screwed together into a dimensionless ‘thing’, then how do you know the, well, ‘bounds of correctness’ of the dimensionless thing? It was built from imperfect measurements that carried errors in them; where do the errors go when you combine variables into something new? (I mean, it is a silly question, but i haz it.)
        
        (‘ribbon-shaped population’ was my clumsy way of describing a long and narrow, but relatively uninterrupted population of plants that stretches along a certain landscape feature, like a beach. I can’t recall the real word right now.)
        btrettel 15 Dec 2016 22:33 UTC
        2 points
        Parent
        Romashka, I appreciate the reply.
        
        Yes, I remember that post. It was ‘almost interesting’ to me, because it is beyond my actual knowledge. So, if you could just maybe make it less scary, we landlubbers would love you to bits. If you’d like.
        
        If you don’t mind, could you highlight which parts you thought were too difficult?
        
        Aside from adding more details, examples, and illustrations, I’m not sure what I could change. I will have to think about this more.
        
        re: errors. I mean that it seemed to me (probably wrongly) that if you measure a bunch of variables, and try to make a model from them, then realise you only want a few and the others can be screwed together into a dimensionless ‘thing’, then how do you know the, well, ‘bounds of correctness’ of the dimensionless thing? It was built from imperfect measurements that carried errors in them; where do the errors go when you combine variables into something new? (I mean, it is a silly question, but i haz it.)
        
        This is an important question to ask. After non-dimensionalizing the data and plotting it, if there aren’t large gaps in the coverage of any dimensionless independent variable, then you can just use the ranges of the dimensionless independent variables.
        
        I could add some plots showing this more obviously in a discussion post.
        
        Here are some example correlations from heat transfer. Engineers did heat transfer experiments in pipes and measured the heat flux as a function of different velocities. They then converted heat flux into the Nusselt number and the velocity/pipe diameter/viscosity into the Reynolds number, and had another term called the Prandtl number. There are plots of these experiments in the literature and you can see where the data for the correlation starts and ends. As you do not always have a clear idea of what happens outside the data (unless you have a theory), this usually is where the limits come from.
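
On the narrower question of where the measurement errors go: the sketch below uses standard first-order error propagation, which is an addition here and not something taken from the correlations mentioned above. It assumes the errors are small and independent, in which case the relative error of a product of powers is the square root of the sum of the squared (exponent times relative error) terms. The measured values and error sizes are invented for illustration.

```python
import math

# Reynolds number Re = rho * U * D / mu, written as exponents on each variable.
EXPONENTS = {"rho": 1, "U": 1, "D": 1, "mu": -1}

# Invented measurements in SI units and their relative (fractional) errors.
VALUES = {"rho": 998.0, "U": 1.0, "D": 0.05, "mu": 1.0e-3}
RELATIVE_ERRORS = {"rho": 0.001, "U": 0.02, "D": 0.005, "mu": 0.03}


def power_product(values, exponents):
    """Evaluate the product of each variable raised to its exponent."""
    result = 1.0
    for name, exponent in exponents.items():
        result *= values[name] ** exponent
    return result


def relative_uncertainty(relative_errors, exponents):
    """First-order relative error, assuming small and independent errors."""
    return math.sqrt(
        sum((exponents[name] * relative_errors[name]) ** 2 for name in exponents)
    )


reynolds = power_product(VALUES, EXPONENTS)
rel_err = relative_uncertainty(RELATIVE_ERRORS, EXPONENTS)
naive_sum = sum(RELATIVE_ERRORS.values())

print(f"Re = {reynolds:.0f} with relative error of about {100 * rel_err:.1f}%")
print(f"straight sum of relative errors would be {100 * naive_sum:.1f}%")
```

Under these assumptions the errors indeed do not simply sum: they combine in quadrature, so the least precise measurement dominates the error of the dimensionless number. If the errors are correlated or large, this approximation would need to be replaced by something more careful.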
- TheAncientGeek 11 Dec 2016 13:00 UTC
  0 points
  Parent
  
  Bayesian reasoning tells you what to expect given an existing set of beliefs. It doesn’t tell you how to develop those underlying beliefs in the first place.
  
  That’s a very important point, and it is a pity that everyone decided to focus on the narrower point about physics.
  
  There’s a wider point, still, about ontological radicalism, doubling back, paradigm shifts and all that Kuhnian stuff that’s completely missed by emphasising Bayes, and thereby implying that everything is a linear stepwise refinement of models under evidence.
- WhySpace_duplicate0.9261692129075527 5 Dec 2016 17:07 UTC
  0 points
  Parent
  
  group theory / symmetry
  
  The Wikipedia page for group theory seems fairly impenetrable. Do you have a link you’d recommend as a good place to get one’s feet wet in the topic? Same with symmetry.
  
  Thanks!
  - sen 6 Dec 2016 6:19 UTC
    1 point
    Parent
    “Group” is a generalization of “symmetry” in the common sense.
    
    I can explain group theory pretty simply, but I’m going to suggest something else. Start with category theory. It is doable, and it will give you the magical ability of understanding many math pages on Wikipedia, or at least the hope of being able to understand them. I cannot overstate how large an advantage this gives you when trying to understand mathematical concepts. Also, I don’t believe starting with group theory will give you any advantage when trying to understand category theory, and you’re going to want to understand category theory if you’re interested in reasoning.
    
    When I was getting started with category theory, I went back and forth between several pages (Category Theory, Functor, Universal Property, Universal Object, Limits, Adjoint Functors, Monomorphism, Epimorphism). Here are some of the insights that made things click for me:
    
    An “object” in category theory corresponds to a set in set theory. If you’re a programmer, it’s easier to think of a single categorical object as a collection (class) of OOP objects. It’s also valid and occasionally useful to think of a single categorical object as a single OOP object (e.g., a collection of fields).
    A “morphism” in category theory corresponds to a function in set theory. If you think of a categorical object as a collection of OOP objects, then a morphism takes as input a single OOP object at a time.
    It’s perfectly valid for a diagram to contain the same categorical object twice. Diagrams only show relations, and it’s perfectly valid for an OOP object to be related to another OOP object of the same class. When looking at commutative diagrams that seem to contain the same categorical object twice, think of them as distinct categorical objects.
    Diagrams don’t only show relationships between OOP objects. They can also show relationships between categorical objects. For example, a diagram might state that there is a bijection between two categorical objects.
    You’re not always going to have a natural transformation between two functors of the same category.
    When trying to understand universal properties, the following mapping is useful (look at the diagrams on Wikipedia): A is the Platonic Form of Y, U is a fire that projects only some subset of the aspects of being like A.
    The duality between categorical objects and OOP objects is critical to understanding the difference between any diagram and its dual (reversed-morphisms). Recognizing this makes it much easier to understand limits and colimits.
    
    Once you understand these things, you’ll have the basic language down to understand group theory without much difficulty.
- ChristianKl 4 Dec 2016 9:42 UTC
  0 points
  Parent
  If you look at the recent posts, Double Crux is not about Bayesian reasoning.
  
  Discussions about system 1 and system 2 and how to have the two in sync are not about Bayesian reasoning either.
  
  There are also many other topics that are not about Bayes.
  - sen 4 Dec 2016 11:35 UTC
    0 points
    Parent
    I don’t think System 1 and 2 are relevant, since they’re not areas of rationality. It’s the difference between a design and an implementation. I don’t think this thread is about implementation optimizations, and I do see numerous threads on that topic.
    
    Regarding double crux, I actually don’t see that when I browse through the recent threads, even going back several pages. Through the site search, I was able to find another post that links to a November 29th thread, which I think is the one you’re talking about.
    
    Here’s an excerpt from that double crux thread.
    
    Ideally, B is a statement that is somewhat closer to reality than A—it’s more concrete, grounded, well-defined, discoverable, etc. It’s less about principles and summed-up, induced conclusions, and more of a glimpse into the structure that led to those conclusions.
    
    (It doesn’t have to be concrete and discoverable, though—often after finding B it’s productive to start over in search of a C, and then a D, and then an E, and so forth, until you end up with something you can research or run an experiment on).
    
    That’s not out of context. The entire game description and recommendations are written with the focal point of increasing precision and making beliefs more concrete.
    
    I want you to take the time to seriously consider whether you think I’m crazy for thinking that “increasing precision” and “making beliefs more concrete” could possibly be a bad thing when trying to understand how someone thinks. Think about what your gut reaction was when you read that. Think about what alternative there could be. Please don’t read on until you’re sure I’m just trolling so maybe you can see how screwed up this place is.
    
    How about doing the exact opposite? How about making things less precise? How about throwing away useless structure and making it easier to reason by analogy, thereby letting people expose the full brunt of their intuition and experience that really leads to their beliefs? How about making beliefs less concrete, and therefore more abstract, more general, and easier to see relationships in other domains?
    
    If you convince someone that A really might not lead to B and that there are n experiments you could use to tell, whoopee do, they are literally never going to use that again. If you discover that you believe uniforms lead to bullying because you mentally model social dynamics as particle systems, and bullying as a problem that occurs in high-chaos environments, and that uniforms go a long way in cooling the system thereby reducing the chaos and bullying… That’s probably going to stick with you for a while, despite being a complete ungrounded non-sequitur.
- turchin 3 Dec 2016 15:18 UTC
  0 points
  Parent
  I remember there was a poll on LW about how people use Bayes’ theorem in practical life (can’t find a link). There were only a few answers about actual practical usage. There are not many practical situations where it is useful.
  
  But it is good as a symbol of group membership and also in internet discussion.
  
  I also think that EY is not Bayesian sometimes. He often assigns something 100 per cent probability without any empirical evidence, just because of the simplicity and beauty of the theory. For example, that MWI is the correct interpretation of QM. But if you put 0 probability on something (other interpretations), it can’t be updated by any evidence. He also did this when he said that a self-improving paper clip maximizer is the main risk of AI. But there are other risks of AI which are also deadly. (I counted around 100.)
  - TheAncientGeek 11 Dec 2016 12:45 UTC
    2 points
    Parent
    
    But it is good as a symbol of group membership
    
    Is it good to think Bayes is this wonderful summum bonum of rationality, and not even notice how little use you yourself are making of it?
    
    also in internet discussion.
    
    Is it good to come across to someone with a pluralistic understanding of reasoning as a dogmatist?
    
    I also think that EY is not Bayesian sometimes.
    
    Elephant spotted.