Selling Nonapples
Previously in series: Worse Than Random
A tale of two architectures...
Once upon a time there was a man named Rodney Brooks, who could justly be called the King of Scruffy Robotics. (Sample paper titles: “Fast, Cheap, and Out of Control”, “Intelligence Without Reason”). Brooks invented the “subsumption architecture”—robotics based on many small modules, communicating asynchronously and without a central world-model or central planning, acting by reflex, responding to interrupts. The archetypal example is the insect-inspired robot that lifts its leg higher when the leg encounters an obstacle—it doesn’t model the obstacle, or plan how to go around it; it just lifts its leg higher.
In Brooks’s paradigm—which he labeled nouvelle AI—intelligence emerges from “situatedness”. One speaks not of an intelligent system, but rather the intelligence that emerges from the interaction of the system and the environment.
And Brooks wrote a programming language, the behavior language, to help roboticists build systems in his paradigmatic subsumption architecture—a language that includes facilities for asynchronous communication in networks of reflexive components, and programming finite state machines.
My understanding is that, while there are still people in the world who speak with reverence of Brooks’s subsumption architecture, it’s not used much in commercial systems on account of being nearly impossible to program.
Once you start stacking all these modules together, it becomes more and more difficult for the programmer to decide that, yes, an asynchronous local module which raises the robotic leg higher when it detects a block, and meanwhile sends asynchronous signal X to module Y, will indeed produce effective behavior as the outcome of the whole intertwined system whereby intelligence emerges from interaction with the environment...
Asynchronous parallel decentralized programs are harder to write. And it’s not that they’re a better, higher form of sorcery that only a few exceptional magi can use. It’s more like the difference between the two business plans, “sell apples” and “sell nonapples”.
One noteworthy critic of Brooks’s paradigm in general, and subsumption architecture in particular, is a fellow by the name of Sebastian Thrun.
You may recall the 2005 DARPA Grand Challenge for driverless cars. How many ways was this a fair challenge according to the tenets of Scruffydom? Let us count the ways:
The challenge took place in the real world, where sensors are imperfect, random factors intervene, and macroscopic physics is only approximately lawful.
The challenge took place outside the laboratory—not even on paved roads, but 212km of desert.
The challenge took place in real time—continuous perception, continuous action, using only computing power that would fit on a car.
The teams weren’t told the specific race course until 2 hours before the race.
You could write the code any way you pleased, so long as it worked.
The challenge was competitive: The prize went to the fastest team that completed the race. Any team which, for ideological reasons, preferred elegance to speed—any team which refused to milk every bit of performance out of their systems—would surely lose to a less principled competitor.
And the winning team was Stanley, the Stanford robot, built by a team led by Sebastian Thrun.
How did he do it? If I recall correctly, Thrun said that the key was being able to integrate probabilistic information from many different sensors, using a common representation of uncertainty. This is likely code for “we used Bayesian methods”, at least if “Bayesian methods” is taken to include algorithms like particle filtering.
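As a toy illustration of what a common representation of uncertainty buys you (a minimal sketch of the general idea, not Stanley’s actual code; the sensor names and noise figures below are invented), two noisy readings of the same quantity can be fused just by treating both as Gaussians:

```python
# A minimal sketch of Bayesian sensor fusion under a shared Gaussian
# representation of uncertainty. The sensors, readings, and noise levels
# below are hypothetical, purely for illustration.

def fuse(mean_a, var_a, mean_b, var_b):
    """Combine two independent Gaussian estimates of the same quantity.

    The result is the precision-weighted average; its variance is never
    larger than either input's, which is why a shared probabilistic
    representation makes it cheap to add more sensors.
    """
    precision = 1.0 / var_a + 1.0 / var_b
    mean = (mean_a / var_a + mean_b / var_b) / precision
    return mean, 1.0 / precision

# Hypothetical example: estimating distance to an obstacle, in meters.
lidar_estimate = (10.2, 0.05)   # accurate, but occasionally drops out
camera_estimate = (9.6, 0.50)   # noisy, but always available

mean, var = fuse(*lidar_estimate, *camera_estimate)
print(f"fused estimate: {mean:.2f} m, variance {var:.3f}")
```

Adding a third or fourth sensor is just another call to the same function, which is the sense in which a common probabilistic representation makes sensor integration easy.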
And to heavily paraphrase and summarize some of Thrun’s criticisms of Brooks’s subsumption architecture:
Robotics becomes pointlessly difficult if, for some odd reason, you insist that there be no central model and no central planning.
Integrating data from multiple uncertain sensors is a lot easier if you have a common probabilistic representation. Likewise, there are many potential tasks in robotics—in situations as simple as navigating a hallway—where you can end up in two possible situations that look highly similar and have to be distinguished by reasoning about the history of the trajectory.
To be fair, it’s not as if the subsumption architecture has never made money. Rodney Brooks is the founder of iRobot, and I understand that the Roomba uses the subsumption architecture. The Roomba has no doubt made more money than was won in the DARPA Grand Challenge… though the Roomba might not seem quite as impressive...
But that’s not quite today’s point.
Earlier in his career, Sebastian Thrun also wrote a programming language for roboticists. Thrun’s language was named CES, which stands for C++ for Embedded Systems.
CES is a language extension for C++. Its types include probability distributions, which makes it easy for programmers to manipulate and combine multiple sources of uncertain information. And for differentiable variables—including probabilities—the language enables automatic optimization using techniques like gradient descent. Programmers can declare ‘gaps’ in the code to be filled in by training cases: “Write me this function.”
As a result, Thrun was able to write a small, corridor-navigating mail-delivery robot using 137 lines of code, and this robot required less than 2 hours of training. As Thrun notes, “Comparable systems usually require at least two orders of magnitude more code and are considerably more difficult to implement.” Similarly, a 5,000-line robot localization algorithm was reimplemented in 52 lines.
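To make the idea of ‘gaps’ filled in by training cases concrete, here is a minimal sketch in ordinary Python rather than CES (whose actual syntax I am not reproducing); the one-parameter ‘gap’ and the training cases are invented for illustration:

```python
# A sketch of the "declare a gap, let training fill it in" idea, in plain
# Python rather than CES. We declare a parameterized hole (here just a
# one-parameter linear map) and let gradient descent fill it in from examples.

def make_gap():
    """Return a trainable function f(x) = w * x with an initially unknown w."""
    state = {"w": 0.0}

    def f(x):
        return state["w"] * x

    def train(examples, rate=0.01, steps=1000):
        # Plain gradient descent on squared error over the training cases.
        for _ in range(steps):
            for x, target in examples:
                error = f(x) - target
                state["w"] -= rate * error * x
        return state["w"]

    return f, train

# Hypothetical training cases: "write me the function that doubles its input."
steering_gain, train_gain = make_gap()
train_gain([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(steering_gain(5.0))   # approximately 10.0 after training
```

The point is not this particular optimizer but the division of labor: the programmer declares the shape of the hole and supplies examples, and the optimizer does the rest.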
Why can’t you get that kind of productivity with the subsumption architecture? Scruffies, ideologically speaking, are supposed to believe in learning—it’s only those evil logical Neats who try to program everything into their AIs in advance. Then why does the subsumption architecture require so much sweat and tears from its programmers?
Suppose that you’re trying to build a wagon out of wood, and unfortunately, the wagon has a problem, which is that it keeps catching on fire. Suddenly, one of the wagon-workers drops his wooden beam. His face lights up. “I have it!” he says. “We need to build this wagon from nonwood materials!”
You stare at him for a bit, trying to get over the shock of the new idea; finally you ask, “What kind of nonwood materials?”
The wagoneer hardly hears you. “Of course!” he shouts. “It’s all so obvious in retrospect! Wood is simply the wrong material for building wagons! This is the dawn of a new era—the nonwood era—of wheels, axles, carts all made from nonwood! Not only that, instead of taking apples to market, we’ll take nonapples! There’s a huge market for nonapples—people buy far more nonapples than apples—we should have no trouble selling them! It will be the era of the nouvelle wagon!”
The set “apples” is much narrower than the set “not apples”. Apples form a compact cluster in thingspace, but nonapples vary much more widely in price, and size, and use. When you say to build a wagon using “wood”, you’re giving much more concrete advice than when you say “not wood”. There are different kinds of wood, of course—but even so, when you say “wood”, you’ve narrowed down the range of possible building materials a whole lot more than when you say “not wood”.
In the same fashion, “asynchronous”—literally “not synchronous”—is a much larger design space than “synchronous”. If one considers the space of all communicating processes, then synchrony is a very strong constraint on those processes. If you toss out synchrony, then you have to pick some other method for preventing communicating processes from stepping on each other—synchrony is one way of doing that, a specific answer to the question.
Likewise “parallel processing” is a much huger design space than “serial processing”, because serial processing is just a special case of parallel processing where the number of processors happens to be equal to 1. “Parallel processing” reopens all sorts of design choices that are premade in serial processing. When you say “parallel”, it’s like stepping out of a small cottage, into a vast and echoing country. You have to stand someplace specific, in that country—you can’t stand in the whole place, in the noncottage.
So when you stand up and shout: “Aha! I’ve got it! We’ve got to solve this problem using asynchronous processes!”, it’s like shouting, “Aha! I’ve got it! We need to build this wagon out of nonwood! Let’s go down to the market and buy a ton of nonwood from the nonwood shop!” You’ve got to choose some specific alternative to synchrony.
Now it may well be that there are other building materials in the universe than wood. It may well be that wood is not the best building material. But you still have to come up with some specific thing to use in its place, like iron. “Nonwood” is not a building material, “sell nonapples” is not a business strategy, and “asynchronous” is not a programming architecture.
And this is strongly reminiscent of—arguably a special case of—the dilemma of inductive bias. There’s a tradeoff between the strength of the assumptions you make, and how fast you learn. If you make stronger assumptions, you can learn faster when the environment matches those assumptions well, but you’ll learn correspondingly more slowly if the environment matches those assumptions poorly. If you make an assumption that lets you learn faster in one environment, it must always perform more poorly in some other environment. Such laws are known as the “no-free-lunch” theorems, and the reason they don’t prohibit intelligence entirely is that the real universe is a low-entropy special case.
Programmers have a phrase called the “Turing Tarpit”; it describes a situation where everything is possible, but nothing is easy. A Universal Turing Machine can simulate any possible computer, but only at an immense expense in time and memory. If you program in a high-level language like Python, then—while most programming tasks become much simpler—you may occasionally find yourself banging up against the walls imposed by the programming language; sometimes Python won’t let you do certain things. If you program directly in machine language, raw 1s and 0s, there are no constraints; you can do anything that can possibly be done by the computer chip; and it will probably take you around a thousand times as much time to get anything done. You have to do, all by yourself, everything that a compiler would normally do on your behalf.
Usually, when you adopt a program architecture, that choice takes work off your hands. If I use a standard container library—lists and arrays and hashtables—then I don’t need to decide how to implement a hashtable, because that choice has already been made for me.
Adopting the subsumption paradigm means losing order, instead of gaining it. The subsumption architecture is not-synchronous, not-serial, and not-centralized. It’s also not-knowledge-modelling and not-planning.
This absence of solution implies an immense design space, and it requires a correspondingly immense amount of work by the programmers to reimpose order. Under the subsumption architecture, it’s the programmer who decides to add an asynchronous local module which detects whether a robotic leg is blocked, and raises it higher. It’s the programmer who has to make sure that this behavior plus other module behaviors all add up to an (ideologically correct) emergent intelligence. The lost structure is not replaced. You just get tossed into the Turing Tarpit, the space of all other possible programs.
On the other hand, CES creates order; it adds the structure of probability distributions and gradient optimization. This narrowing of the design space takes so much work off your hands that you can write a learning robot in 137 lines (at least if you happen to be Sebastian Thrun).
The moral:
Quite a few AI architectures aren’t.
If you want to generalize, quite a lot of policies aren’t.
They aren’t choices. They’re just protests.
Added: Robin Hanson says, “Economists have to face this in spades. So many people say standard econ has failed and the solution is to do the opposite—non-equilibrium instead of equilibrium, non-selfish instead of selfish, non-individual instead of individual, etc.” It seems that selling nonapples is a full-blown Standard Iconoclast Failure Mode.
I think the reason I find Brooks’s ideas interesting is that they seem to mirror the way that natural intelligences came about.
Biological evolution seems to amount to nothing more than local systems adapting to survive in an environment, and then aggregating into more complex systems. We know that this strategy has produced intelligence at least once in the history of the universe, and thus it seems to me a productive example to follow in attempting to create artificial intelligence as well.
Now, I don’t know what the state of the art for the emergent AI school of thought is at the moment, but isn’t it possible that the challenge isn’t solving each of the little problems that feedback loops can help overcome, but rather enfolding the lessons learned by these simple systems into more complex aggregate systems?
That being said, you may be right, it may be easier (at this point) to program AI systems to narrow their search field with information about probability distributions and so forth, but could it not be that this strategy is fundamentally limited in the same way that expert systems are limited? That is, the system is only as “smart” as the knowledge base (or probability distributions) allow it to become, and they fail as “general” AI?
Do not copy the Blind Idiot God, for it lives much longer than you, and is a blind idiot.
I hope I live to see a world where synchronous computing is considered a quaint artifact of the dawn of computers. Cognitive bias has prevented us from seeing the full extent of what can be done with this computing thing. A limit on feasible computability (limited by our own brain capacity) that has existed for millions of years, shaping the way we assume we can solve problems in our world, is suddenly gone. We’ve made remarkable progress in a short time; I can’t wait to see what happens next.
Was the DARPA Grand Challenge winner written using CES or a successor? I see no mention of it in the DARPA paper.
If not, why not? Perhaps neither of these approaches is good in the real world.
I am also guilty of wanting to toss people back to the Turing Tarpit to get to AI, but I don’t advocate staying there for long. I just think we have the wrong foundation for resource management and have to redo security and resource allocation at the architectural level. Then rebuild a more adaptive system from there. I have a few ideas, and they do have a fair amount of centralized modeling. But those methods of centralized modeling should be able to be changed in extremis, if what we thought about the world was wrong.
Think about it this way, you advocate sometimes radically rethinking your fundamental ways of conceiving the world, should we not allow our AI systems to be able to sometimes do the same, rather than constrain them with our preconceptions for eternity?
Economists have to face this in spades. So many people say standard econ has failed and the solution is to do the opposite—non-equilibrium instead of equilibrium, non-selfish instead of selfish, non-individual instead of individual, etc. And of course as you point out the problem is that just doesn’t say very much.
Tilden is another roboticist who’s gotten rich and famous off of unintelligent robots: BEAM robotics
Robin, if you say that’s true in economics too, then this is probably a full-blown Standard Iconoclast Failure Mode.
I wonder if the situation in computer programming is just an especially good illustration of it, because the programmer actually does have to reimpose order somehow afterward—you get to see the structure lost, the tarpit, and the effort. Brooks wrote real programs in his expanded design space and even made a buck off it, so we should much more strongly criticize someone who merely advocates “non-equilibrium economics” without saying which kind of disequilibrium.
My understanding is that, while there are still people in the world who speak with reverence of Brooks’s subsumption architecture, it’s not used much in commercial systems on account of being nearly impossible to program.
I once asked one of the robotics guys at IDSIA about subsumption architecture (he ran the German team that won the robo-soccer world cup a few years back) and his reply was that people like it because it works really well and is the simplest way to program many things. At the time, all of the top teams used it as far as he knew.
(p.s. don’t expect follow-up replies on this topic from me as I’m currently in the middle of nowhere using semi-functional dial-up...)
blinks at Shane
Okay, I’ll place my data in a state of expected-instability-pending-further-evidence. This doesn’t match what I’ve heard/found in my own meager investigations. Or maybe it works for a Roomba but not automated vehicles.
I don’t get this post. There is no big mystery to asynchronous communication—a process looks for messages whenever it is convenient for it to do so, very much like we check our mail-boxes when it is convenient for us. Although it is not clear to me how asynchronous communication helps in building an AI, I don’t see any underspecification here. And if people (including Brooks) have actually used the architecture for building robots, that at least must be clear proof that there is a real architecture here.
Btw, from my understanding, Thrun’s team made heavy use of supervised learning—the same paradigm that Eliezer knocked down as being unFriendly in his AI risks paper.
I DO get this post—I understand, and agree with the general concept, but I think Venu has a point that asynchronous programming is a bad example… although it LITERALLY means only “non-synchronous”, in practice it refers to a pretty specific type of alternative programming methodology… much more particular than just the set of all programming methodologies that aren’t synchronous.
Demanding nonapples is the standard response of voters to the failure of the governing Apple party.
Hi, I’m a PC; would you like to buy a non-Apple Dell, Lenovo, or HP?
I think subsumption is still popular amongst students and hobbyists.
It does raise an interesting “mini Friendliness” issue… I’m not really comfortable with the idea of subsumption software systems driving cars around on the road. Robots taking over the world may seem silly to most of the public but there are definite decisions to be made soon about what criteria we should use to trust complex software systems that make potentially deadly decisions. So far I think there’s a sense that the rules robots should use are completely clear, the situations enumerable—because of the extreme narrowness of the task domains. As tasks become more open ended, that may not be true for much longer.
@ Venu: Modern AI efforts are so far from human-level competence, that Friendly vs. Unfriendly doesn’t really matter yet. Eliezer is concerned about having a Friendly foundation for the coming Singularity, which starts with human-level AIs. A fairly stupid program (compared to humans) that merely drives a car, just doesn’t have the power to be a risk in the sense that Eliezer worries about.
It could still kill people if it’s not programmed correctly.
This seems like a good reason to understand the program well.
@Don: Eliezer says in his AI risks paper, criticising Bill Hibbard, that one cannot use supervised learning to specify the goal system for an AI. And although he doesn’t say this in the AI risks paper (contra what I said in my previous comment), I remember him saying somewhere (was it on a mailing list?) that supervised learning as such is not a reliable component to include in a Friendly AI. (I may be wrong in attributing this to him, however.) I feel this criticism is misguided, as any viable proposal for (Friendly or not) AI will have to be built out of modules which are not smart enough to be Friendly themselves. And supervised learning sure seems like a handy module to have—it clusters highly variable lower-level sensory input into more stable higher-level objects, and its usefulness has been demonstrated by the heavy use of it by Thrun’s team.
Isn’t supervised learning the current method of achieving friendly natural intelligence?
(Most insane psychopaths had bad parents, didn’t they?)
Yes, because we get to start with a prefabricated Friendliness-compatible architecture.
Probably yes, but that doesn’t distinguish “bad parenting” from “psychopath genes”.
Non-insane psychopaths also exist (e.g. a significant fraction of wealthy businessmen), and while I don’t have data on their childhood experiences, it seems pretty likely that they did not grow up in abusive or neglectful environments.
(I agree with your point about Friendliness-compatible architecture, though.)
Is there actually evidence for this?
Did you follow the link I gave right there in the part you quoted?
Yes, the link is to a blurb for a sensationalist book on Amazon. That’s not evidence.
You don’t happen to believe everything you read, do you?
No, but I did happen to read the citations in the back of the book. (Unfortunately, I borrowed the book from the library, so if you want me to post said citations, you’ll have to wait until this Thursday.)
It’s not that great of a book, on the whole (from what I remember of it, the author spends some time talking about Scientology), but the information about psychopathy, at least, mostly appears accurate.
Here’s another link, which points to quite a body of research: http://bud-meyers.blogspot.com/2012/03/study-10-on-wall-street-are-psychopaths.html
Sorry, don’t see a “body of research”. I see a lot of handwaving, name-calling, and righteous indignation.
Specific numbers? Diagnostic criteria by which “a significant fraction of wealthy businessmen” was declared to be sociopaths by their psychotherapists or psychiatrists?
[....] The psychopathic businessman
In the past decade, the topic of psychopathy in business settings has similarly attracted increasing attention. Although such influential authors as Hervey Cleckley, David Lykken, Paul Babiak and Robert Hare have described vivid case examples of ruthless but prosperous businessmen who exhibited marked features of psychopathy, formal research on the implications of psychopathy in the workplace has been lacking – until recently.
Recent work indicates that psychopathy is related to the use of hard negotiation tactics (e.g. threats of punishment: Jonason et al., 2012), bullying (Boddy, 2011), counterproductive workplace behaviour (e.g. theft by employees: O’Boyle et al., 2011), and poor management skills (Babiak et al., 2010). Although these results suggest that psychopathy has a marked ‘dark side’ in the workplace, there may be more to the story. Some authors have speculated that some psychopathic traits, such as charisma and interpersonal dominance, may contribute to effective leadership and management, at least in the short term (Babiak & Hare, 2006; Boddy et al., 2010; Furnham, 2007). Nevertheless, questions remain regarding the long-term effectiveness of such traits, with some suspecting that psychopathic traits tend eventually to be destructive.
Recent research tentatively supports the view that psychopathy can be a double-edged sword in business settings. For example, data using the PCL-R show that psychopathic individuals are viewed as good communicators, strategic thinkers and innovators in the workplace (Babiak et al., 2010). More recently, unpublished research from our own lab has further elucidated the potential dual implications of psychopathy for workplace behaviour and leadership. In a sample of 312 North American community members, subdimensions of psychopathy, as measured by the PPI-R, were differentially related to leadership styles and counterproductive workplace behaviour. Specifically, Fearless Dominance was positively associated with adaptive leadership styles and minimally related to counterproductive workplace behaviour and maladaptive leadership styles. In contrast, Self-Centered Impulsivity was positively related to counterproductive workplace behaviour and negatively associated with adaptive leadership styles. In addition, individuals high on Fearless Dominance held more leadership positions over their lifetime than did other individuals.
Although preliminary, these findings raise intriguing questions about the varied implications of psychopathic traits in the business world. Charisma, fearlessness, and willingness to take calculated business risks may predispose to business and leadership success. In contrast, certain features associated with psychopathy, such as impulsivity and lack of empathy, may do the opposite. [......]
References
Babiak, P. & Hare, R.D. (2006). Snakes in suits: When psychopaths go to work. New York: HarperCollins. Babiak, P., Neumann, C.S. & Hare, R.D. (2010). Corporate psychopathy: Talking the walk. Behavioral Sciences & the Law, 28(2), 174–193. Benning, S.D., Patrick, C.J., Hicks, B.M. et al. (2003). Factor structure of the psychopathic personality inventory. Psychological Assessment, 15, 340–350. Boddy, C.R. (2011). The corporate psychopaths theory of the global financial crisis. Journal of Business Ethics, 102(2), 255–259. Boddy, C.R., Ladyshewsky, R.K. & Galvin, P. (2010). The influence of corporate psychopaths on corporate social responsibility and organizational commitment to employees. Journal of Business Ethics, 97(1), 1–19. Brodie. F.M. (1967). The devil drives: A life of Sir Richard Burton. New York: Norton. Burton, I. (1893). The life of Captain Sir Richard F. Burton KCMG, FRGS (Vols. 1 and 2). London: Chapman and Hall. Cale, E.M. & Lilienfeld, S.O. (2002). Histrionic personality disorder and antisocial personality disorder: Sex-differentiated manifestations of psychopathy? Journal of Personality Disorders, 16(1), 52–72. Cleckley, H. (1982). The mask of sanity (6th edn). St Louis, MO: Mosby. (First published 1941) Connelly, B.S. & Ones, D.S. (2010). An other perspective on personality: Meta-analytic integration of observers’ accuracy and predictive validity. Psychological Bulletin, 136(6), 1092. Falkenbach, D. & Tsoukalas, M. (2011, May). Can adaptive traits be observed in hero populations? Poster presented at the Biennial Meeting of the Society for the Scientific Study of Psychopathy, Montreal, Canada. Fowles, D.C. & Dindo, L. (2009). Temperament and psychopathy. Current Directions in Psychological Science, 18(3), 179–183. Furnham, A. (2007). Personality disorders and derailment at work: The paradoxical positive in?uence of pathology in the workplace. In J. Langan-Fox et al. (Eds.) Research companion to the dysfunctional workplace. Northhampton, MA: Edward Elgar. Goodwin, D.K. (2013). The bully pulpit. New York: Simon & Schuster. Hall, J.R. & Benning, S.D. (2006). The ‘successful’ psychopath. In C.J. Patrick (Ed.) Handbook of psychopathy (pp.459–475). New York: Guilford. Hare, R.D. (1991). The Hare Psychopathy Checklist-Revised. Toronto: Multi-Health Systems. Hicks, B.M., Markon, K.E., Patrick, C.J., et al. (2004). Identifying psychopathy subtypes on the basis of personality structure. Psychological assessment, 16(3), 276–288. Hogan, R., Raskin, R. & Fazzini, D. (1990). The dark side of leadership. In K.E. Clark & M.B. Clark (Eds.) Measures of leadership. West Orange, NJ: Leadership Library of America. Jonason, P.K., Slomski, S. & Partyka, J. (2012). The dark triad at work: How toxic employees get their way. Personality and Individual Differences, 52(3), 449–453. Judge, T.A. & LePine, J.A. (2007). 20 The bright and dark sides of personality: Implications for personnel selection in individual and team contexts. In J. Langan-Fox et al. (Eds.) Research companion to the dysfunctional workplace. Northhampton, MA: Edward Elgar. Judge, T.A., Piccolo, R.F. & Kosalka, T. (2009). The bright and dark sides of leader traits. The Leadership Quarterly, 20(6), 855–875. Kiehl, K. & Lushing, J. (2014). Psychopathy. In Scholarpedia. Retrieved from http://scholarpedia.org/article/psychopathy Lilienfeld, S.O., Patrick, C.J., Benning, S.D. et al. (2012). The role of fearless dominance in psychopathy: Confusions, controversies, and clarifications. 
Personality Disorders: Theory, Research, and Treatment, 3(3), 327–340. Lilienfeld, S.O., Waldman, I.D., Landfield, K. et al. (2012). Fearless dominance and the US presidency. Journal of Personality and Social Psychology, 103(3), 489–505. Lilienfeld, S.O. & Widows, M.R. (2005). Psychopathic Personality Inventory – Revised: Professional manual. Lutz, FL: Psychological Assessment Resources. Lykken, D.T. (1982, September). Fearlessness: Its carefree charm and deadly risks. Psychology Today, pp.20–28. Lykken, D.T. (1995). The antisocial personalities. Hillsdale, NJ: Erlbaum. Lykken, D.T. (2006). Psychopathic personality: The scope of the problem. In C.J. Patrick (Ed.) Handbook of psychopathy. New York: Guilford. Miller, J.D., Jones, S.E. & Lynam, D.R. (2011). Psychopathic traits from the perspective of self and informant reports: Is there evidence for a lack of insight? Journal of abnormal psychology, 120(3), 758–764. Miller, J.D. & Lynam, D.R. (2012). An examination of the Psychopathic Personality Inventory’s nomological network. Personality Disorders: Theory, Research, and Treatment, 3, 305–326. Miller, J.D., Rauscher, S. Hyatt, C.S. et al. (2013). Examining the relations among pain tolerance, psychopathic traits, and violent and non-violent antisocial behavior. Journal of Abnormal Psychology. doi:10.1037/a0035072 Neumann, C.S., Malterer, M.B. & Newman, J.P. (2008). Factor structure of the Psychopathic Personality Inventory (PPI): Findings from a large incarcerated sample. Psychological Assessment, 20, 169–174. O’Boyle, E.H., Jr, Forsyth, D.R., Banks, G.C. & McDaniel, M.A. (2012). A meta-analysis of the dark triad and work behavior: A social exchange perspective. Journal of Applied Psychology, 97(3), 557–579. Santosa, C.M., Strong, C.M., Nowakowska, C. et al. (2007). Enhanced creativity in bipolar disorder patients. Journal of Affective Disorders, 100(1), 31–39. Skeem, J.L., Polaschek, D.L., Patrick, C. J. & Lilienfeld, S.O. (2011). Psychopathic personality bridging the gap between scientific evidence and public policy. Psychological Science in the Public Interest, 12(3), 95–162. Smith, S.F., Lilienfeld, S.O., Coffey, K. & Dabbs, J.M. (2013). Are psychopaths and heroes twigs off the same branch? Journal of Research in Personality, 47(5), 634–646. Walsh, A. & Wu, H.H. (2008). Differentiating antisocial personality disorder, psychopathy, and sociopathy. Criminal Justice Studies, 21, 135–152. Widom, C.S. (1977). A methodology for studying noninstitutionalized psychopaths. Journal of ConsultinV
It is a good idea, after posting a comment, to look at the comment to check that it says what you meant it to say, in the way you meant to say it. In this case, you need at the very least to reference the source, replace every carriage return in the source by two, replace the link with one that works, and see if the actual text is what you thought you pasted. (What appears above is truncated.) Most of the references are not referenced in the quoted text and should be cut.
At present, it reads as if you do not intend it to actually be read.
Those who do will find this gem of vacuity:
Leaders have charisma and interpersonal dominance, psychopaths have charisma and interpersonal dominance, therefore… what? And listen to the melody these words are set to:
Not exactly a testable hypothesis, is it?
Was I trying to prove the point that business leaders are definitely psychopaths, or the point that there is a lot of research on the topic?
Have you noticed him handwavey the opposing arguments are?
You just dumped a popular-level article from a psychology magazine with a wall of references. Whatever you were trying to do, it fails to do either of those.
Taking that as “how”, not “him”, I haven’t seen opposing arguments, just a pointing out that the argument for is not well sustained by the sources.
ETA: This is indeed handwaving.
I hate to be the bearer of bad news, but the “drown ’em in bullshit” tactic doesn’t work well on LW...
So you looked up the best available evidence, steelmanned it, and then critiqued it?
If you’re looking for evidence in the sense of a team of experts conducting the appropriate battery of tests on a random sample of wealthy businessmen to estimate the rate of psychopathy, then no, such evidence does not exist. However, there appear to have been studies performed that secretly embedded psychological profile questions into questionnaires given to businessmen and tried to estimate the rate of psychopathy that way. I found one claim by Paul Babiak that up to 1 in 25 business leaders may be psychopaths (as opposed to ~1% of the general population), but I haven’t been able to find the actual study.
Off my own bookshelf, Robert Hare has a chapter on white-collar psychopathy in Without Conscience.
Given the evidence that Babiak is relying on, how high would you estimate the odds of his estimate being within a factor of 4 of the actual frequency?
A fourth of (1 in 25) is ~1%, or about the prevalence cited for psychopathy in the general population, so if we assume the same definition of psychopathy those odds are pretty good, I’d say. They’d only not fall into that range if business leaders are less likely to be psychopaths, which isn’t an absurd proposition but also isn’t one I’ve seen any evidence for.
First you imply that the “actual frequency” is a feature of the territory—I’m not sure about that at all (the underlying “feature” is likely to be a continuum with a semi-arbitrary threshold imposed by a map).
But the real question we’re discussing is not what the absolute frequency is. The real question is whether sociopaths are more frequent in “business leaders” than in general population. I don’t have any hard data, so I have to rely on priors. My prior would be that the frequency of sociopaths would be lower in “business leaders”.
The reasoning for that goes as follows. Not all sociopaths are high-functioning—some (a lot?) are just mean, grumpy, bitchy common people with not high IQ. Low-functioning sociopaths won’t make it into business leaders—they are much more likely to go e.g. into military forces and/or law enforcement.
High-functioning sociopaths are pretty rare, I think. For them the natural habitat would be politics, as that’s where a power-hungry charming liar can get the biggest payoffs. Some will be business leaders—Steve Jobs is the prime example—but I don’t think there will be many of those.
Don’t forget that using psychiatric terms as derogatory labels is standard operating procedure :-) “Idiot” used to be a clinical diagnosis, so was “cretin”, “imbecile”, etc. etc.
But, historically, most of the data on sociopaths comes from the ones who end up in jail or therapy. Apart from the recent research, which you are rejecting on the grounds that it contradicts the older data, and your hunches, on which topic...
.....or ’handwaving” as it’s unsympathetically known.
The issues with psychopaths in the workplace is that they’re very good at finding high-ranking patrons to protect them:
-- Robert Hare, Without Conscience Ch. 8
By the time someone complains about the psychopath, upper management has already gotten an earful about how awful the psychopath’s coworkers are and how badly they stifle his ability to shine. Psychopaths are good manipulators. Hare mentions how even he, one of the foremost experts on psychopathy, still gets taken in by them.
This is correct. The Psychopathy Checklist does have an arbitrary cutoff point.
All right. Thanks for sharing that.
I’m still curious about Epictetus’ estimate of Babiak’s claim’s accuracy.
Found an actual study.
Data was collected from 203 participants at the management/executive level in seven companies. Paul Babiak does consulting work and was able to get the cooperation of the organizations involved. He was able to carry out actual interviews and had access to personnel files and various performance appraisals. In addition he was able to observe the individuals at work and interview their coworkers. Based on this information he completed the Psychopathy Checklist-Revised on each participant (consulting with Robert Hare where necessary). It was discovered that nine participants had a score of 25 or more, which is associated with psychopathy. This comes out to about 1 in 25.
Note: the average scores on the PCL-R were lower than those found in the general public, but the number of psychopaths was higher.
The paper points out the difficulties involved in getting the necessary cooperation to carry out a large-scale study. A sample size of 203 is rather small to get an accurate result.
Given that he says that “In fact, you could be living with or married to one for 20 years or more and not know that person is a psychopath” and “This makes it almost impossible to distinguish between a genuinely talented team leader and a psychopath” I have to ask what kind of a definition for a “psychopath” is he using.
I suppose that would be one that relies on complex tests administered by a professional.
The Guardian article commits the mortal sin of not naming the study or its year or coauthors, so I can’t be sure about this, but when I search Google Scholar for Paul Babiak, I find this 2013 paper by Babiak et al. (Search its title for the full text; I can’t get the link to behave.)
It seems primarily to be about methodology, and gives means and correlations on its own scale but doesn’t venture a conversion to more conventional measures; but when you get right down to what it’s doing, it’s based on anonymous assessments of respondents’ bosses collected through Amazon’s Mechanical Turk, each consisting of 20 questions on a 5-point scale. If this is the method that the study behind the Guardian article is using, I’m very skeptical of its diagnostic validity. Among other things.
“A fairly stupid program (compared to humans) that merely drives a car, just doesn’t have the power to be a risk in the sense that Eliezer worries about.”
Well, not a significant one, anyway. Perhaps a very, very tiny one. Programs powering robot maids such as are being developed in Japan are a higher risk, as the required forms of intelligence are closer to human forms, and thus probably closer to general intelligence.
And again, they could cause (small-scale) human death, if they fail to feed a baby or hug someone too hard, etc.
The market has no central processing unit—do your arguments against asynchronous parallel decentralized programs work equally well regarding the market? True, the market doesn’t always lift its leg when I’d like it to or where I’d like it to, but it does seem to get along in a decephalic fashion.
I think the strength of Brooks’s method of robotics is that it allows for another option. Some functions of robotics might be much better off being reflexive: encounter object, lift leg. Some functions might be better if they followed the old model: encounter object, take a picture, analyze picture, take sample from object, analyze sample, etc. I foresee more of both in the future. A rocket lands on a distant world and a swarm of small, reflexive robots pours out. They find interesting things for the more ‘smart’ robots to look at. Or imagine a robot that rolls along mostly on reflex (nice when you have to suddenly pull away from a cliff edge) that can also do analysis of moving targets or disarm land mines or whatever.
But there are central planners in markets, like the Federal Reserve and the World Bank. Admittedly they don’t have anywhere near as much executive power as the human prefrontal cortex has over the human body; but they do act in the same basic executive function.
Moreover, where this fails (e.g. Somalia) it doesn’t fare too well for the functioning of markets.
Trevor Blake misses the point. Of course there are some good design spots in asynchronous space, it’s a huge space, there are bound to be. If you can find them, more power to you.
Venu: And if people (including Brooks) have actually used the architecture for building robots, that at least must be clear proof that there is a real architecture here.
The problem is, almost any AI idea you can think of can be made to work on a few examples, so working on a few examples isn’t evidence that there’s anything there. The deadly words are “now we just have to scale it up”.
derekz: I think subsumption is still popular amongst students and hobbyists.
That is, popular amongst people repeating for themselves the toy examples that other people have used it for already.
I’m surprised nobody put this problem in terms of optimization and “steering the future” (including Eliezer, though I suppose he might have tried to make a different point in his post).
As I see it, robots are a special case of machines intended to steer things in their immediate vicinity towards some preferred future. (The special case is that their acting parts and steering parts are housed in the same object, which is not terribly important, except that the subsumption architecture implies it.)
“Smart” robots have a component analogous to a cortex, which gathers sensory info, models the world, makes plans and directs the acting parts to do things. (This is true even if this component is just 50 lines of code compiled by a smart compiler.) The subsumption-based robots just contain many parts that are hard-wired with “reflexes” for every little connection between them, in such a way that “it just happens” that the ensemble acts intelligently, i.e. steers the future towards some intended subspace.
The fallacy I think is that the “just happens” part is not true. Some process has to generate “optimization rules”; even the very simple reflexes between the components have to be correct (meaning that in the real world steer the future towards something). Subsumption architecture fans will look at something like ants, notice that each element has very simple rules but that the colony works extremely well, and will say “Hey, those are simple rules, I can understand them, I bet I could create another set of simple rules to do what I want. I don’t need to build a complex brain.”
The problem is that “set of simple rules” is not the same as “simple set of rules”: in the ants’ case, millions of years of evolution led to a very complicated set of very simple rules that work well. A subsumption-oriented programmer can only build a simple set of simple rules; a complex one won’t work.
To give an example, take the “move leg higher” case:
*) In a few million years of evolution, creatures whose legs went higher up when hitting obstacles (say, because of random interactions between unrelated cells) got to eat more and be eaten less; so genetic programs for legged creatures tend to contain rules like that.
*) A subsumption-oriented programmer may notice that a leg that hits something should lift higher (either because (a) he thought about it or because (b) he noticed that’s what ants or caterpillars do).
*) A general AI researcher might think about a brain smart enough to decide it should lift a leg over obstacles it wants to cross.
Of course the first programmer will notice that his solution to the “going over obstacles” problem is simpler than building a brain, and it would seem more efficient.
But it’s not more efficient, because (in case a) his very complex, general Natural Intelligence brain thought about it, or (in case b) millions of years of evolution caused this particular adaptation (and his general NI noticed it was useful). There’s also the problem that there are thousands of other things even a simple organism does, and thus thousands (or more) of reflexes to add. Either he’ll try to program them (case a). This is steering the future via a Rube Goldberg machine; it can work if you put enormous resources into it, but most likely it will blow up in your face (or drop an anvil on you).
Or he’ll turn to evolving algorithms (simulating case b). That would probably work (it did until now), but he probably doesn’t have enough resources (natural evolution is a very inefficient optimizer). And even when he does, he won’t understand how it works.
Or he might notice that even nature found central-processing useful, several times (pretty much everything above worms has a brain of some kind, and brains of arthropods, cephalopods and vertebrates evolved separately). So he’ll turn back to centralized programming: hard as it is, it’s a more efficient investment than the alternative. Note that this doesn’t mean that you need fully-general AI to drive a car. You need all tools you can get your hands on. Saying “reflexes are enough” is like saying “I don’t need no tools, rocks are good enough”.
The idea is similar to swarm logic or artificial life, where the concept is to program a single agent that behaves naturally, then put a bunch of them together and get natural behavior. The idea of emergent behavior from smaller parts is of great interest in the defense industry.
Stanley used a Markov model.
[ad hominems / noncontent deleted]
vendor a: selling apples on wood carts isn’t making as much money as I hoped.
vendor b: maybe we should sell nonapples on nonwood carts.
a: that’s just silly. Which convinces me that we should continue selling non-nonapples on non-nonwood.
… ie, the opposite of stupidity is rarely intelligence, but the opposite of the opposite of stupidity never is.
Human intelligence arose out of a neurological Turing tarpit. It is reasonable to imagine that designing intelligence in less time will take tricks—ones which Mother Nature didn’t use—to get out of the tarpit. It is not reasonable to imagine that there is one magic trick which gets us all the way there. So saying “my trick is better than your trick” is interesting, but it doesn’t mean that the final answer won’t need both or neither.
In fact, as soon as it comes to actual wagon design, “nonwood” can be a useful step. If you are building wagons, and there are 5 wagon stores in town, one of which is the “nonwood” store and has the generally lowest-quality—but still functioning—wagons, and you only have time to visit 2 stores for inspiration, which would you visit? Would you be happy to have the nonwood designs all in one store, or would you rather use some classification system (number of spokes) which split them across several stores? (That is, until metal is the new wood.) Any actual “nonwood” wagon is in fact made of something.
But there isn’t a nonwood store. There is an iron store, and a corn store, and a water store. All these are nonwood, but only one of them is going to make much of a wagon.
I was talking about going to the nonwood-wagon store (ie, looking at nontraditional AI projects), not about going to the nonwood store (ie, writing your AI from scratch in machine code).
Classic… delete the context and throw wild accusations. Good to know you’re so mature.
Eliezer, you’d have done better to ignore ReadABook’s trash. Hir ignorance of your arguments and expertise was obvious.
I don’t know anything about the specific AI architectures in this post, but I’ll defend non-apples. If one area of design-space is very high in search ordering but very low in preference ordering (ie a very attractive looking but in fact useless idea), then telling people to avoid it is helpful beyond the seemingly low level of optimization power it gives.
A metaphor: religious beliefs constitute a very small and specific area of beliefspace, but that area originally looks very attractive. You could spend your whole life searching within that area and never get anywhere. Saying “be atheist!” provides a trivial amount of optimization power. But that doesn’t mean it’s of trivial importance in the search for correct beliefs. Another metaphor: if you’re stuck in a ditch, the majority of the effort it takes to journey a mile will be the ten vertical meters it takes to climb to the top.
Saying “not X” doesn’t make people go for all non-X equally. It makes them apply their intelligence to the problem again, ignoring the trap at X that they would otherwise fall into. If the problem is pretty easy once you stop trying to sell apples, then “sell non-apples” might provide most of the effective optimization power you need.
Yvain: It might be a good idea to start looking for marketable non-apples, but you can’t sell generalized non-apples right away. The situation in question is the opposite of what you highlighted: non-apples can act as semantic stopsigns, a Wise and mysterious answer rather than a direction for specific research. Traps of the former domain aren’t balanced out by bewilderment of its negation; it might be easier, or it might be much worse.
Yvain, that’s a fair point. And to the extent that you’ve just got specific bad beliefs infesting your head, “Be atheist!” is operationalizable in a way that “sell nonapples!” is not.
So are you claiming that Brooks’ whole plan was, on a whim, to just do the opposite of what the neats were doing up till then? I thought his inspiration for the subsumption architecture was nature, the embodied intelligence of evolved biological organisms, the only existence proof of higher intelligence we have so far. To me it seems like the neats are the ones searching a larger design space, not the other way around. The scruffies have identified some kind of solution to creating intelligent machines in nature and are targeting a constrained design space inspired by this—the neats on the other hand are trying to create intelligence seemingly out of the platonic world of forms.
“When you say to build a wagon using “wood”, you’re giving much more concrete advice then when you say “not wood”. There are different kinds of wood, of course—but even so, when you say “wood”, you’ve narrowed down the range of possible building materials a whole lot more then when you say “not wood”.”
“When you say to build a wagon using “wood”, you’re giving much more concrete advice THAN when you say “not wood”. There are different kinds of wood, of course—but even so, when you say “wood”, you’ve narrowed down the range of possible building materials a whole lot more THAN when you say “not wood”.”
Returning to the post, I suspect that there is a lack of relevant mathematical theorems in this area. What is needed for example is a theorem which says something like:
“In a sufficiently general environment E the AI planning system must incorporate probabilistic reasoning to satisfy its goals.”
Likewise a theorem characterising which environments require “asynchronous” planning architectures is probably a holy grail in this field.
I realize I am way late for this party, but I would like to make a specific theoretical point about synchronous vs. asynchronous communication. It turns out that, given some components or modules or threads or what-have-you communicating via sending messages to one another, synchronous communication is actually more general than asynchronous in the following technical sense. One can always use synchronous communication to implement asynchronous communication by throwing another module in the middle to act as a mailbox. On the other hand, given only asynchronous communication, it is impossible to implement synchronous communication. Thus, I dearly hope, unlike nazgulnarsil3, that I never see the day where we forget that synchronous communication is strictly more powerful than asynchronous. Moreover, this is not analogous to the wood/nonwood situation where they are entirely different—in fact, in the synchronous/asynchronous situation, one entirely subsumes the other.
This says nothing about the fact that it requires more computing power to encode asynchrony as synchrony (although only linearly more in the number of communication channels). Moreover, it certainly says nothing about central vs. distributed models, which is an orthogonal issue to synchronous vs. asynchronous communication.
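For concreteness, here is a minimal sketch of the mailbox construction in Python (the class names and structure are mine, not from any particular formalism): a rendezvous-style synchronous channel, plus a module in the middle that always completes the rendezvous immediately, so the sender’s communication with the eventual receiver becomes effectively asynchronous.

```python
# A sketch of implementing asynchronous delivery on top of synchronous
# (rendezvous) communication by adding a mailbox module in the middle.
# The class names and structure are illustrative, not a standard library API.
import queue
import threading
import time

class SyncChannel:
    """Rendezvous channel: send() blocks until a receiver has taken the value."""
    def __init__(self):
        self._slot = queue.Queue(maxsize=1)
        self._taken = threading.Semaphore(0)

    def send(self, value):
        self._slot.put(value)   # hand the value over
        self._taken.acquire()   # block until a receiver has taken a value

    def recv(self):
        value = self._slot.get()
        self._taken.release()   # unblock the waiting sender
        return value

class Mailbox:
    """Asynchronous delivery built from synchronous channels plus a middle module."""
    def __init__(self):
        self._inbox = SyncChannel()
        self._buffer = queue.Queue()   # private storage owned by the mailbox module
        threading.Thread(target=self._pump, daemon=True).start()

    def _pump(self):
        while True:
            # The mailbox always stands ready to complete the rendezvous,
            # so senders never have to wait on the eventual receiver.
            self._buffer.put(self._inbox.recv())

    def post(self, value):
        self._inbox.send(value)        # a synchronous send that returns almost at once

    def poll(self):
        """Non-blocking check, like looking in a physical mailbox when convenient."""
        try:
            return self._buffer.get_nowait()
        except queue.Empty:
            return None

if __name__ == "__main__":
    mbox = Mailbox()
    mbox.post("leg blocked")   # the sender moves on without waiting for a receiver
    time.sleep(0.1)            # give the mailbox thread a moment to file the message
    print(mbox.poll())         # -> "leg blocked"
```

The internal buffer is just the mailbox module’s private state; the only communication crossing module boundaries is synchronous.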
Yes, of course asynchronous systems can be encoded in synchronous. But the opposite is true as well. Asynchronous is not “no synchronization”, but “explicit, as-needed synchronization”, and it can encode fully synchronous architectures at the loss of some efficiency (i.e. gating everything to a central clock unit that lets everything step forward one unit at a time).
Disclaimer: I’ve worked on designs for asynchronous hardware.
Point taken, but I think it is a matter of definition. Let me rephrase. It is impossible to implement synchronous communication without some synchronization mechanism, and thus entirely asynchronous systems cannot implement synchronous communication.
This statement may be entirely obvious, but I think that is only because thought has been put into properly stating the issue. More to the point, what I am actually trying to say is that the original post seems to imply that settling on synchronous communication is somehow more limiting than settling on asynchronous communication, and that this is false.
You can FAKE asynchrony with a synchronous system (using threading and such), but you can’t really be asynchronous. If the hardware only processes one step at a time, you can’t make it process two steps at once!
The parent seems to have been discussing (a)synchronous messaging, whereas you seem to be discussing (a)synchronous computation.
I just came across this image and thought it was hilarious in this context: nonaspirin
I was with you until that last part about economics. Behavioral economists ARE working on actual detailed models of human economic decisions that use assumptions other than economic rationality. They use things like hyperbolic discounting (as opposed to just “nonexponential” discounting) and distance-mediated altruism (rather than just “nonselfishness”). So they’re not making nonwagons out of nonwood; they’re trying to make cars out of steel. They haven’t finished yet; but it’s really not fair to expect a new paradigm to achieve in 10 years what the old paradigm did in 200.
Maybe this was meant to go against popular accounts of economics… but frankly most popular accounts of economics are so bad they don’t even rise to this level. They are things like “the Mexicans are stealing our jobs”, “the bourgeoisie oppress the proletariat”, and “taxation is slavery”, which are just so blatantly wrong they can’t even be formulated in terms of serious economic models.
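For concreteness, here is roughly what the two discounting assumptions mentioned above look like side by side (the parameter values are purely illustrative): exponential discounting shrinks value by a fixed factor per period, while the common hyperbolic form falls fast early and slowly later, which is what produces preference reversals over time.

```python
# Illustrative parameters only: delta is the per-period exponential
# discount factor, k the hyperbolic discount rate.
delta, k = 0.95, 0.25
for t in (0, 1, 5, 20):
    exponential = delta ** t        # standard exponential discounting
    hyperbolic = 1 / (1 + k * t)    # one common hyperbolic form
    print(t, round(exponential, 3), round(hyperbolic, 3))
```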
You’re right, but there are also a lot of calls for ‘non-apple’ economics out there too. I try to avoid the sources of that material, but it’s far more common than what you’re referring to. I don’t think Robin (and thus Eliezer, in what he appended to the original post) was referring to behavioral economics as an example of the topic of this post.
There’s a middle ground between pop-econ and the behavioral economists. E.g. Austrian economists and some non-economist social scientists often complain about assumptions of rationality and equilibrium without providing much in the way of an alternative.
ETA: Some Austrians even explicitly criticize behavioral economists.
Apologies for commenting almost a decade after most of the comments here, but this is exactly the same reason why people say that “using nonlinear models is harder but more realistic”.
The way we were taught math led us to believe that linear models form this space of tractable math, and nonlinear models form this somewhat larger space of mostly intractable math. This is mostly right, but the space of nonlinear models is almost infinitely larger than that of linear models. And that is the reason linear models are mathematically tractable: they form such a small space of possible models. Of course nonlinear models don’t have general formulae that always work: they’re just defined as what is NOT linear.
In other words, linear models are severely restricted in the form they can have. When we define another subset of models suitable to the specific thing being modelled, then we will just as easily be able to come up with a set of explicit symbolic formulae. Then it will be just as “tractable” as linear models, even though it’s nonlinear: simply because it has different special properties belonging to its own class of models, obeying something just like the law of linearity.
Actually infinitely larger :-).
Not necessarily. Some useful classes of models will not have the specific nice properties that linear models have.
Thanks :) Can you elaborate a bit? Are you saying that I overreached, and that largely there should be some transformed domain where the model turns out to be simple, but is not guaranteed to exist for every model?
I’m not sure “overreached” is quite my meaning. Rather, I think I disagree with more or less everything you said, apart from the obvious bits :-).
I don’t think it has anything much to do with the size of the space. Linear things are tractable because vector spaces are nice. The only connection between the niceness of linear models and the fact that they form such a small fraction of all possible models is this: any “niceness” property they have is a constraint on the models that have it, and therefore for something to be very “nice” requires it to satisfy lots of constraints, so “nice” things have to be rare. But “nice, therefore rare” is not at all the same as “rare, therefore nice”.
(We could pick out some other set of models, just as sparse as the linear ones, without the nice properties linear models have. They would form just as small a space of possible models, but they would not be as nice to work with as the linear ones.)
If you mean that being nonlinear doesn’t guarantee anything useful, of course that’s right (and this is the same point about “nonapples” being made by the original article here). Particular classes of nonlinear models might have general formulae, a possibility we’ll come to in a moment.
I’m not sure what that’s putting “in other words”; but yes, being linear is a severe restriction.
No. Not unless we cheat by e.g. defining some symbol to mean “a function satisfying this funky nonlinear condition we happen to be working with right now”. (Which mathematicians sometimes do, if the same funky nonlinear condition comes up often enough. But (1) this is a special case and (2) it still doesn’t get you anything as nice and easy to deal with as linearity does.)
In general, having a narrowly specified set of models suitable to a specific physical phenomenon is no guarantee at all of exact explicit symbolic formulae.
No. Those different special properties may be much less useful than linearity. Linearity is a big deal because it is so very useful. The space of solutions to, I dunno, let’s say the Navier-Stokes equations in a given region and with given boundary conditions is highly constrained; but it isn’t constrained in ways that (at least so far as mathematicians have been able to figure out) are as useful as linearity.
So I don’t agree at all that “largely there should be some transformed domain where the model turns out to be simple”. Sometimes that happens, but usually not.
Not necessarily. Closed-form solutions are not guaranteed to exist for your particular subset of models and, in fact, often do not, forcing you to use numerical methods with all the associated problems.
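For instance, here is a one-equation illustration of that situation (the equation is just an example I picked): it has no closed-form solution in elementary functions, so some numerical scheme, Newton’s method here, is the only way to get a number out.

```python
import math

# Find x with x * exp(x) = 1. The exact answer is the "omega constant",
# which has no elementary closed form, so we iterate.
x = 1.0                                # starting guess
for _ in range(20):
    f = x * math.exp(x) - 1            # residual of the equation
    fprime = (x + 1) * math.exp(x)     # derivative of x * exp(x)
    x -= f / fprime                    # Newton step
print(x)                               # ≈ 0.567143
```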
Sorry, hadn’t seen this (note to self: mail alerts).
Is this really true, even if we pick a similarly restricted set of models? I mean, consider a set of equations which can only contain products of powers of variables, like (x_1)^a (x_2)^b = const_1, (x_1)^d (x_2)^e = const_2.
Is this nonlinear? Yes. Can it be solved easily? Of course. In fact, it is easily transformed into a set of linear equations by taking logarithms.
That’s what I’m kinda getting at: I think there is usually some transform that can convert your problem into a linear or, more generally, an easy problem. Am I more correct now?
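A quick numerical check of that log transform (the coefficients are made up so the answer comes out in whole numbers): taking logarithms turns the product-of-powers system into an ordinary linear system in y_i = log(x_i).

```python
import numpy as np

# (x1**a) * (x2**b) = c1  and  (x1**d) * (x2**e) = c2
a, b, c1 = 2.0, 1.0, 12.0          # illustrative coefficients
d, e, c2 = 1.0, 3.0, 54.0
A = np.array([[a, b], [d, e]])     # the system becomes linear in log-space
y = np.linalg.solve(A, np.log([c1, c2]))
x1, x2 = np.exp(y)
print(x1, x2)                      # ≈ 2.0 3.0; check: 2**2 * 3 = 12, 2 * 3**3 = 54
```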
I don’t think this is true. The model must reflect the underlying reality and the underlying reality just isn’t reliably linear, even after transforms.
Now, historically people used to greatly prefer linear models. Why? Because they were tractable. And for something that you couldn’t convert into linear form, well, you just weren’t able to build a good model. However, in our computer age this no longer holds.
For an example, consider what is nowadays called “machine learning”: people are still building models, but these tend to be highly non-linear models with no viable linear transformations.