That’s a very good point, CounterBlunder, and I should have highlighted that as well. It is definitely fairly common for cognitive science researchers to never work with or make use of ACT-R. It’s a sub-community within the cognitive science community. The research program has continued past the ’90s, and there are probably around 100 researchers actively using it on a regular basis, but the cognitive science community is much larger than that, so your experience is pretty common.
As for whether ACT-R is “actually amazing and people have been silly to drop it”, well, I definitely don’t think that everyone should be making use of it, but I do think more people should be aware of its advantages, and the biggest reason for me is exactly what you point out about “fitting human reaction times” not being impressive. You’re completely right that that’s a basic feature of many, many models. The key difference is that ACT-R uses the same components and the same math to fit human reaction times (and error patterns) across many different tasks. That is, instead of making a new model for each new task, ACT-R tries to use the same components, with the same parameter settings, but with perhaps a different set of background knowledge. The big advantage is that this starts getting away from the over-fitting problem: when comparing to human data, we normally have relatively few data points, and a cognitive model is, almost by definition, going to be fairly complex. So if we only fit to the data available for one task, the worry is that we have so many free parameters that we can fit anything we like. There’s also the worry that, in developing a cognitive model for one particular task, I might invent some custom component that is highly specialized and would only ever get used in that one task, which is a bit worrying if I’m aiming for a general cognitive theory. One way around these problems is to find components and parameter settings that work across many different tasks. And right now, the ACT-R community is the biggest cognitive modelling community where many different researchers use the same components for many different tasks.
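To make that contrast concrete, here is a minimal sketch of the difference between fitting each task with its own parameters and forcing one shared parameter set to explain every task at once. To be clear, this is plain Python with made-up reaction-time data and a made-up two-parameter latency model, not ACT-R’s actual components or equations; it is only meant to show why sharing parameters across tasks constrains a model far more than fitting each task separately.

```python
# Minimal sketch (not real ACT-R code): per-task fitting vs. one shared
# parameter set fit jointly across tasks. The data and the two-parameter
# linear latency model are invented for illustration.
import numpy as np
from scipy.optimize import minimize

# Hypothetical mean reaction times (seconds) for a few conditions in three tasks.
tasks = {
    "task_A": {"difficulty": np.array([1, 2, 3, 4]),
               "rt": np.array([0.55, 0.71, 0.90, 1.04])},
    "task_B": {"difficulty": np.array([1, 2, 3]),
               "rt": np.array([0.60, 0.74, 0.93])},
    "task_C": {"difficulty": np.array([2, 4, 6]),
               "rt": np.array([0.73, 1.02, 1.41])},
}

def predict(params, difficulty):
    base, slope = params                 # fixed overhead + per-step cost
    return base + slope * difficulty

def sse(params, data):                   # sum of squared errors for one task
    return np.sum((predict(params, data["difficulty"]) - data["rt"]) ** 2)

# Per-task fitting: two free parameters for every task (six in total here).
per_task = {name: minimize(sse, x0=[0.5, 0.1], args=(data,)).x
            for name, data in tasks.items()}

# Shared fitting: the same two parameters must explain all tasks at once.
shared = minimize(lambda p: sum(sse(p, d) for d in tasks.values()),
                  x0=[0.5, 0.1]).x

print({name: np.round(p, 3) for name, p in per_task.items()})
print(np.round(shared, 3))
```

With per-task fits there are six free parameters for ten data points; with the shared fit there are only two, so a good fit is much harder to get by accident. That is the over-fitting point above in miniature.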
(Note: by “same components in different tasks” I mean something a lot more specific than something like “use a neural network”. In neural network terms, I mean something more like “train up a neural network on this particular data X and then use that same trained neural network as a component in many different tasks”. After all, people change tasks very quickly and can re-purpose their existing neural networks to do new tasks extremely quickly. This wasn’t common in neural networks until the recent advent of things like GPT-3. And, personally, I think GPT-3 would make an excellent module to add to ACT-R, but that’s a whole other discussion.)
As for the paper you linked to, I really like that paper (and I’m even cited in it—yay!), but I don’t think it gives an overarching theory of human cognition. Instead, I think it gives a wonderful list of tasks and situations where we’re going to need some pretty complicated components, and it gives a great set of suggestions as to what some of those components might be. But there’s no overarching theory of how we might combine those components, make them work together, and flexibly use them for different tasks. And that, to me, is what ACT-R provides an example of. I definitely don’t think ACT-R is the perfect, final solution, but it at least shows what it would be like to coordinate components like that, and it applies that approach to a wider variety of tasks than any particular system discussed in that paper. That said, lots of the tasks in that paper are incredibly far away from anything ACT-R has been applied to, so I’m quite sure that ACT-R will need to change a lot to incorporate the sorts of new components those tasks require. Still, it makes a good baseline for what it would take to have a flexible system that can be applied to different tasks, rather than building a new model for each task.
Thanks for such a thoughtful response, Terry :). This all makes a ton of sense—I totally agree that the paper doesn’t give an alternative overarching theory, and that no such alternative theory exists. I guess my high-level worry is that, if ACT-R really were a good overarching model of the mind (a paradigm, in Kuhnian terms), then it would have become standard or widely accepted in the field, the way good overarching theories/paradigms became standard in other fields. Coming into this, my thought was that we don’t have any good overarching theory of the mind, and that we just don’t understand the mind well enough to build such models. But I am really curious about the success of ACT-R that you’re pointing to. If it’s actually a decent model, why do you think it didn’t take over the field (and instead shrank to a small group of continuing researchers)? Genuine question, not rhetorical. My prior is that most cognitive scientists would kill for a good paradigm (I certainly would!).
Ooo, very good questions! :) I think there are a few different reasons why… One small clarification, though: I don’t think ACT-R shrank to a small group—I’d say, rather, that it gradually grew from a small group (starting out of John Anderson’s lab at CMU) to about 100 active researchers around the world, and then stabilized at that level for the last decade or two.
But, as for why it didn’t take over everything, or at least get more widely known, I’d say one big reason is that the tasks it historically focused on were very specific—usually looking at letters and numbers on a screen and pressing keys on a keyboard or moving a mouse. So lots of the early examples were that sort of narrow experimental psychology task. It’s been expanded a lot since then (car driving, for example), but that’s where its history was, so for people interested in different sorts of tasks, I can see them initially feeling like it’s not relevant. And even now, lots of the tasks in the paper you linked are so far away from even modern ACT-R that I can see people believing they can just ignore ACT-R and try to develop completely new theories instead.
Another, more practical, reason is that there’s a pretty high barrier to entry for getting into ACT-R, partly because the reference implementation is in Lisp. Lisp made tons of sense as the language to use when ACT-R was first developed, but it’s very hard to find students with Lisp experience now. There’s been a big movement in the last decade to make alternate implementations (Python ACT-R, ACT-Up, jACT-R), and the latest version of ACT-R has interfaces to other languages, which I think will help make it more accessible. But even with a more common programming language, there’s still a lot of teaching, training, and learning required to get new people used to the core ideas, and even to get them used to sticking with the constraints of ACT-R. For example, I remember a student building a model that needed to do mental arithmetic, and it took a while to explain why they couldn’t just write “c = a + b” and have the computer do the math (after all, you’re implementing these models on a computer, and computers are good at math, so why not just do it that way?). Forcing yourself to break that addition down into steps (e.g. trying to recall the result from memory, or recalling a similar addition fact and then counting to adjust it to this particular question, or just counting right from the beginning, or doing the manual column-wise addition method in your head) gets pretty complicated, and it can be hard to adjust to that sort of mindset.
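To give a flavour of what that decomposition looks like, here is a toy sketch. It is plain Python rather than real ACT-R code (ACT-R would express this as production rules operating on chunks in declarative memory), and the stored facts, the retrieval probability, and the counting fallback are all invented for illustration.

```python
# Illustrative sketch only (plain Python, not the ACT-R production system):
# "a + b" gets decomposed into a retrieval attempt and a counting fallback,
# rather than a single arithmetic step. Facts and probabilities are made up.
import random

# Hypothetical "declarative memory" of addition facts the model has learned.
addition_facts = {(2, 3): 5, (3, 4): 7, (4, 4): 8}

def retrieve_fact(a, b, retrieval_prob=0.8):
    """Attempt a declarative-memory retrieval; it may fail, like a real recall."""
    if (a, b) in addition_facts and random.random() < retrieval_prob:
        return addition_facts[(a, b)]
    return None  # retrieval failure

def count_up(a, b):
    """Fallback strategy: start at a and count upward b times, one step at a time."""
    total = a
    for _ in range(b):
        total += 1  # each increment stands in for a separate cognitive step
    return total

def mental_add(a, b):
    result = retrieve_fact(a, b)
    if result is not None:
        return result, "retrieval"
    return count_up(a, b), "counting"

print(mental_add(2, 3))  # usually (5, 'retrieval'); occasionally falls back to counting
print(mental_add(5, 6))  # always (11, 'counting'): no stored fact for 5 + 6
```

In a real ACT-R model, each of those steps would take some predicted amount of time, which is where the reaction-time fits come from; the point here is just that the model has to choose and execute a strategy rather than call the computer’s adder.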
I will note that this high-barrier-to-entry problem is probably true of all the other cognitive architectures as well (e.g. Sigma, Clarion, Soar, Dynamic Field Theory, Semantic Pointer Architecture, etc.; see https://en.wikipedia.org/wiki/Comparison_of_cognitive_architectures). But one thing ACT-R did really well was to address this by regularly running a two-week summer school (since 1994: http://act-r.psy.cmu.edu/workshops/). That seems to me to be a big reason why ACT-R got much more widely used (and thus much more widely tested and evaluated) than the other architectures out there. There was an active effort to teach the system, to spread it into new domains, and to combat the common approach in computational cognitive modelling of sticking with the one model that you (or your supervisor) invented. It’s much more fun to build my own model from scratch and to evaluate it on the one particular task I had in mind when I was inventing it, but that just leads to a giant proliferation of under-tested models. :( To really test these theories, we need a community, and ACT-R is the biggest and most stable cognitive-architecture community so far. It’d be great to have more such communities, but they’re hard to grow.