There is essentially one best-validated theory of cognition.
post by abramdemski · 2021-12-10T15:51:06.423Z · LW · GW · 33 comments
There are many theories of cognition. But if you want to work within a framework with the following properties:
- Explains the major cognitive phenomena we know about.
- Fits experimental data well, down to human reaction times, in a wide variety of psychological experiments.
- Has a relatively complete story about functional neuroanatomy.
Well, then, I'm not aware of any theories which fit the bill as well as ACT-R theory.
You might also be interested in the common model of cognition (initially named standard model of the mind), which is consistent with the ACT-R picture, but also consistent with several competing theories. Think of it as the high-certainty subset of ACT-R.
References for ACT-R
I am no expert on ACT-R, so unfortunately I can't tell you the best place to get started! However, here are some references.
Books by John R Anderson
John R Anderson is the primary researcher behind ACT-R theory. I have not read all of the following.
- Learning and memory: an integrated approach. This book is not about ACT-R. It's basically a well-written textbook on everything we know about learning and memory. I think of it as John Anderson's account of all the empirical psychological phenomena which he would like to explain with ACT-R.
- How can the human mind occur in the physical universe? This is a book about ACT-R. In this book, John R Anderson seeks to spell out "how the gears clank and how the pistons go and all the rest of that detail" -- which is to say, the inner workings of the brain.
John R Anderson also has several other books, which I haven't looked at very much; so maybe a different one is actually a better starting place.
Other References
Here are some other potentially useful starting places.
References for the Common Model of Cognition
- A Standard Model of the Mind: Toward a Common Computational Framework across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics. This is the original paper for the common model of cognition. It rests on the observation that many cognitive architectures seem to be converging on common principles, much more than they were even 10 years earlier. To codify this consensus, leading researchers behind three of these theories got together and made a close comparison. (One of those researchers is my PhD advisor, which is how I became aware of this.)
- Empirical Evidence from Neuroimaging Data for a Standard Model of the Mind. What it says on the tin.
If you search "common model of cognition" in Google Scholar, you will find a number of other papers discussing it.
How should we evaluate all this?
- It doesn't scratch the same itch that Solomonoff Induction and other ideas more commonly discussed in this community do. Like, at all. The ACT-R interpretation of the question "how can the human mind occur in the physical universe?" has little to do with embedded agency questions, and much more to do with computational psychology.
- Although it is supposed to be a computational model of cognition, ACT-R is not an AGI. I would say it's not even a proto-AGI. The ACT-R community cares about closely matching human performance in psychological tests. It turns out this task is a lot different from, say, building a high-performance machine learning system. ACT-R is definitely not the latter.
- As such, despite ACT-R's success at replicating human learning behavior in the context of psychology tests, we might broadly conclude that it's missing something which modern machine learning is not missing.
- I'm also not claiming that you should believe ACT-R. If you take existing theories of cognition and compare them in Bayesian terms, I think ACT-R comes out on top. But ACT-R has a lot of pieces, and I don't think all of them are necessarily correct. And a lot of the reason ACT-R comes out on top in evidential terms is that it makes concrete predictions (which fit the data). Many other theories of cognition just aren't at that point, mostly because they don't focus on that so much. But you could plausibly think ACT-R is overfitting, by including a new mechanism every time one is needed to fit the data.
- For example, I would say that Solomonoff Induction is in some sense a theory of cognition; as is AIXI. But, clearly, neither are trying very hard to fit data from experimental psychology.
- Nonetheless, I'd wager your personal theory of functional neuroanatomy based on ideas from modern machine learning and/or bayesianism is probably worse overall, at least in terms of fitting with experimental data, which is ACT-R's bread and butter. So it might be useful to study ACT-R, if you're into this kind of thing. It might at least try to explain some things which you hadn't tried to explain. And it might even have some good ideas about how to do so.
I don't personally think about ACT-R very much, but that's because my thinking on AI alignment has little to do with neuroanatomy-inspired AI. Some other people around here think a lot more about those things. ACT-R theory might be useful to those people? Also, if you care about the nitty gritty of human modeling, EG for the sake of inverse reinforcement learning or other value-learning purposes, ACT-R might be useful. It is, after all, a really sophisticated model of a human.
Personally, I am hoping that learning more about ACT-R theory could help me think about human (ir)rationality in more detail.
Comments
comment by terry.stewart · 2021-12-11T02:39:42.465Z · LW(p) · GW(p)
Hello everyone, I'm a long-time lurker here, but this is my first time commenting. I'm a researcher at the National Research Council of Canada, and a big part of my research has been about taking ACT-R and figuring out how it could be implemented in the brain: http://terrystewart.ca/
I very much agree with the summary in the main post here. ACT-R is the best current model if you are trying to match human experimental data, including things like accuracy and reaction times. And it's been applied to a much wider variety of tasks than any other competing theory. It's definitely missing lots and lots of things, but it also seems to be getting a lot of things right. In particular, I'm very convinced by the constrained communication between brain modules (the "buffers" in the diagram posted by JenniferRM, each of which is restricted to contain around 7 key-value pairs at any point in time -- the evidence for this tends to come from tasks where in some situations it is very easy to do two things at once and in other situations it is very hard), and the declarative memory system (a beautiful mathematical model where the activation of a memory decays as t^(-d), where t is how long it has been since you've used that memory).
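(To make that concrete, here's a minimal Python sketch of the base-level learning equation -- my own illustrative paraphrase, not code from the official ACT-R distribution:)

```python
import math

def base_level_activation(times_since_use, d=0.5):
    """ACT-R base-level activation: B = ln(sum_j t_j^(-d)).

    times_since_use: seconds since each past use of the memory.
    d: decay parameter (0.5 is the conventional default).
    """
    return math.log(sum(t ** -d for t in times_since_use))

# A memory used once 10s ago vs. one rehearsed three times:
print(base_level_activation([10.0]))               # ~ -1.15
print(base_level_activation([10.0, 60.0, 300.0]))  # ~ -0.69: rehearsal raises activation
```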
For those looking for an introduction to it, I'd recommend "ACT-R: A cognitive architecture for modeling cognition" https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/wcs.1488 (which was also linked in the post).
I'll also agree with JenniferRM's idea that ACT-R is essentially a programming language, but I'll also add that one of the main ideas is that it's a programming language with a lot of weird constraints, but the hope is that the constraints are such that if you build a program to do a task within those constraints, then there's a reasonable chance that the resulting program will be a good model of how humans do that task, including things like making similar mistakes as people do, and taking similar amounts of time. It doesn't always live up to that, but that's the goal.
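(For flavor, here's a toy sketch of what I mean by a constrained rule-based program -- purely illustrative Python, not real ACT-R syntax. Behavior is written as condition/action rules over buffers, and every rule firing costs a fixed ~50 ms, which is where the timing predictions come from:)

```python
FIRING_COST_MS = 50  # conventional cost of one production firing in ACT-R

def run(buffers, productions):
    """Fire matching productions one at a time until none match,
    accumulating a predicted response time along the way."""
    while True:
        for condition, action in productions:
            if condition(buffers):
                action(buffers)
                buffers["time_ms"] += FIRING_COST_MS
                break
        else:
            return buffers  # no production matched: the model halts

# A one-rule "count up from current to target" model:
productions = [
    (lambda b: b["goal"]["current"] < b["goal"]["target"],
     lambda b: b["goal"].update(current=b["goal"]["current"] + 1)),
]

result = run({"goal": {"current": 2, "target": 5}, "time_ms": 0}, productions)
print(result)  # current reaches 5 after 3 firings: predicted time 150 ms
```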
↑ comment by Vanessa Kosoy (vanessa-kosoy) · 2021-12-11T13:18:09.775Z · LW(p) · GW(p)
Hi Terry, can you recommend an introduction for people with mathematics / theoretical computer science background? I glanced at the paper you linked but it doesn't seem to have a single equation, mathematical statement or pseudocode algorithm. There are diagrams, but I have no idea what the boxes and arrows actually represent.
↑ comment by terry.stewart · 2021-12-12T02:11:34.582Z · LW(p) · GW(p)
Hi Vanessa, hmm, very good question. One possibility is to point you at the ACT-R reference manual http://act-r.psy.cmu.edu/actr7/reference-manual.pdf but that's a ginormous document that also spends lots of time just talking about implementation details, because the reference ACT-R implementation is in Lisp (yes, ACT-R has been around that long!).
So, another option would be this older paper of mine, where I attempted to rewrite ACT-R in Python, and so the paper goes through the math that had to be reimplemented. http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/12/641stewartPaper.pdf
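To give you a taste of the kind of math in there, the standard declarative-memory equations look roughly like this (quoting from memory, so please treat the paper as authoritative):

```latex
% Base-level activation of chunk i, from its past uses at lags t_j:
B_i = \ln\!\Big(\sum_{j=1}^{n} t_j^{-d}\Big), \qquad d \approx 0.5

% Probability of retrieving chunk i, given threshold \tau and noise s:
P_i = \frac{1}{1 + e^{-(A_i - \tau)/s}}

% Latency of a successful retrieval, with latency factor F:
T_i = F\, e^{-A_i}
```

(Here A_i is B_i plus spreading activation from the current context, plus noise.)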
comment by JenniferRM · 2021-12-10T18:29:26.558Z · LW(p) · GW(p)
I think I remember hearing about this from you in the past and looking into it some.
I looked into it again just now and hit a sort of "satiety point" (which I hereby summarize and offer as a comment) when I boiled the idea down to "ACT-R is essentially a programming language with architectural inclinations which make it intuitively easy to see 1:1 connections between parts of the programs and parts of neurophysiology, such that diagrams of brain wiring, and diagrams of ACT-R programs, are easy for scientists to perceptually conflate and make analogies between... then also ACT-R more broadly is the high-quality conserved work products from such a working milieu that survive various forms of quality assurance".
Pictures helped! Without them I think I wouldn't have felt like I understood the gist of it.
This image is a very general version that is offered as an example of how one is likely to use the programming language for some task, I think?
Then you might ask... ok... what does it look like after people have been working on it for a long time? So then this image comes from 2004 research.
My reaction to this latter thing is that I recognize lots of words, and the "Intentional module" being "not identified" jumps out at me and causes me to instantly propose things.
But then, because I imagine that the ACT-R experts presumably are working under self-imposed standards of rigor, I imagine they could object to my proposals with rigorous explanations.
If I said something like "Maybe humans don't actually have a rigorously strong form of Intentionality in the ways we naively expect, perhaps because we sometimes apply the intentional stance to humans too casually? Like maybe instead we 'merely' have imagined goal content hanging out in parts of our brain, that we sometimes flail about and try to generate imaginary motor plans that cause the goal... so you could try to tie the Imaginal, Goal, Retrieval, and 'Declarative/Frontal' parts together until you can see how that is the source of what are often called revealed preferences?"
Then they might object "Yeah, that's an obvious idea, but we tried it, and then looked more carefully and noticed that the ACC doesn't actually neuro-anatomically link to the VLPFC in the way that would be required to really make it a plausible theory of humans"... or whatever, I have no idea what they would really say because I don't have all of the parts of the human brain and their connections memorized, and maybe neuroanatomy wouldn't even be the basis of an ACT-R expert's objection? Maybe it would be some other objection.
...
After thinking about it for a bit, the coolest thing I could think of doing with ACT-R was applying it to the OpenWorm project somehow, to see about getting a higher level model of worms that relates cleanly to the living behavior of actual worms, and their typical reaction times, and so on.
Then the ACT-R model of a worm could perhaps be used (swiftly! (in software!)) to rule out various operational modes of a higher resolution simulation of a less platonic worm model that has technical challenges when "tuning hyperparameters" related to many fiddly little cellular biology questions?
↑ comment by terry.stewart · 2021-12-11T02:55:00.582Z · LW(p) · GW(p)
As someone who can maybe call themselves an ACT-R expert, I think the main thing I'd say about the intentional module being "not identified" is that we don't have any fMRI data showing activity in any particular part of the brain being correlated to the use of the intentional module in various models. For all of the other parts that have brain areas identified, there's pretty decent data showing that correlation with activity in particular brain areas. And also, for each of those other areas there's pretty good arguments that those brain areas have something to do with tasks that involve those modules (brain damage studies, usually).
It's worth noting that there's no particular logical reason why there would have to be a direct correlation between modules in ACT-R and brain areas. ACT-R was developed based on looking at human behaviour and separating things out into behaviourally distinct components. There's no particular reason that separating things out this way must map directly onto physically distinct components. (After all, the web browser and the word processor on a computer are behaviourally distinct, but not physically distinct.) But it's been really neat that in the last 20 years a surprising number of these modules, which have been around in various forms since the '70s, have turned out to map onto physically distinct brain areas.
↑ comment by JenniferRM · 2021-12-11T04:34:53.399Z · LW(p) · GW(p)
The idea of the physical brain turning out to be similar to ACT-R after the code had been written based on high level timing data and so on... seems like strong support to me. Nice! Real science! Predicting stuff in advance by accident! <3
My memory from exploring this in the past is that I ran into some research with "math problem solving behavior" with human millisecond timing for answering various math questions that might use different methods... Googling now, this Tenison et al ACT-R arithmetic paper might be similar, or related?
With you being an expert, I was going to ask if you knew of any cool problems other than basic arithmetic that might have been explored like the Trolley Problem or behavioral economics or something...
(Then I realized that after I had formulated the idea in specific keywords I had Google and could just search, and... yup... Trolley Problem in ACT-R occurs in a 2019 Masters Thesis by Thomas Steven Highstead that also has... hahahaha, omg! There's a couple pages here reviewing ACT-R work on Asimov's Three Laws!?!)
Maybe a human level question is more like: "As someone familiar with the field, what is the coolest thing you know of that ACT-R has been used for?" :-)
↑ comment by terry.stewart · 2021-12-12T01:57:07.067Z · LW(p) · GW(p)
Yes, that Tenison paper is a great example of arithmetic modelling in ACT-R, and especially connecting it to the modern fMRI approach for validation! For an example of the other sorts of math modelling that's more psychology-experiment-based, this paper gives some of the low-level detail about how such a model would work, and maps it onto human errors:
- "Toward a Dynamic Model of Early Algebra Acquisition" https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.5754&rep=rep1&type=pdf
(that work was expanded on a few times, and led to things like "Instructional experiments with ACT-R “SimStudents”" http://act-r.psy.cmu.edu/?post_type=publications&p=13890 where they made a bunch of simulated students and ran them through different teaching regimes)
As for other cool tasks, the stuff about playing some simple video games is pretty compelling to me, especially in as much as it talks about what sort of learning is necessary for the precise timing that develops. http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2019/03/paper46a.pdf Of course, this is not as good in terms of getting a high score as modern deep learning game-playing approaches, but it is very good in terms of matching human performance and learning trajectories. Another model I find rather cool is a model of driving a car, which then got combined with a model of sleep deprivation to generate a model of sleep-deprived driving: http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/12/9822011-gunzelmann_moore_salvucci_gluck.pdf
One other very cool application, I think, is the "SlimStampen" flashcard learning tool developed out of Hedderik van Rijn's lab at the University of Groningen, in the Netherlands: http://rugsofteng.github.io/Team-5/ The basic idea is to optimize learning by presenting a flashcard fact just before the ACT-R declarative memory model predicts that the person is going to forget it. This seems to improve learning considerably http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/12/867paper200.pdf and seems to be pretty reliable https://onlinelibrary.wiley.com/doi/epdf/10.1111/tops.12183
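(Schematically, the scheduling idea looks something like this -- a rough sketch assuming the base-level equation from the memory model above and a fixed retrieval threshold, with made-up parameter values:)

```python
import math

def activation(times_since_use, d=0.5):
    # Base-level activation of the flashcard fact (see the memory model above).
    return math.log(sum(t ** -d for t in times_since_use))

def due_for_review(times_since_use, threshold=-1.0, margin=0.1):
    """Re-present a card when its predicted activation is about to drop
    below the retrieval threshold, i.e. just before it is forgotten."""
    return activation(times_since_use) < threshold + margin

print(due_for_review([5.0, 60.0]))    # False: rehearsed recently, still retrievable
print(due_for_review([30.0, 200.0]))  # True: predicted to be near forgetting
```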
↑ comment by JenniferRM · 2021-12-12T06:49:41.895Z · LW(p) · GW(p)
The flashcard and curriculum experiments seem really awesome in terms of potential for applications. It feels like the beginnings of the kind of software technology that would exist in a science fiction novel where one of the characters goes into a "learning pod" built by a high tech race, and pops out a couple days later knowing how to "fly their spaceship" or whatever. Generic yet plausible plot-hole-solving super powers! <3
↑ comment by terry.stewart · 2021-12-11T02:59:31.822Z · LW(p) · GW(p)
As for mapping ACT-R onto OpenWorm, unfortunately ACT-R's at a much much higher level than that. It's really meant for modelling humans -- I seem to remember a few attempts to model tasks being performed by other primates by doing things like not including the Goal Buffer, but I don't think that work went very far, and didn't map well to simpler animals. :(
↑ comment by JenniferRM · 2021-12-11T05:08:46.956Z · LW(p) · GW(p)
I wonder if extremely well trained dogs might work?
Chaser seems likely to have learned nouns, names, verbs... with toy names learned on one trial starting at roughly 5 months of age (albeit with a name forgetting curve so additional later exposures were needed for retention).
Having studied her training process, it seems like they taught her the concept of nouns very thoroughly.
Showing "here are N frisbees, after 'take frisbee' each one of them earns a reward" to get the idea of nouns referring to more than one thing demonstrated very thoroughly.
Then maybe "half frisbees, half balls" so that it was clear that "some things are non-frisbees and get no reward".
In demos of names and verbs, after the training, you can watch her looking at things and thinking. Maybe the looking directions and the thinking times could be modeled?
↑ comment by terry.stewart · 2021-12-11T22:10:26.324Z · LW(p) · GW(p)
I think that sort of task might be modellable with ACT-R -- the hardest part might be getting or gathering the animal data to compare to! Most of the time ACT-R models are validated by comparing to human data gathered by taking a room full of undergraduates and making them do some task 100 times each. It's a bit trickier to do that with animals. But that does seem like something that would be interesting research for someone to do!
↑ comment by abramdemski · 2021-12-10T19:13:24.266Z · LW(p) · GW(p)
This lines up fairly well with how I've seen psychology people geek out over ACT-R. That is: I had a psychology professor who was enamored with the ability to line up programming stuff with neuroanatomy. (She didn't use it in class or anything, she just talked about it like it was the most mind blowing stuff she ever saw as a research psychologist, since normally you just get these isolated little theories about specific things.)
And, yeah, important to view it as a programming language which can model a bunch of stuff, but requires fairly extensive user input to do so. One way I've seen this framed is that ACT-R lacks domain knowledge (since it is not in fact an adult human), so you can think of the programming as mostly being about hypothesizing what domain knowledge people invoke to solve a task.
The first of your two images looks broken in my browser.
↑ comment by Vaniver · 2021-12-10T21:52:36.267Z · LW(p) · GW(p)
Why do they separate out the auditory world and the environment?
↑ comment by terry.stewart · 2021-12-11T03:05:46.659Z · LW(p) · GW(p)
No particularly strong reason -- the main thing is that, when building these models, you also have to build a model of the environment that the system is interacting with. And the codebase for helping people build generic environments is mostly focused on handling key-presses and mouse-movements and visually looking at screens, while there's a separate codebase for handling auditory stimuli and responses, since that's a pretty different sort of behaviour.
comment by CounterBlunder · 2021-12-13T05:01:27.482Z · LW(p) · GW(p)
Long-time lurker, first time commenting. Without necessarily disagreeing on any object-level details, I want to give an alternate perspective. I'm a PhD in computational cog sci, have interacted with most of the top cognitive science departments in the US (e.g. through job search, conferences, etc), and I know literally zero people who use ACT-R for anything. It was never mentioned in any of my grad classes, has never been brought up in any talk I've been to -- I don't even know if I've even seen it ever cited in a paper I've read. I know of it, obviously, and I know that it was super influential back in the 90s, but I'd always just assumed that the research program withered away for some reason (given how little I'd seen it actually being used in top-level research at this point).
This post made me curious how I could have such a different perspective! I don't know whether academic cognitive science is just really segregated and I'm missing all the ACT-R researchers still out there; whether ACT-R was actually amazing and people have been silly to drop it; whether "top-level" research is misleading and actually the good research is being published in lower-tier journals while flashy fad-based results get published in top journals; or whether ACT-R really did fail for some deep reason. I've asked some colleagues why nobody around us uses it anymore, but I haven't gotten any detailed responses yet.
(Also, this is a small thing, but "fitting human reaction times" is not impressive -- that's a basic feature of many, many models.)
So while I don't have any object-level disagreements with this post, it feels like helpful context to know that many, many active computational cognitive scientists would strongly disagree that ACT-R is essentially the one best-validated theory of cognition (to the point where they'd be like "huh? what are you talking about?"). This paper gives what I think is a much more contemporary overview of overarching theories of human cognition.
↑ comment by terry.stewart · 2021-12-14T04:32:59.675Z · LW(p) · GW(p)
That's a very good point, CounterBlunder, and I should have highlighted that as well. It is definitely fairly common for cognitive science researchers to never work with or make use of ACT-R. It's a sub-community within the cognitive science community. The research program has continued past the 90's, and there's probably around 100 or so researchers actively using it on a regular basis, but the cognitive science community is much larger than that, so your experience is pretty common.
As for whether ACT-R is "actually amazing and people have been silly to drop it", well, I definitely don't think that everyone should be making use of it, but I do think more people should be aware of its advantages, and the biggest reason for me is exactly what you point out about "fitting human reaction times" not being impressive. You're completely right that that's a basic feature of many, many models. But the key difference here is that ACT-R uses the same components and same math to fit human reaction times (and error patterns) across many different tasks. That is, instead of making a new model for a new task, ACT-R tries to use the same components, with the same parameter settings, but with perhaps a different set of background knowledge. The big advantage here is that it starts getting away from the over-fitting problem: when dealing with comparisons to human data, we normally have relatively few data points to compare to. And a cognitive model is going to, almost by definition, be fairly complex. So if we only fit to the data available for one task, the worry is that we're going to have so many free parameters in our model that we can fit anything we like. And there's also a worry that if I'm developing a cognitive model for one particular task, I might invent some custom component as part of my model that's really highly specialized and would only ever get used in that one task, which is a bit worrying if I'm aiming for a general cognitive theory. One way around these problems is to find components and parameter settings that work across many different tasks. And right now, the ACT-R community is the biggest cognitive modelling community where there are many different researchers using the same components to do many different tasks.
(Note: by "same components in different tasks" I'm meaning something a lot more specific than something like "use a neural network". In neural network terms, I'm more meaning something like "train up a neural network on this particular data X and then use that same trained neural network as a component in many different tasks". After all, people very quickly change tasks and can re-purpose their existing neural networks to do new tasks extremely quickly. This hasn't been common in neural networks until the recent advent of things like GPT-3. And, personally, I think GPT-3 would make an excellent module to be added to ACT-R, but that's a whole other discussion.)
As for the paper you linked to, I really like that paper (and I'm even cited in it -- yay!), but I don't think it gives an overview of overarching theories of human cognition. Instead, I think it gives a wonderful list of tasks and situations where we're going to need some pretty complicated components to perform these different tasks, and it gives a great set of suggestions as to what some of those components might be. But there's no overarching theory of how we might combine those components together and make them work together and flexibly use them for doing different tasks. And that, to me, is what ACT-R provides an example of. I definitely don't think ACT-R is the perfect, final solution, but it at least shows an example of what it would be like to coordinate components like that, and applies that to a wider variety of tasks than any particular system discussed in that paper. But lots of the tasks in that paper are also things that are incredibly far away from anything that ACT-R has been applied to, so I'm quite sure that ACT-R will need to change a lot to be expanded to include these sorts of new components needed for these new tasks. Still, it makes a good baseline for what it would take to have a flexible system that can be applied to different tasks, rather than building a new model for each task.
↑ comment by CounterBlunder · 2021-12-15T03:58:34.810Z · LW(p) · GW(p)
Thanks for such a thoughtful response Terry :). This all makes a ton of sense -- I totally agree that the paper doesn't give an alternative overarching theory, and that no such alternative theory exists. I guess my high-level worry is that, if ACT-R really were a good overarching model of the mind (like a paradigm, in Kuhnian terms), then it would have become standard or widely accepted in the field in the way that good overarching theories/paradigms became standard in other fields? Coming into this, my thought is that we don't have any good overarching theory of the mind, and that we just don't understand the mind well enough to make any models like that. But I am really curious about the success of ACT-R that you're pointing to. If it's actually a decent model, why do you think it didn't take over the field (and shrunk to a small group of continuing researchers)? Genuine question, not rhetorical. My prior is that most cognitive scientists would kill for a good paradigm (I certainly would!).
↑ comment by terry.stewart · 2021-12-15T06:39:57.710Z · LW(p) · GW(p)
Ooo, very good questions! :) I think there are a few different reasons why.... one small clarification, though, I don't think ACT-R shrunk to a small group -- I'd say more that it gradually grew from a small group (starting out of John Anderson's lab at CMU) up to about 100 active researchers around the world, and then sort of stabilized at that level for the last decade or two.
But, as for why it didn't take over everything or at least get more widely known, I'd say one big reason is that the tasks it historically focused on were very specific -- usually things involving looking at letters and numbers on a screen and pressing keys on a keyboard or moving a mouse. So lots of the early examples were that sort of specific experimental psychology task. It's been expanded a lot since then (car driving, for example), but that's where its history was, and so for people who are interested in different sorts of tasks, I can see them maybe initially feeling like it's not relevant. And even now, lots of the tasks in the paper you provided are so far away from even modern ACT-R that I can see people believing that they can just ignore ACT-R and try to develop completely new theories instead.
Another more practical reason, however, is that there's a pretty high barrier to entry to getting into ACT-R, partly due to the fact that the reference implementation is in Lisp. Lisp made tons of sense as being the language to use when ACT-R was first developed, but it's very hard to find students with experience in Lisp now. There's been a big movement in the last decade to make alternate implementations of ACT-R (Python ACT-R, ACT-Up, jACT-R), and the latest version of ACT-R has interfaces to other languages which I think will help to make it more accessible. But, even with a more common programming language, there's still a lot of teaching/training/learning required to get new people used to the core ideas. And even to get people used to the idea of sticking with the constraints of ACT-R. For example, I can remember a student building a model that needed to do mental arithmetic, and it took a while to explain why they couldn't just say "c = a + b" and have the computer do the math (after all, you're implementing these models on a computer, and computers are good at math, so why not just do the math that way?). Forcing yourself to break that addition down into steps (e.g. trying to recall the result from memory, or trying to do a memory recall of some similar addition fact and then doing counting to adjust to this particular question, or just doing counting right from the beginning, or doing the manual column-wise addition method in your head) gets pretty complicated, and it can be hard to adjust to that sort of mind-set.
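(Here's a toy sketch of that decomposition -- invented names and uncalibrated time costs, just to show the flavor of retrieve-the-fact versus fall-back-and-count:)

```python
import random

ADDITION_FACTS = {(2, 3): 5, (4, 4): 8}  # facts this simulated person has memorized

def add(a, b, retrieval_odds=0.9):
    """Answer a + b the way a model (not a computer!) has to: try a
    declarative retrieval first, and fall back on counting if that fails."""
    time_ms = 0
    if (a, b) in ADDITION_FACTS and random.random() < retrieval_odds:
        time_ms += 200                    # one declarative memory retrieval
        return ADDITION_FACTS[(a, b)], time_ms
    total = a
    for _ in range(b):                    # counting strategy: one step per increment
        total += 1
        time_ms += 150
    return total, time_ms

print(add(2, 3))  # usually the fast retrieval path (~200 ms)
print(add(6, 7))  # no stored fact, so counting: (13, 1050 ms)
```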
I will note that this high-barrier-to-entry problem is probably true of all other cognitive architectures (e.g. Sigma, Clarion, Soar, Dynamic Field Theory, Semantic Pointer Architecture, etc: https://en.wikipedia.org/wiki/Comparison_of_cognitive_architectures ). But one thing ACT-R did really well is to address this by regularly running a 2-week summer-school (since 1994 http://act-r.psy.cmu.edu/workshops/ ). That seems to me to be a big reason why ACT-R got much more widely used (and thus much more widely tested and evaluated) than the other architectures that are out there. There was an active effort to teach the system and to spread it into new domains, and to combat the common approach in computational cognitive modelling of people sticking with the one model that they (or their supervisor) invented. It's much more fun to build my own model from scratch and to evaluate it on the one particular task that I had in mind when I was inventing the model. But that just leads to a giant proliferation of under-tested models. :( To really test these theories, we need a community, and ACT-R is the biggest and most stable cognitive architecture community so far. It'd be great to have more such communities, but they're hard to grow.
↑ comment by Mitchell_Porter · 2021-12-13T06:06:42.408Z · LW(p) · GW(p)
Does that paper actually mention any overall models of the human mind? It has a list of ingredients, but does it say how they should be combined?
↑ comment by CounterBlunder · 2021-12-15T04:00:18.868Z · LW(p) · GW(p)
No you're right, it doesn't say how they should be combined. My assumption -- and I suspect the assumption of the authors -- is that we have no good widely-accepted overarching model of the mind, and that the best we can agree on is a list of ingredients (and even that list was controversial, e.g. in the commentaries on the paper). I think that's the reason I, implicitly, was viewing the paper as a contemporary alternative to ACT-R. But I take your point that it's doing different things.
↑ comment by abramdemski · 2021-12-14T18:25:59.231Z · LW(p) · GW(p)
I think my post (at least the title!) is essentially wrong if there are other overarching theories of cognition out there which have similar track records of matching data. Are there?
By "overarching theory" I mean a theory which is roughly as comprehensive as ACT-R in terms of breadth of brain regions and breadth of cognitive phenomena.
As someone who has also done grad school in cog-sci research (but in a computer science department, not a psychology department, so my knowledge is more AI focused), my impression is that most psychology research isn't about such overarching theories. To be more precise:
- There are cognitive architecture people, who work on overarching theories of cognition. However, ACT-R stands out amongst these as having extensive experimental validation. The rest have relatively minimal direct comparisons to human data, or none.
- There are "bayesian brain" and other sorta overarching theories, but (to my limited knowledge!) these ideas don't have such a fleshed-out computational model of the brain. EG, you might apply bayesian-brain ideas to create a model of (say) emotional processing, but it isn't really part of one big model in quite the way ACT-R allows.
- There's a lot of more isolated work on specific subsystems of the brain, some of which is obviously going to be highly experimentally validated, but, just isn't trying to be an overaching model at all.
So my claim is that ACT-R occupies a unique position in terms of (a) taking an experimental-psych approach, while (b) trying to provide a model of everything and how it fits together. Do you think I'm wrong about that?
I think it's a bit like physics: outsiders hear about these big overarching theories (GUTs, TOEs, strings, ...), and to an extent it makes sense for outsiders to focus on the big picture in that way. Working physicists, on the other hand, can work on all sorts of specialized things (the physics of crystal growth, say) without necessarily worrying about how it fits into the big picture. Not everyone works on the big-picture questions.
OTOH, I also feel like it's unfortunate that more work isn't integrated into overarching models.
> This paper gives what I think is a much more contemporary overview of overarching theories of human cognition.
I've only skimmed it, but it seems to me more like a prospectus which speculates about building a totally new architecture (combining the strengths of deep learning with several handpicked ideas from psychology), naming specific challenges and possible routes forward for such a thing.
> (Also, this is a small thing, but "fitting human reaction times" is not impressive -- that's a basic feature of many, many models.)
I said "down to reaction times" mostly because I think this gives readers a good sense of the level of detail, and because I know reaction times are something ACT-R puts effort into, as opposed to because I think reaction times is the big advantage ACT-R has over other models; but, in retrospect this may have been misleading.
I guess it comes down to my AI-centric background. For example, GPT-3 is in some sense a very impressive model of human linguistic behavior; but, it makes absolutely no attempt to match human reaction times. It's very rare for ML people to be interested in that sort of thing. This also relates to the internal design of ACT-R. An AI/ML programmer isn't usually interested in purposefully slowing down operations to match human performance. So this would be one of the most alien things about the ACT-R codebase for a lot of people.
↑ comment by CounterBlunder · 2021-12-15T05:04:30.337Z · LW(p) · GW(p)
Thanks for the thoughtful response, that perspective makes sense. I take your point that ACT-R is unique in the ways you're describing, and that most cognitive scientists are not working on overarching models of the mind like that. I think maybe our disagreement is about how good/useful of an overarching model ACT-R is? It's definitely not like in physics, where some overarching theories are widely accepted (e.g. the standard model) even by people working on much more narrow topics -- and many of the ones that aren't (e.g. string theory) are still widely known about and commonly taught. The situation in cog sci (in my view, and I think in many people's views?) is much more that we don't have an overarching model of the mind in anywhere close to the level of detail/mechanistic specificity that ACT-R posits, and that any such attempt would be premature/foolish/not useful right now. Like, I think if you polled cognitive scientists, the vast majority would disagree with the title of your post -- not because they think there's a salient alternative, but because they think that there is no theory that even comes close to meriting the title of "best-validated theory of cognition" (even if technically one theory is ahead of the others). Do you know what I mean? Of course, even if most cognitive scientists don't believe in ACT-R in that way, that alone doesn't mean that ACT-R is wrong.. I'm curious about the evidence that Terry is talking about above. I just think the field would look really, really different if we actually had a halfway-decent paradigm/overarching model of the mind. And it's not like ACT-R is some unknown idea that is poised to take over the field once people learn about it. Everyone knew about it in the 90s, and then it fell out of widespread use -- and my prior on why that happened is that people weren't finding it super useful. (Although like I said, I'm really curious to learn more about what Terry/other contemporary people are doing with it!)
↑ comment by terry.stewart · 2021-12-15T07:15:07.591Z · LW(p) · GW(p)
I agree that there isn't an overarching theory at the level of specificity of ACT-R that covers all the different aspects of the mind that cognitive science researchers wish it would cover. And so yes, I can see cognitive scientists saying that there is no such theory, or (more accurately) saying that even though ACT-R is the best-validated one, it's not validated on the particular types of tasks that they're interested in, so therefore they can ignore it.
However, I do think that there's enough of a consensus about some aspects of ACT-R (and other theories) that there are some broader generalizations that all cognitive scientists should be aware of. That's the point of the two papers listed in the original post on the "Common Model of Cognition". They dig through a whole bunch of different cognitive architectures and ideas over the decades and point out that there are some pretty striking commonalities and similarities across these models. (ACT-R is just one of the theories that they look at, and they point out that there are a set of commonalities across all the theories, and that's what they call the Common Model of Cognition). The Common Model of Cognition is much more loosely specified and is much more about structural organization rather than being about the particular equations used, though, so I'd still say that ACT-R is the best-validated model. But CMC is surprisingly consistent with a lot of models, and that's why the community is getting together to write papers like that. The whole point is to try to show that there are some things that we can say right now about an overarching theory of the mind, even if people don't want to buy into the particular details of ACT-R. And if people are trying to build overarching theories, they should at least be aware of what there is already.
(Full disclosure: I was at the 2017 meeting where this community came together on this topic and started the whole CMC thing. The papers from that meeting are at https://www.aaai.org/Library/Symposia/Fall/fs17-05.php and that's a great collection of short papers of people talking about the various challenges of expanding the CMC. The general consensus from that meeting is that it was useful to at least have an explicit CMC to help frame that conversation, and it's been great to see that conversation grow over the last few years. Note: at the time we were calling it the Standard Model of the Mind, but that got changed to Common Model of Cognition).
↑ comment by abramdemski · 2021-12-15T17:26:14.105Z · LW(p) · GW(p)
> I think maybe our disagreement is about how good/useful of an overarching model ACT-R is? It's definitely not like in physics, where some overarching theories are widely accepted (e.g. the standard model) even by people working on much more narrow topics -- and many of the ones that aren't (e.g. string theory) are still widely known about and commonly taught. The situation in cog sci (in my view, and I think in many people's views?) is much more that we don't have an overarching model of the mind in anywhere close to the level of detail/mechanistic specificity that ACT-R posits, and that any such attempt would be premature/foolish/not useful right now.
Makes some sense to me! This is part of why my post's conclusion said stuff like this doesn't mean you should believe in ACT-R. But yeah, I also think we have a disagreement somewhere around here.
I was trained in the cognitive architecture tradition, which tends to find this situation unfortunate. I have heard strong opinions, which I respect and generally believe, of the "we just don't know enough" variety which you also espouse. However, I also buy Allen Newell's famous argument in "you can't play 20 questions with nature and win", where he argues that we may never get there without focusing on that goal. From this perspective, it makes (some) sense to try to track a big picture anyway.
In some sense the grand goal of cognitive architecture is that it should eventually be seen as standard (almost required) for individual works of experimental psychology to contribute to a big picture in some way. Imagine for a moment if every paper had a section relating to ACT-R (or some other overarching model), either pointing out how it fits in (agreeing with and extending the overarching model) or pointing out how it doesn't (revising the overarching model).
With the current state of things, it's very unclear (as you highlighted in your original comment) what the status of overarching models like ACT-R even is. Is it an artifact from the 90s which is long-irrelevant? Is it the state of the art big-picture? Nobody knows and few care? Wouldn't it be better if it were otherwise?
On the other hand, working with cognitive architectures like ACT-R can be frustrating and time consuming. In theory, they could be a time-saving tool (you start with all the power of ACT-R and can move forward from that!). In practice, my personal observation at least is that they add time and reduce other kinds of progress you can make. To caricaturize, a cog arch phd student spends their first 2 years learning the cognitive architecture they'll work with, while a non-cog-arch cogsci student can hit the ground running instead. (This isn't totally true of course; I've heard people say that most phd students are not really productive for their first year or two of grad school.) So I do not want to gloss over the downsides to a cog arch focus.
One big problem is what I'll call the "task integration problem". Let's say you have 100 research psychologists who each spend a chunk of time doing "X in ACT-R" for many different values of X. Now you have lots of ACT-R models of lots of different cognitive phenomena. Can you mash them all together into one big model which does all 100 things?
I'm not totally sure about ACT-R, but I've heard that for most cognitive architectures, the answer is "no". Despite existing in one cognitive architecture, the individual "X" models are sorta like standalone programs which don't know how to talk to each other.
This undermines the premise of cog arch as helping us fit everything into one coherent picture. So, this is a hurdle which cog arch would have to get past in order to play the kind of role it wants to play.
comment by Jon Garcia · 2021-12-10T18:24:44.881Z · LW(p) · GW(p)
So, from what I read, it looks like ACT-R is mostly about modeling which brain systems are connected to which and how fast their interactions are, not in any way how the brain systems actually do what they do. Is that fair? If so, I could see this framework helping to set useful structural priors for developing AGI (e.g., so we don't make the mistake of hooking up the declarative memory module directly to the raw sensory or motor modules), but I would expect most of the progress still to come from research in machine learning and computational neuroscience.
↑ comment by abramdemski · 2021-12-10T18:55:58.746Z · LW(p) · GW(p)
I think that's not quite fair. ACT-R has a lot to say about what kinds of processing are happening, as well. Although, for example, it does not have a theory of vision (to my limited understanding anyway), or of how the full motor control stack works, etc. So in that sense I think you are right.
What it does have more to say about is how the working memory associated with each modality works: how you process information in the various working memories, including various important cognitive mechanisms that you might not otherwise think about. In this sense, it's not just about interconnection like you said.
↑ comment by Jon Garcia · 2021-12-10T19:15:07.858Z · LW(p) · GW(p)
So essentially, which types of information get routed for processing to which areas during the performance of some behavioral or cognitive algorithm, and what sort of processing each module performs?
↑ comment by terry.stewart · 2021-12-11T03:12:37.227Z · LW(p) · GW(p)
That sounds right to me. It gives what types of information are processed in each area, and it gives a very explicit statement about exactly what processing each module performs.
So I look at ACT-R as sort of a minimal set of modules, where if I could figure out how to get neurons to implement the calculations ACT-R specifies in those modules (or something close to them), then I'd have a neural system that could do a very wide variety of psychology-experiment-type-tasks. As far as current progress goes, I'd say we have a pretty decent way to get neurons to implement the core Production system, and the Buffers surrounding it, but much less of a clear story for the other modules.
comment by Steven Byrnes (steve2152) · 2021-12-10T19:54:58.360Z · LW(p) · GW(p)
Just bought myself a copy of the "Physical Universe" book! :-)
↑ comment by abramdemski · 2021-12-10T20:38:00.667Z · LW(p) · GW(p)
Hope it turns out to be interesting to you!
comment by philh · 2021-12-12T22:00:24.463Z · LW(p) · GW(p)
How does this relate to predictive processing that Scott's written about on SSC?
My vague sense from the post and comment thread (and my memories of what I read on PP) is that ACT-R is higher-level, and might be compatible with PP explaining some of the lower level details. But super not confident on that.
↑ comment by abramdemski · 2021-12-14T17:55:28.388Z · LW(p) · GW(p)
I would guess similarly. Personally, I'm not especially fond of PP, although that is a bigger discussion.