Figure 10.16: Hierarchical action selection across multiple prefrontal-basal ganglia loops. On the far right, at the most anterior level, the PFC represents contextual information that is gated by its corresponding BG loop based on the probability that maintaining this context for guiding lower-level actions is predictive of reward. The middle loop involves both input and output gating. The input gating mechanism allows stimulus representations S to update a PFC_maint layer, while the output gating mechanism gates out a subset of maintained information conditional on the context in anterior PFC. Its associated BG layer learns the reward probability of output gating given the maintained stimulus S and the context. Finally, the left-most motor loop learns to gate simple motor responses based on their reward probabilities conditional on the stimulus, as in the single-loop BG model described in the Motor chapter, except that here the relevant stimulus features are selected by the more anterior loops. Reproduced from Frank & Badre (2012).
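The division of labor between the middle and motor loops in the figure can be illustrated with a deliberately small tabular sketch. It is not the Frank & Badre (2012) network: the `GatingLoop` class, the two-context task, the delta-rule learning rate, and the exhaustive (rather than stochastic) exploration are all assumptions chosen to keep the example deterministic. The anterior loop and input gating are omitted, so the context is simply given on each trial.

```python
from collections import defaultdict
from itertools import product


class GatingLoop:
    """Toy stand-in for one BG loop: a table of estimated reward probabilities."""

    def __init__(self, actions, lr=0.1):
        self.actions = list(actions)
        self.lr = lr
        self.p = defaultdict(lambda: 0.5)  # (state, action) -> estimated P(reward)

    def update(self, state, action, reward):
        # Delta rule: move the estimate a fraction lr toward the observed reward.
        self.p[(state, action)] += self.lr * (reward - self.p[(state, action)])

    def best(self, state):
        # Action with the highest estimated reward probability in this state.
        return max(self.actions, key=lambda a: self.p[(state, a)])


# Hypothetical task: the context determines which stimulus dimension matters.
RELEVANT = {"A": "shape", "B": "color"}
CORRECT = {"circle": "left", "square": "right", "red": "left", "blue": "right"}


def reward_for(context, stimulus, response):
    return 1.0 if response == CORRECT[stimulus[RELEVANT[context]]] else 0.0


output_gate = GatingLoop(["shape", "color"])  # middle loop: which feature to gate out
motor = GatingLoop(["left", "right"])         # motor loop: which response to gate

for _ in range(50):
    for context, shape, color in product("AB", ["circle", "square"], ["red", "blue"]):
        stimulus = {"shape": shape, "color": color}
        for dim in output_gate.actions:       # try both gating actions on every trial
            feature = stimulus[dim]           # the motor loop sees only the gated feature
            for response in motor.actions:    # simplification: both responses are evaluated
                motor.update(feature, response, reward_for(context, stimulus, response))
            chosen = motor.best(feature)
            # Credit the gating action with the outcome of the motor loop's preferred response.
            output_gate.update(context, dim, reward_for(context, stimulus, chosen))
```

After training, `output_gate.p[("A", "shape")]` should be close to 1 while `output_gate.p[("A", "color")]` stays near chance, and the reverse holds for context `"B"`. In this toy setting, then, the value of an output-gating action is learned conditional on the context, while the motor table only ever conditions on whichever feature was gated out.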
For related models simulating hierarchical control over action across multiple PFC-BG circuits, see Reynolds & O'Reilly (2009), Frank & Badre (2012), and Collins & Frank (2013). The latter model considers situations in which there are multiple potential rule sets signifying which actions to select in particular sensory states, and where the appropriate rule set might depend on a higher-level context. (For example, your tendency to greet someone with a hug, kiss, handshake, or wave might depend on the situation: your relationship to the person, whether you are in the street or at work, etc. And when you go to a new country or city, the rule set to apply may be the same one you have applied in other countries, or you may need to create a new rule set.) More generally, we refer to the higher-level rule as a "task-set", which contextualizes how to act in response to many different stimuli. Hierarchical PFC-BG networks can learn to create these PFC task-sets and, simultaneously, which actions to select in each task-set. Critically, with this hierarchical representation, the learned PFC representations are abstract and independent of the contexts that cue them, facilitating generalization and transfer to other contexts, while also identifying when new task-sets need to be created. They also allow new knowledge to be appended to existing abstract task structures, which can then be immediately transferred to other contexts that cue them (much like learning a new word in a language: you can immediately re-use that word in other contexts and with other people). To see this network in action, including demonstrations of generalization and transfer, see the Collins & Frank network linked here. Various empirical data testing this model have shown that humans (including babies!) do indeed represent such task-sets in a hierarchical manner (even when not cued to do so, and even when it is not beneficial for learning) in a way that facilitates generalization and transfer, and that the extent of this hierarchical structure is related to neural signatures in PFC and BG (see e.g., Badre & Frank, 2012; Collins et al., 2014; Collins & Frank, 2016; Werchan et al., 2016).
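The choice between re-using an existing task-set in a new context and creating a new one can be sketched as a small Bayesian computation. This is a cartoon in the spirit of the clustering idea, not the Collins & Frank network itself: the Chinese-restaurant-process-style prior, the `alpha` concentration parameter, the `p_correct` noise level, the reserved key `"new"`, and the greeting rules are all illustrative assumptions.

```python
def crp_prior(context_counts, alpha=1.0):
    """Prior over task-sets for a new context.

    Existing task-sets are weighted by how many contexts already use them;
    a brand-new task-set gets weight alpha. The key "new" is reserved.
    """
    total = sum(context_counts.values()) + alpha
    prior = {ts: n / total for ts, n in context_counts.items()}
    prior["new"] = alpha / total
    return prior


def likelihood(rule, observations, p_correct=0.9):
    """P(observations | task-set) for (stimulus, action, rewarded) triples."""
    like = 1.0
    for stimulus, action, rewarded in observations:
        if rule is None:
            p_reward = 0.5  # an untrained task-set makes no prediction
        elif rule.get(stimulus) == action:
            p_reward = p_correct
        else:
            p_reward = 1.0 - p_correct
        like *= p_reward if rewarded else 1.0 - p_reward
    return like


def posterior(task_sets, context_counts, observations, alpha=1.0):
    """Posterior over which task-set (existing or new) governs the new context."""
    prior = crp_prior(context_counts, alpha)
    unnorm = {ts: prior[ts] * likelihood(task_sets.get(ts), observations)
              for ts in prior}
    z = sum(unnorm.values())
    return {ts: v / z for ts, v in unnorm.items()}


# Hypothetical greeting rules learned in previously visited countries.
task_sets = {
    "TS1": {"friend": "hug", "colleague": "handshake"},
    "TS2": {"friend": "kiss", "colleague": "wave"},
}
context_counts = {"TS1": 2, "TS2": 1}  # number of contexts mapped to each task-set

familiar = posterior(task_sets, context_counts,
                     [("friend", "hug", True), ("colleague", "handshake", True)])
novel = posterior(task_sets, context_counts,
                  [("friend", "bow", True), ("colleague", "bow", True)])
```

With these numbers, `familiar` puts most of its mass on `"TS1"`, so the whole rule set transfers to the new context after two observations, whereas `novel` favors `"new"` because neither stored task-set explains the rewarded actions.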
To put many of the elements explored above to their most important use, we explore how the coordinated interactions of various regions of the PFC (including the affective areas explored previously), together with BG gating, enable the system to behave in a coherent, task-driven manner over multiple sequential steps of cognitive processing. This is really the hallmark of human intelligence: we can solve complex problems by performing a sequence of simpler cognitive steps, in a flexible, adaptive manner. More abstract cognitive models such as ACT-R provide a nice characterization of the functional properties of this level of cognition. The goal with the model we explore here is to understand how more detailed neural mechanisms can work together to produce this functionality.
- Higher (more anterior) levels of PFC encode context/goals/plans to organize sequences of cognitive actions, which are driven by lower, more posterior PFC areas. Critically, these higher areas do not specify rigid sequences of actions, but rather encode the desired outcome states of the sequence of actions, and provide context so that appropriate lower-level steps will be selected.
- Each step in a sequence of actions involves a consideration of the reward outcomes and effort costs of the action relative to other possible options.
TODO: invent this model!