A liquid striatal microcircuit model for trajectory learning
CBN (Computational Biology and Neurocomputing) seminars
Friday 14 September 2012
to 11:00 at
Carlos Toledo Suarez (University of Freiburg, Bernstein Center Freiburg, Germany)
In reinforcement learning (RL) theories of the basal ganglia, dopamine is assumed to act as an error signal that guides the update of state values during learning. Although a realistic dopaminergic error signal has been shown to drive the variant of RL known as temporal-difference learning, that study relied on a pre-defined partitioning of the environment into discrete states, each encoded as the firing rate of a disjoint set of neurons. A more likely scenario is that neurons take part in encoding multiple different states through their spike patterns, and that an appropriate partitioning of the environment is learned on the basis of the actions leading to the highest cumulative reward, such that patterns associated with the same actions are classified together. This is equivalent to a reduction in the effective number of states involved.
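For readers unfamiliar with temporal-difference learning, the update mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical five-state chain with a single reward at the end, and the function name and parameters (`td0_chain`, `alpha`, `gamma`) are illustrative choices, not part of the work presented in the talk. The quantity `delta` plays the role that these theories attribute to the dopaminergic error signal.

```python
import numpy as np

def td0_chain(n_states=5, episodes=200, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a deterministic left-to-right chain (illustrative only)."""
    # The extra last entry is the terminal state; its value is never updated and stays 0.
    values = np.zeros(n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            # Assumption: reward of 1 only on the transition into the terminal state.
            reward = 1.0 if s_next == n_states else 0.0
            # TD error: difference between the bootstrapped target and the current estimate.
            delta = reward + gamma * values[s_next] - values[s]
            values[s] += alpha * delta
    return values[:n_states]

values = td0_chain()
print(values)
```

In this toy setting the learned values increase toward the rewarded end of the chain, each approaching the discounted reward `gamma ** k` for a state `k` steps before the final transition.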
Here I present a microcircuit model of the striatum that reproduces experimentally observed activity statistics, and I use its transient high-dimensional dynamics (i.e. liquid state dynamics) for the supervised learning of trajectories on a flat surface, employing only four simple linear readouts. I show performance and generalization scans over scales of cortico-striatal versus intra-striatal synaptic weights for multiple instantiations of the circuit, and compare them against a measure of the circuit's sensitivity to small input perturbations, i.e. chaotic behavior.
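The idea of training linear readouts on the transient dynamics of a recurrent circuit can be sketched as follows. This is not the spiking striatal model of the talk: it substitutes a generic rate-based random reservoir, and every name and number in it (`n_units`, the 0.9 spectral radius, the sinusoidal input, the ridge penalty) is a hypothetical choice. It also assumes, purely for illustration, that the four readouts encode the positive and negative parts of the x and y velocity of a circular trajectory; the announcement does not say what the four readouts represent.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_steps = 200, 500

# Fixed random recurrent, input, and bias weights (only the readouts are trained).
w_rec = rng.normal(size=(n_units, n_units))
w_rec *= 0.9 / np.max(np.abs(np.linalg.eigvals(w_rec)))  # rescale to spectral radius 0.9
w_in = rng.normal(size=(n_units, 2))
bias = rng.normal(size=n_units)

# Input: two sinusoids. Target: velocity on a circle, split into four non-negative channels
# (+x, +y, -x, -y). This four-channel encoding is an assumption of this sketch.
t = np.linspace(0, 4 * np.pi, n_steps)
inputs = np.stack([np.sin(t), np.cos(t)], axis=1)
velocity = np.stack([np.cos(t), -np.sin(t)], axis=1)
targets = np.concatenate([np.maximum(velocity, 0), np.maximum(-velocity, 0)], axis=1)

def run_reservoir(drive):
    """Return the reservoir state at every time step for the given input sequence."""
    states = np.zeros((len(drive), n_units))
    x = np.zeros(n_units)
    for i, u in enumerate(drive):
        x = np.tanh(w_rec @ x + w_in @ u + bias)
        states[i] = x
    return states

states = run_reservoir(inputs)

# Four linear readouts fitted jointly by ridge regression on the states plus a constant column.
features = np.hstack([states, np.ones((n_steps, 1))])
ridge = 1e-3
w_out = np.linalg.solve(
    features.T @ features + ridge * np.eye(features.shape[1]),
    features.T @ targets,
)
predictions = features @ w_out
mse = np.mean((predictions - targets) ** 2)

# Crude sensitivity probe: perturb the first input slightly and compare the final states.
perturbed = inputs.copy()
perturbed[0] += 1e-6
divergence = np.linalg.norm(run_reservoir(perturbed)[-1] - states[-1])
print(mse, divergence)
```

Scaling `w_in` against `w_rec` in a sketch like this is loosely analogous to the scan over cortico-striatal versus intra-striatal weights described above, and `divergence` is only a rough stand-in for a proper measure of chaotic behavior.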