- RESEARCH BRIEFINGS
- 18 June 2025
The activity of neuronal cells that release the neurotransmitter dopamine is thought to encode differences between predicted and actual rewards. This ‘prediction error’ is key to reinforcement learning. It emerges that dopamine neurons differ in their timescales for temporal discounting, enabling the brain to implement sophisticated reinforcement-learning algorithms along multiple timescales.
This is a summary of: Masset, P. et al. Multi-timescale reinforcement learning in the brain. Nature 642, 682–690 (2025).
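The idea of a prediction error computed under several discounting timescales can be illustrated with a minimal sketch. It assumes exponential discounting and tabular temporal-difference (TD) learning on a toy cue-delay-reward task; the function name `td_learn`, the task layout and all parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
def td_learn(gamma, n_steps=5, reward=1.0, alpha=0.1, n_trials=2000):
    """Tabular TD(0) on a deterministic chain of states (illustrative assumption).

    State 0 is the cue; the reward arrives on leaving the last state.
    gamma is the discount factor of one hypothetical dopamine-like unit.
    """
    # One extra entry for the terminal state, whose value stays at zero.
    values = [0.0] * (n_steps + 1)
    for _ in range(n_trials):
        for s in range(n_steps):
            r = reward if s == n_steps - 1 else 0.0
            # Prediction error: actual outcome minus predicted outcome.
            delta = r + gamma * values[s + 1] - values[s]
            values[s] += alpha * delta
    return values[:n_steps]


# Three units with different discounting timescales (assumed values).
gammas = [0.6, 0.8, 0.95]
cue_values = {g: td_learn(g)[0] for g in gammas}

for g in gammas:
    print(f"gamma={g}: learned value at cue = {cue_values[g]:.4f}")
```

In this sketch, units with a discount factor closer to 1 assign a higher value to the cue because they discount the delayed reward less steeply, so the set of learned values carries information about when the reward is expected as well as how large it is.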
doi: https://doi.org/10.1038/d41586-025-01867-6
‘Expert opinion’ is published under a CC BY 4.0 licence.
