Robert Lindsey

University of Colorado at Boulder
Department of Computer Science


In 2014, I completed my PhD in Computer Science at the University of Colorado under the supervision of Michael Mozer. My research interests lie at the intersection of machine learning and cognitive science. My PhD research focused on using probabilistic models to build smarter educational software. I'm currently the chief scientist at Imagen Technologies, a computer vision startup working to revolutionize healthcare by making the expertise of world-leading doctors available to everyone.

Click here for my CV.
Google Scholar

You can reach me by email at firstname[dot]

2008 - 2014

Ph.D., Computer Science
University of Colorado at Boulder
Research Adviser: Michael Mozer

2005 - 2008

B.S., Dual Major: Computer Science and Philosophy
Rensselaer Polytechnic Institute
Summa Cum Laude
Research Adviser: Wayne Gray



Mozer, M. C., & Lindsey, R. V. (2016). Predicting and improving memory retention: Psychological theory matters in the big data era. To appear in M. Jones (Ed.), Big Data in Cognitive Science. Taylor & Francis. [PDF]

Khajah, M., Lindsey, R., & Mozer, M. C. (2016). How deep is knowledge tracing? To appear in Proceedings of the Ninth International Conference on Educational Data Mining. Educational Data Mining Society Press. [Awarded Best Overall Paper at EDM 2016] [PDF] [Github]

Khajah, M., Roads, B., Lindsey, R., Liu, Y., & Mozer, M. C. (2016). Designing engaging games using Bayesian optimization. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 5571-5582). New York: ACM. [PDF]

Lindsey, R., Khajah, M., & Mozer, M. C. (2014). Automatic discovery of cognitive skills to improve the prediction of student learning. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 1386-1394). La Jolla, CA: Curran Associates Inc. [PDF] [Github]

Kang, S. H. K., Lindsey, R., Mozer, M. C., & Pashler, H. (2014). Retrieval practice over the long term: Should spacing be expanding or equal-interval? Psychonomic Bulletin & Review, 21, 1544-1550. [PDF]

Lindsey, R., Shroyer, J. D., Pashler, H., & Mozer, M. C. (2014). Improving students' long-term knowledge retention through personalized review. Psychological Science, 25, 639-647. [PDF]

Khajah, M., Lindsey, R., & Mozer, M. C. (2014). Maximizing students' retention via spaced review: Practical guidance from computational models of memory. Topics in Cognitive Science, 6, 157-169. [PDF]

Khajah, M., Wing, R. M., Lindsey, R. V., & Mozer, M. C. (2014). Incorporating latent factors into knowledge tracing to predict individual differences in learning. In J. Stamper, Z. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining (pp. 99-106). Educational Data Mining Society Press. [Awarded Best Overall Paper at EDM 2014] [PDF]

Lindsey, R., Mozer, M. C., Huggins, W. J., & Pashler, H. (2013). Optimizing instructional policies. In C.J.C. Burges et al. (Eds.), Advances in Neural Information Processing Systems 26. La Jolla, CA: NIPS Foundation. [PDF]

Khajah, M., Lindsey, R., & Mozer, M. C. (2013). Maximizing students' retention via spaced review: Practical guidance from computational models of memory. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Conference of the Cognitive Science Society (pp. 758-763). Austin, TX: Cognitive Science Society. [Awarded the Cognitive Science Society Computational Modeling Prize] [PDF]

Lindsey, R., Headden, W. P., & Stipicevic, M. J. (2012). A phrase-discovering topic model using Pitman-Yor processes. In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing.

Mozer, M. C., Pashler, H., Wilder, M., Lindsey, R., Jones, M. C., & Jones, M. N. (2010). Decontaminating human judgments to remove sequential dependencies. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 23 (pp. 1705-1713). La Jolla, CA: NIPS Foundation. [PDF]

Lindsey, R., Lewis, O., Pashler, H., & Mozer, M. C. (2010). Predicting students' retention of facts from feedback during training. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society. [PDF]

Mozer, M. C., Pashler, H., Cepeda, N., Lindsey, R., & Vul, E. (2009). Predicting the optimal spacing of study: A multiscale context model of memory. In Y. Bengio, D. Schuurmans, J. Lafferty, C.K.I. Williams, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 22 (pp. 1321–1329). La Jolla, CA: NIPS Foundation. [PDF]

Lindsey, R., Mozer, M., Cepeda, N. J., & Pashler, H. (2009). Optimizing memory retention with cognitive models. In A. Howes, D. Peebles, & R. Cooper (Eds.), Proceedings of the 9th International Conference on Cognitive Modeling (ICCM 2009). Manchester, UK. [PDF]

Lindsey, R., Stipicevic, M., Veksler, V.D., & Gray, W.D. (2008). BLOSSOM: Best path Length on a Semantic Self-Organizing Map. In B. C. Love, K. McRae, & V. M. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 481-487). Austin, TX: Cognitive Science Society. [PDF]

Lindsey, R., Veksler, V. D., Grintsvayg, A., & Gray, W. D. (2007). Effects of Corpus Selection on Measuring Semantic Relatedness. Proceedings of the 8th International Conference on Cognitive Modeling, Ann Arbor, MI. [PDF]

Grintsvayg, A., Veksler, V. D., Lindsey, R., & Gray, W. D. (2007). Vector Generation from an Explicitly-defined Multidimensional Space. Proceedings of the 8th International Conference on Cognitive Modeling, Ann Arbor, MI.

Veksler, V. D., Grintsvayg, A., Lindsey, R., & Gray, W. D. (2007). A proxy for all your semantic needs. Proceedings of the 29th Annual Meeting of the Cognitive Science Society, Nashville, TN.

Working Papers

Lindsey, R., Mozer, M. C., & Pashler, H. Predicting individual differences in student learning via collaborative filtering.

Lindsey, R., Polsdofer, E., Mozer, M. C., Kang, S. H. K., & Pashler, H. Long-term recency is nothing more than ordinary forgetting. [PDF]

Mozer, M. C., Pashler, H., Lindsey, R., & Jones, J. Efficient training of visual search via attentional highlighting.

Recent Research Projects

Optimization of Learning via Cognitive Modeling

This project uses computational models of human memory to schedule study. Just as physical processes like the weather can be modeled and forecast, so can cognitive processes: a computer model of human memory can "forecast" memory. For example, it is possible to predict with reasonable accuracy what percentage of facts a student will recall at a future date based on when and for how long the student previously studied.

Because we can predict a student's test performance as a function of his or her study schedule (i.e., when the material was studied), we can also predict which study schedule will maximize that performance. We are working to incorporate methods for predicting optimal study schedules into tutoring software, thereby improving students' acquisition and retention of study material (above and beyond the improvement existing tutoring software provides).
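As a rough illustration of the forecasting idea, the sketch below assumes a simple power-law forgetting curve; the functional form, parameter values, and candidate schedules are illustrative assumptions rather than the models actually used in this project.

import math

# Toy memory forecast: predict the probability that a student recalls an item
# at a future test, given the days on which it was studied, assuming each
# study episode leaves a trace that decays as a power law of elapsed time.
def predicted_recall(study_days, test_day, decay=0.3):
    strength = sum((1.0 + test_day - d) ** -decay
                   for d in study_days if d <= test_day)
    return 1.0 - math.exp(-strength)   # squash total strength into [0, 1]

# Compare two candidate schedules (same number of sessions) for a test on day 60.
for name, days in [("massed (all on day 0)", [0, 0, 0]),
                   ("spaced (days 0, 7, 14)", [0, 7, 14])]:
    print(f"{name}: predicted recall on day 60 = {predicted_recall(days, 60):.2f}")

A scheduler built on such a forecast can search over feasible study schedules and select the one with the highest predicted recall at the target retention date.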

Control of Visual Attention

People have little trouble following instructions. For example, if you are asked to "search the room for a set of keys", you can readily perform the task. How is it that we can configure and reconfigure our visual and attentional systems to perform a wide variety of arbitrary tasks? And how do we become more efficient with experience performing a task? To answer these questions, we require an understanding of the mechanisms underlying the control of visual attention.

My contribution to this project is an experiment using an 'invisible target' search task to test whether we can train individuals more efficiently to look in the appropriate location. Training is time-consuming in many complex visual domains (e.g., fingerprint matching, flying a plane, controlling air traffic, screening baggage) where an individual must learn where to look in an image given contextual information.

Long Term Recency Effects

When tested on a list of items, individuals show a recency effect: the more recently a list item was presented, the more likely it is to be recalled. For short interpresentation intervals (IPIs) and retention intervals (RIs), this effect may be attributable to working memory. However, recency effects also occur over long timescales, where IPIs and RIs stretch into weeks and months. These long-term recency (LTR) effects have intrigued researchers because of their scale-invariant properties and the sense that understanding the mechanisms of LTR will provide insight into the fundamental nature of memory.

An early explanation of LTR posited that it is a consequence of memory trace decay, but this decay hypothesis was discarded in part because LTR was not observed in continuous-distractor recognition memory tasks (Glenberg & Kraus, 1981; Bjork & Whitten, 1974; Poltrock & MacLeod, 1977). Since then, a diverse collection of elaborate mechanistic accounts of LTR has been proposed. In this work, we revive the decay hypothesis. Starting from the uncontroversial assumption that forgetting follows a power-law function of time, we argue not only that the decay hypothesis is a sufficient qualitative explanation of LTR, but also that it yields excellent quantitative predictions of LTR strength as a function of list size, test type, IPI, and RI. Through fits to a simple model, this project aims to bring resolution to the subject of LTR by arguing that LTR is nothing more than ordinary forgetting.
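The decay-hypothesis argument can be illustrated with a toy computation: if forgetting follows a power law of elapsed time, a recency gradient over serial positions falls out directly, and the shape of that gradient depends on the ratio of RI to IPI rather than on the absolute timescale. The decay exponent and timings below are illustrative assumptions, not fitted values.

# Relative memory strength of each serial position at test, under power-law
# decay of the time elapsed since that item was presented.
def relative_strength(n_items, ipi, ri, decay=0.5):
    return [(ri + (n_items - pos) * ipi) ** -decay
            for pos in range(1, n_items + 1)]

short = relative_strength(n_items=6, ipi=1, ri=2)    # e.g., units of minutes
long_ = relative_strength(n_items=6, ipi=7, ri=14)   # e.g., units of days; same RI/IPI ratio

for pos, (s, l) in enumerate(zip(short, long_), start=1):
    # Normalizing each curve by its most recent item reveals the same relative
    # recency gradient at both timescales (scale invariance).
    print(f"position {pos}:  short {s / short[-1]:.2f}   long {l / long_[-1]:.2f}")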

Multiscale Context Model of Memory

The way we study material influences how well we retain it. Psychologists have established that spaced practice leads to better retention of declarative knowledge (facts such as foreign language vocabulary) than massed practice. However, the exact relationship between spacing and retention depends significantly on how long the material must be retained. We are exploring existing and novel computational models to explain a range of data on massed versus spaced practice.

We have developed a Multiscale Context Model (MCM) that predicts the optimal spacing of study in a wide range of experiments, over retention intervals ranging from minutes to a year. The model makes surprising and counterintuitive predictions that we are currently testing.

In other projects, we are developing mechanistic accounts of other phenomena surrounding the conditions under which individuals are likely to learn and retain material. For example, when an individual is tested and then told the correct answer, learning is better than when the individual merely studies the material. We explain this finding with an error-correction learning account in which a better error signal is obtained when the individual produces a response, even an incorrect one. We have also addressed the counterintuitive result that individuals who are willing to guess an answer, even when they guess wrong, learn the material with less study than individuals who are unwilling to venture a guess. We have developed a model that links the strength of learning to willingness to guess (or confidence in a guess).
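As a cartoon of the error-correction idea (not the published account or its actual parameterization), attempting to retrieve an answer produces an explicit error signal, so the update can be made proportional to how wrong the learner was, whereas restudy without retrieval provides no such signal. The update rules and rates below are arbitrary illustrative choices.

def study_only(strength, rate=0.15):
    # Restudying without retrieval: a fixed, error-blind increment.
    return min(1.0, strength + rate)

def test_then_study(strength, rate=0.5):
    # Attempted retrieval exposes the error (1 - strength); the update is
    # proportional to it, as in delta-rule / error-correction learning.
    return strength + rate * (1.0 - strength)

for label, update in [("study only", study_only),
                      ("test + feedback", test_then_study)]:
    s = 0.2                            # a weakly learned item
    for _ in range(3):                 # three practice opportunities
        s = update(s)
    print(f"{label:>16}: strength after practice = {s:.2f}")

The specific numbers are immaterial; the point is the shape of the update, which is largest exactly for the items known least well when an error signal is available.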

Decontamination of Sequential Effects

For over half a century, psychologists have been struck by how poor people are at expressing their internal sensations, impressions, and evaluations via rating scales. When individuals make judgments, they are incapable of using an absolute rating scale, and instead rely on reference points from recent experience. This relativity of judgment limits the usefulness of responses provided by individuals to surveys, questionnaires, and evaluation forms. Fortunately, the cognitive processes that transform internal states to responses are not simply noisy, but rather are influenced by recent experience in a lawful manner.

We explore techniques to remove sequential dependencies and thereby decontaminate a series of ratings, obtaining more meaningful human judgments. In our formulation, decontamination is fundamentally a problem of inferring latent states (internal sensations) which, because of the relativity of judgment, have temporal dependencies. We propose a decontamination solution using a conditional random field with constraints motivated by psychological theories of relative judgment. Our exploration of decontamination models is supported by two experiments we conducted to obtain ground-truth rating data on a simple length-estimation task. Our decontamination techniques yield a more than 20% reduction in the error of human judgments.
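A much simpler stand-in for the CRF can convey the decontamination idea: assume each observed rating mixes the true internal sensation with the previous rating, so that knowing (or estimating) the carryover weight lets the contamination be inverted. The generative model, carryover weight, and noise level below are illustrative assumptions, not the model used in this work.

import random

random.seed(0)
LAMBDA = 0.35   # assumed carryover from the previous rating

# Synthetic "true" sensations and sequentially contaminated observed ratings.
true = [random.uniform(0, 10) for _ in range(200)]
obs = []
for t, s in enumerate(true):
    prev = obs[t - 1] if t > 0 else s
    obs.append((1 - LAMBDA) * s + LAMBDA * prev + random.gauss(0, 0.2))

# Decontamination: invert the assumed generative model.
decon = [obs[0]] + [(obs[t] - LAMBDA * obs[t - 1]) / (1 - LAMBDA)
                    for t in range(1, len(obs))]

def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

print(f"error of raw ratings:            {rmse(obs, true):.3f}")
print(f"error of decontaminated ratings: {rmse(decon, true):.3f}")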

Our current work involves expanding this approach from concrete perceptual judgments (e.g., line length) to more abstract judgments (e.g., affective judgment of images or art).


For information about consulting, see Boulder Analytics.