Stark Lab

Neurobiology & Behavior



The task formerly known as the Behavioral Pattern Separation Task (BPS-O) has been renamed the Mnemonic Similarity Task (MST); it is the same task. For a full discussion of this name change, go here. Scroll down if you just want to download it and have no appreciation of history …

The MST (for a while also called the SPST – yes, it’s had a few names!) is a behavioral task designed to tax pattern separation. Pattern separation can really only be assessed by looking at representations of information and we clearly can’t do that behaviorally. But, the goal is to have a task that would place strong demands on a system that performed pattern separation and, in so doing, get some measure of this.

The task consists of assessing recognition memory performance for objects using not only the traditional targets and unrelated foils, but also using similar lures (that intentionally vary along several different dimensions). This certainly isn’t a unique concept. Here, however, we have developed the task since its creation (Kirwan & Stark, 2007, Learning & Memory) to create a set of well-matched stimuli that have been tested extensively both in our lab and in others. Note, the behavioral task is an explicit one that asks participants to respond “Old”, “Similar”, or “New” on each trial (we have done “Old” vs. “New” and even “R”, “K”, “N”). Typically, this has been done in a study-test variant, but we have (often while in the scanner) done a continuous version as well (Yassa et al., 2010, NeuroImage).  A good place to turn for some of the behavioral comparisons is the S. Stark et al. (2015, Behavioral Neuroscience) paper.
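
To make the three response options concrete, here is a minimal sketch of how Old/Similar/New responses could be tabulated by trial type and turned into summary scores. The two formulas used (p("Similar"|Lure) minus p("Similar"|Foil) for lure discrimination, and p("Old"|Target) minus p("Old"|Foil) for recognition) are not stated on this page; they are our reading of how these scores are commonly described in the published work, so treat them as an assumption. The function names and data layout are hypothetical and are not taken from the distributed code.

```python
from collections import Counter

RESPONSES = ("old", "similar", "new")

def response_rates(trials):
    """Return p(response | trial type) for a list of (trial_type, response) pairs.

    trial_type is "target", "lure", or "foil"; the result is keyed by
    (response, trial_type).
    """
    trials = list(trials)
    totals = Counter(trial_type for trial_type, _ in trials)
    counts = Counter(trials)
    return {
        (resp, trial_type): counts[(trial_type, resp)] / totals[trial_type]
        for trial_type in totals
        for resp in RESPONSES
    }

def lure_discrimination(rates):
    # Assumed formula: "Similar" responses to lures, corrected for the
    # overall tendency to say "Similar" (measured on unrelated foils).
    return rates[("similar", "lure")] - rates[("similar", "foil")]

def recognition(rates):
    # Assumed formula: "Old" responses to targets, corrected for "Old"
    # responses to unrelated foils.
    return rates[("old", "target")] - rates[("old", "foil")]
```

Given a list such as `[("target", "old"), ("lure", "similar"), ("foil", "new"), ...]`, `response_rates` yields the full 3 x 3 table of rates, from which either score can be read off.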

A related purely implicit version is often used by us in fMRI studies of pattern separation (e.g., Bakker et al., 2008, Science; Lacy et al., 2011, Learning and Memory). The same stimuli are used and one can use their mnemonic similarity to trace out tuning curves in activity as a function of delta-input. But, in this implicit version, there is no behavioral assay of pattern separation performance.

The main behavioral version, however, is the explicit, study-test version. The task takes about 15 minutes to complete and there are several stimulus sets. Within each set, one can run shorter versions.  This allows for repeat testing (a nice feature of the task is that there is little if any evidence for practice effects). The task can run in Matlab/Octave or as a stand-alone application (Mac and Windows). In general, we recommend using the stand-alone one as it’s the most up to date and has several additional features (e.g., easily run independent sub-sets of each list to give even more test-retest opportunities). There’s even a version written in Kivy Python that allows the task to be run on Windows, OS X, Linux, Android, and iOS, although it’s still in development (svn checkout svn://

A note about stimulus sets. The latest versions have Sets 1-6 and Sets C, D, E, and F. The original work (e.g., Kirwan & Stark, 2007; Bakker et al., 2008) used Sets A and B. We shuffled these stimuli based on behavioral work (Joyce Lacy) to create matched sets C and D. So, they’re the same pictures, just shuffled. Sets E and F were created to add more lists and to make things a bit easier. So, they’re not directly comparable to C and D. Performance on them can be converted into C/D performance by transforming the lure-bin axis (see discussion on the Google Group). I’ve even made a spreadsheet to do this (it non-linearly transforms the lure-bin axes). But, a clear goal is to have matched sets and this is where Sets 1-6 come in. We’ve taken C-F and two internally-used sets (unsurprisingly named G and H) and shuffled them again based on p(“Old”|Lure) data from young participants. This gives matched performance across the sets. However, know that they’re the same pictures as used in C-H. So, there are 6 independent lists, not 10 or 12.

In addition, experimenter instructions and video instructions for the participant are included.  These instructions are strongly recommended.


  • Kirwan & Stark (2007), Learning and Memory: Development of the initial task and its use in a high-resolution fMRI experiment on pattern separation
  • Toner, Pirogovsky, Kirwan, & Gilbert (2009), Learning and Memory: Use of this initial version to demonstrate that older participants have a relatively selective deficit in the pattern separation metric
  • Yassa, Lacy, Stark, Albert, Gallagher, & Stark (2010/2011), Hippocampus: Development of the matched stimulus sets and the continuous estimation of the “mnemonic similarity” for each target-lure pair. Demonstration of behavioral impairment in older adults and demonstration of increased DG/CA3 activity in a separation-related contrast in older adults.
  • Yassa, Stark, Bakker, Albert, Gallagher, & Stark (2010), NeuroImage: Using the continuous version, demonstration of a greater impairment in separation for aMCI patients than for age-matched controls. Demonstration of altered DG/CA3 activity at encoding and retrieval in aMCI relative to controls.
  • Lacy, Yassa, Stark, Muftuler, & Stark (2010/11), Learning and Memory: While primarily aimed at demonstrating different transfer functions in DG/CA3 vs. CA1 using the incidental version (and replicating Bakker et al., 2008), the Supplemental Materials have details on the “Mnemonic Similarity” ratings (odds of false alarming for each lure stimulus) used to match the sets. In addition, behavioral experiments showed relatively high confidence for both correct and incorrect Lures and that participants can distinguish even the closest lure pairs in a working memory version. The “Set C” and “Set D” in the task we distribute were derived here (they are a reshuffling of the stimuli in Set A and Set B to have two matched sets).
  • Yassa, Mattfeld, Stark, & Stark (2011), PNAS: Demonstration of a correlation between the separation bias metric and the integrity of the perforant path in older adults and of a correlation between the separation bias metric and a functional measure of separation in the DG/CA3 (based on signals generated in the implicit version of the task).
  • Bakker, Krauss, Albert, Speck, Jones, Stark, Yassa, Bassett, Shelton, & Gallagher (2012), Neuron: Replication of the aMCI impairment on the separation bias metric and the activity differences in the DG/CA3 from Yassa et al., (2010, NeuroImage). Demonstration of the restoration of both behavior and DG/CA3 activity using a low-dose of Levetiracetam.
  • Kirwan, Hartshorn, Stark, Goodrich-Hunsaker, Hopkins, & Stark (2012), Neuropsychologia: Demonstration of a selective impairment in the separation bias metric (vs. traditional recognition memory) in patients with hippocampal damage from anoxia.
  • Stark, Yassa, Lacy, & Stark (2013), Neuropsychologia: Full description of the task and history along with naming it the BPS-O task. Demonstration of a gradual age-related decline and of dissociations using the BPS metric and a recognition memory metric across “aged-impaired”, “aged-unimpaired” and aMCI.
  • Stark, SM, Stevenson, R, Wu, C, Rutledge, S, & Stark CEL (2015), Behavioral Neuroscience: A series of variations on the task, investigating ROC and d-prime analyses with an Old/New response version, continuous vs study/test format, and explicit test instructions prior to the study task. We demonstrate repeatedly that the age-related deficit in lure discrimination is robust to these task variations. It also shows that shortened versions can be used and that practice effects are not large concerns.
  • Stark SM & Stark CEL (2017), Behavioural Brain Research: A comparison of objects vs scenes as stimuli in the MST. We found that while there was an age-related impairment on lure discrimination for both objects and scenes, relationships with brain volumes and other measures of memory performance were stronger for objects.

The MST (BPSO) has always been available to researchers who have asked for our stimuli. Previously, we have provided basic instructions, the stimuli, and Matlab (or Octave) code written using PsychToolbox. While useful for some, there has been a relatively high cost of entry to use the task. We now provide more resources, including stand-alone applications for both Macs and Windows PCs. These are written in C++ (wxWidgets) and have many features not found in the older Matlab code.  In addition, there is currently a version written in Kivy Python that can be run not only on Macs, Windows, and Linux desktops but also on Android and iOS.  It is not as feature-rich yet, but is under development.

To stay up to date and to participate in the discussion of the MST, please join the Google Group.


  • GitHub repository with C++ and PsychoPy source code, stimuli, instructions, etc.
  • Generic directory with everything (New location)
  • Stand-alone Mac version (v 0.96): All experimental code and stimuli. Requires OS X 10.7.
  • Stand-alone Windows version (v 0.96): All experimental code and stimuli. Requires Win7 or greater.
  • Want the stimuli?  Sets 1-6 are in the generic directory.  All of them are packaged in the stand-alone (Mac: Right-click, Show package contents; Win: Navigate to the install directory).
  • (older) Matlab version: All experimental code and stimuli. Requires PsychToolbox and Matlab (or Octave) – we use variants of this in the scanner but it’s not nearly so user-friendly
  • Video instructions: Video of instructions to be played for the test phase. Walks participants through the old/similar/new task with samples.  Several other variants of videos and the PowerPoints used to make the videos are available within the GitHub repository.
  • MS_ONS_Instructs: Word document with instructions for experimenters (see also Help in stand-alone applications)

Version history (Stand-alone)

0.96 (1/12/18) Mac version, Windows version

  • Fixed issue whereby the space bar would lead to responses of 11
  • Ditched the bogus da computation and shifted to accurate d’ calculation in two-choice
  • Added ‘Scenes C’ — Note the lure bins here are entirely bogus and this is not really for general consumption
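
For readers who want to check the two-choice (Old/New) output by hand, here is a minimal sketch of a d' calculation. The notes above do not say how the application computes it, so this uses the standard equal-variance formula, d' = z(hit rate) - z(false-alarm rate). The log-linear correction for rates of 0 or 1 is our assumption and may differ from what the application does; the function name is hypothetical.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Equal-variance d' from raw counts in an Old/New task.

    Assumption: a log-linear correction (add 0.5 to each count and 1 to
    each total) keeps the rates away from 0 and 1, where z is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

Here a "hit" is an "Old" response to a repeated item and a "false alarm" is an "Old" response to an item that was not studied; which items count as the non-studied class (foils only, or lures as well) is a choice the analyst has to make.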

0.94 (3/24/17) Mac version (re-uploaded 8/9/17), Windows version

  • Shuffled several stimuli across Sets 1-6 as it became clear that there was within-set overlap.  Swaps were all at the same “bin level”
  • Mac packaging now in .pkg format and shouldn’t complain about being from an “unidentified developer”
  • Windows version compiled on a different machine (VC 2010 based) – let me know if there are issues.

0.93 (10/6/16) Mac version, Windows version

  • Italian translation added (Nicola Cellini)

0.92 (9/19/16) Mac version, Windows version

  • Custom “JS” mode added (F10 and restart to toggle). Allows for testing half the items in one session and the other half later
  • Fixed bug where Set C was not available.
  • Fixed bug with 40 item version
  • Fixed bug in saving / loading custom keys

0.91 (4/20/16) Mac version, Windows version

  • Added Sets 1-6 (reshuffles of C-H)
  • Old-new response option that calculates da (experimental – please test)
  • Cleaned up parameter window
  • (Windows 0.91 only): Can start the experiment with a mouse-click (useful on tablets)

0.82 (3/22/16) Mac version, Windows version

  • Windows touch-screen support for response buttons
  • French, German, and Spanish translations (thanks Christine Bastin!)
  • Better able to deal with high-DPI monitors

0.8 (9/15/14) Mac version, Windows version

  • Renamed MST (Mnemonic Similarity Task – used in pubs prior to Stark et al., 2013).
  • Sets E and F added – Solid sets, but may not be perfectly matched to C and D.  May be, but may not be.
  • 40-item version added back in
  • Code signing compatible with OS X Yosemite

0.7 (8/22/13)

  • Randomization parameter added to dialog to let you control how the randomization of stimuli to Targ/Lure/Foil lists happens. Old behavior (based on ID) and several fixed “seeds” as options.
  • Internationalization code in place. Currently, Chinese and Swedish options available. Others just need someone to help translate a few phrases.
  • Can now use subsets of shorter lists. So, if you use 20 items, you have 3 sublist choices and if you use 32 items you have 2 sublist choices. This lets you get more independent runs per participant, so long as you’re using a shorter version.
  • On the main logo screen, you can now test the response buttons: “Resp #” is shown briefly in the status bar after a button is pressed.
  • Can now customize the response key options (allowing for any of several keys to be used for each response)
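
The sublist counts above (3 choices at 20 items, 2 at 32) are consistent with dividing the default 64 stimuli per condition into non-overlapping chunks. The sketch below illustrates that arithmetic. Carving the list into contiguous slices is an assumption made for illustration; the application may select sublist members differently, and the function names are hypothetical.

```python
FULL_LIST_LENGTH = 64  # default number of stimuli per condition

def n_sublists(items_per_condition, full_length=FULL_LIST_LENGTH):
    """Number of non-overlapping sublists of the requested length."""
    return full_length // items_per_condition

def sublist_positions(items_per_condition, which, full_length=FULL_LIST_LENGTH):
    """Zero-based list positions used by sublist number `which` (0-based).

    Assumption: sublists are contiguous, non-overlapping slices.
    """
    if not 0 <= which < n_sublists(items_per_condition, full_length):
        raise ValueError("no such sublist for this list length")
    start = which * items_per_condition
    return list(range(start, start + items_per_condition))
```

With 20 items per condition, the three sublists cover positions 0-19, 20-39, and 40-59, leaving the last four positions unused in this scheme.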

0.6 (9/11/12)

  • Reworked the way stimuli were assigned to target, repeat, and lure conditions. Random assignment based on subject ID ensuring that there is an even distribution across lure bins. (Gone is the multiple-of-6 blocking using 1-64, 65-128, and 129-192 as blocks of stimuli assigned to these conditions).
  • Now able to be run in shorter versions. The default is still 64 stimuli per condition but as few as 8 are now possible. This is for testing purposes to determine how short the task can go. Eventually, if shorter versions are used so that many runs can be done in an individual, we’ll need to ensure that the stimuli don’t repeat. This isn’t in place yet.
  • Final version of the bin files (SetC bins.txt and SetD bins.txt) in place. Prior one was well correlated with these but not the final assignment of stimuli to bins we’ve used in publications.
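
As an illustration of the first point, random assignment keyed to the subject ID with an even spread across lure bins can be sketched as a stratified shuffle. This is not the application's actual code: the condition labels, the use of the subject ID as the random seed, and the round-robin dealing within each bin are all assumptions, and `assign_conditions` is a hypothetical name.

```python
import random
from collections import defaultdict

def assign_conditions(lure_bins, subject_id, conditions=("target", "lure", "foil")):
    """Assign each stimulus to a condition, balanced within each lure bin.

    lure_bins maps stimulus number -> lure bin. Seeding with the subject ID
    (an assumption) makes the assignment reproducible for that subject.
    """
    rng = random.Random(subject_id)
    by_bin = defaultdict(list)
    for stim in sorted(lure_bins):
        by_bin[lure_bins[stim]].append(stim)

    assignment = {}
    for bin_id in sorted(by_bin):
        stims = by_bin[bin_id]
        rng.shuffle(stims)
        # Deal the shuffled stimuli round-robin so that every condition
        # gets (as nearly as possible) the same number from this bin.
        for i, stim in enumerate(stims):
            assignment[stim] = conditions[i % len(conditions)]
    return assignment
```

Because the shuffle happens separately within each bin, no condition can end up with a disproportionate share of the easiest or hardest lures, which is the property the old fixed blocks of stimuli could not guarantee.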

0.5 (9/6/12)

  • Fixed major bug: The lure bins were not being properly read and applied. Set D’s lure bins were being read improperly (Set C’s were read in place of Set D’s) and, in both sets, the matching of item to lure bin was off by 1 in the list (0-indexed vs. 1-indexed). Calculation of the BPS-O metric is unaffected by this bug, but any existing data in which the lure bin info is needed (e.g., tracing out a tuning curve for a subject) will need to be recalculated manually, as the by-lure-bin summary table at the end will be incorrect. The Matlab code is not affected here – only this beta-level stand-alone code has this bug.

0.4 (8/6/12)

  • Renamed to BPSO (from SPST)
  • Summary at the end will correct for no-responses (as well as show rates raw)
  • Fixed bug that could cause crash on exit
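
One plausible reading of the no-response correction is sketched below: raw rates divide by every trial presented, while corrected rates divide only by trials on which a response was made. That interpretation is our assumption, not a description of the application's code, and the function name is hypothetical.

```python
def rates_for_trial_type(n_old, n_similar, n_new, n_no_response):
    """Return (raw, corrected) response rates for one trial type.

    raw:       count / all trials presented
    corrected: count / trials that received a response (assumed meaning
               of "correct for no-responses"); None if nobody responded.
    """
    counts = {"old": n_old, "similar": n_similar, "new": n_new}
    responded = sum(counts.values())
    presented = responded + n_no_response
    raw = {resp: n / presented for resp, n in counts.items()}
    corrected = (
        {resp: n / responded for resp, n in counts.items()} if responded else None
    )
    return raw, corrected
```

For example, 30 "Old" responses out of 64 target trials with 4 missed responses gives a raw rate of 30/64 but a corrected rate of 30/60.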

0.3 (3/8/12)

  • A “self-paced” option is now available. The ISI will be at least what is specified but will extend indefinitely until a response is made when this is selected. So, the stimulus is not on the screen any longer, but we can give extra time for people to respond.
  • Response options extended to allow for larger hands. Now V, C, 1 = 1; B, 2 = 2; N, M, 3 = 3
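
The key layout above amounts to a many-to-one lookup from keys to response numbers. A minimal sketch follows; the case-insensitive matching is an assumption, and the names are hypothetical rather than taken from the application.

```python
# Several keys map onto each of the three response numbers.
KEY_TO_RESPONSE = {
    "v": 1, "c": 1, "1": 1,
    "b": 2, "2": 2,
    "n": 3, "m": 3, "3": 3,
}

def response_for_key(key):
    """Return the response number for a key press, or None if unmapped."""
    return KEY_TO_RESPONSE.get(key.lower())
```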

0.2 (11/30/11)

  • Fixed bug leading to multiple presentations of each trial and improved summary statistic logging

0.1 (11/18/11)

  • Initial beta release