Measurement of visual memory is not always confounded by verbalization

If you study visual memory, you can bet that you will encounter skepticism about whether your participants really remembered visual images. Maybe they only remembered verbal labels corresponding to your stimuli. Certainly, there are many published studies of “visual” memory that merit skepticism on this point. In studies of verbal memory, participants can convey the phonemes they are thinking of with high fidelity via speech. Visual memories, by contrast, can only be communicated indirectly, through speech or gesture. You cannot directly show someone else what the mental image you remember looks like. As a result, it is often unclear in visual memory research whether we are measuring memory based on a representation of the visual image itself or on a representation of how we intend to communicate what something looked like. These are two completely different things.

One method that minimizes these concerns is visual change detection. Visual change detection paradigms use arbitrary and unique arrangements of colored shapes as stimuli. Participants view the stimuli very briefly and then indicate, by pressing a button, whether a test item has changed or not. This plausibly minimizes dependence on verbal labels for several reasons. Simply remembering all the color or shape names would not reliably lead to correct responses, because it is also important to know where the items were situated. Exposures are too brief for all the components (plus some spatial cue) to be named. Responses are not spoken, which should further discourage verbal labeling. And if you’re still wary of the possibility of verbal labeling, you can make participants engage in articulatory suppression, where they repeat some irrelevant and easy-to-remember sequence (e.g., “super, super, super”, “the, the, the”) during the task to further discourage verbalization.
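To make the point about locations concrete, here is a minimal sketch of a single change detection trial. It is an illustration under stated assumptions, not the procedure from any particular study: the color list, the grid, the function names, and the choice to draw the changed color from another item in the same display are all hypothetical. Under that last assumption, an observer who stored only the color names can never detect a change, while an observer who stored which color was where always can.

```python
import random

# Hypothetical stimulus set; real studies choose their own colors and shapes.
COLORS = ["red", "green", "blue", "yellow", "purple", "orange", "cyan", "white"]


def make_trial(set_size, changed, grid_size=4, rng=random):
    """Build one illustrative trial.

    Returns (study, probe_loc, probe_color), where study maps each
    (row, col) location to the color shown there.
    """
    if not 2 <= set_size <= len(COLORS):
        raise ValueError("set_size must be between 2 and the number of colors")
    cells = [(row, col) for row in range(grid_size) for col in range(grid_size)]
    locations = rng.sample(cells, set_size)  # unique locations
    colors = rng.sample(COLORS, set_size)    # unique colors
    study = dict(zip(locations, colors))

    probe_loc = rng.choice(locations)
    if changed:
        # Assumption: the new color is borrowed from another studied item,
        # so every color name at test was also present at study.
        others = [color for loc, color in study.items() if loc != probe_loc]
        probe_color = rng.choice(others)
    else:
        probe_color = study[probe_loc]
    return study, probe_loc, probe_color


def label_only_response(study, probe_color):
    """Observer who kept only a list of color names: reports a change
    only if the probe color was absent from the studied display."""
    return probe_color not in study.values()


def location_bound_response(study, probe_loc, probe_color):
    """Observer who remembers which color appeared at which location."""
    return study[probe_loc] != probe_color
```

With an equal mix of change and no-change trials, the label-only observer in this sketch answers “no change” every time and so sits at chance, which is the sense in which color names alone do not suffice.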

Imposing articulatory suppression is a pain for the researchers collecting data. Participants understandably do not like continuously speaking aloud, and researchers have to actually monitor them to ensure they comply. I’ve done my time in this respect. Most of the participants I ran for my MSc and PhD research engaged in articulatory suppression. I’ve spent hundreds of hours listening to research participants chant arbitrary sequences. I’ve commiserated with them about how tiresome it is. Once, I excused a participant after he spontaneously switched to droning a novel sequence of curse words. (I wasn’t offended by the profanity, but I judged that this insubordination reflected a withdrawal of consent.) Sometimes I had a good reason, grounded in hypothesis testing, for imposing articulatory suppression. Usually, though, I decided to impose suppression only on the chance that a peer reviewer might not be persuaded that I had really measured “visual” memory unless participants had been suppressing articulation. I know I’m not the only visual memory researcher who has made this decision.

In a paper in Behavior Research Methods, my colleagues and I (led by Rijksuniversiteit Groningen PhD candidate Florian Sense) have recently shown that this precaution is unnecessary. We compared performance on a typical visual change detection task with performance on a modified task designed to increase the likelihood that participants would try to generate verbal labels. In the modified task, colored items were presented one by one so that participants could more easily verbalize them. Participants completed half the trials silently, which should allow opportunity for labeling, and half with articulatory suppression, which should discourage labeling. If labeling occurs and boosts memory for these stimuli, then we should have observed better performance with sequential presentation and no suppression. However, this wasn’t the case. Our state-trace analyses were designed to detect whether we were measuring one mnemonic process (presumably maintenance of visual imagery) or multiple mnemonic processes (maintenance of imagery plus maintenance of verbal labels). They yielded consistent evidence across all participants favoring the simpler explanation: that a single mnemonic process was measured in all conditions.

I don’t doubt that there are circumstances where verbal labels influence memory for visual imagery, both for better and for worse. That is not in dispute. But the fact that we know this happens in some instances doesn’t make it reasonable to claim that verbal labeling is always probable. When stimuli are unfamiliar and abstract, and exposures are brief, visual change detection paradigms appear to offer a pure measure of memory for visual imagery, uncontaminated by verbal labeling.