Spatial constraints on learning in visual search: Modeling contextual cueing.
Predictive visual context facilitates visual search, a benefit termed contextual cueing (M. M. Chun & Y. Jiang, 1998). In the original task, search arrays were repeated across blocks such that the spatial configuration (context) of all of the distractors in a display predicted an embedded target location. We modeled existing results using a connectionist architecture, and then designed new behavioral experiments to test the model's assumptions. The modeling and behavioral results indicate that learning may be restricted to the local context even when the entire configuration is predictive of target location. Local learning constrains how much guidance is produced by contextual cueing. The modeling and new data also demonstrate that local learning requires that the local context maintain its location in the overall global context.
The contents of perceptual hypotheses: evidence from rapid resumption of interrupted visual search.
It is possible to resume a previously interrupted visual search trial significantly faster than starting a new search trial (Lleras, Rensink, & Enns, 2005). This rapid resumption of search is possible because evidence accumulated during the previous exposure can carry over to a subsequent presentation. We present four interrupted visual search experiments that characterize the content of accumulated memory within visual search trials. These experiments reveal that prior to explicit target identification, subjects have accumulated evidence about the location of a subset of task-relevant distractors, but not their identity, as well as sub-response-threshold evidence for the identity of the target. Our results characterize the content of accumulated memory within search trials and highlight the utility of interrupted search for studying online search processing prior to target identification.
Statistical learning using real-world scenes: extracting categorical regularities without conscious intent.
Recent work has shown that observers can parse streams of syllables, tones, or visual shapes and learn statistical regularities in them without conscious intent (e.g., learn that A is always followed by B). Here, we demonstrate that these statistical-learning mechanisms can operate at an abstract, conceptual level. In Experiments 1 and 2, observers incidentally learned which semantic categories of natural scenes covaried (e.g., kitchen scenes were always followed by forest scenes). In Experiments 3 and 4, category learning with images of scenes transferred to words that represented the categories. In each experiment, the category of the scenes was irrelevant to the task. Together, these results suggest that statistical-learning mechanisms can operate at a categorical level, enabling generalization of learned regularities using existing conceptual knowledge. Such mechanisms may guide learning in domains as disparate as the acquisition of causal knowledge and the development of cognitive maps from environmental exploration.
Visual long-term memory has a massive storage capacity for object details.
One of the major lessons of memory research has been that human memory is fallible, imprecise, and subject to interference. Thus, although observers can remember thousands of images, it is widely assumed that these memories lack detail. Contrary to this assumption, here we show that long-term memory is capable of storing a massive number of objects with details from the image. Participants viewed pictures of 2500 objects over the course of 5.5 hours. Afterwards, they were shown pairs of images, and indicated which of the two they had seen. The previously viewed item could be paired with either an object from a novel category, an object of the same basic level category, or the same object in a different state or pose. Performance in each of these conditions was remarkably high (92%, 88%, 87%, respectively), suggesting participants successfully maintained detailed representations of thousands of images. These results have implications for cognitive models in which capacity limitations impose a primary computational constraint (e.g., models of object recognition), and pose a challenge to neural models of memory storage and retrieval, which must be able to account for such a large and detailed storage capacity.
Efficient Coding in Visual Short-Term Memory: Evidence for an Information-Limited Capacity.
Previous work on visual short-term memory (VSTM) capacity has typically used patches of color or simple features which are drawn from a uniform distribution, and estimated the capacity of VSTM to be 3-4 items (Luck & Vogel, 1997). Here, we introduce covariance information between colors, and ask if VSTM can take advantage of this redundancy to form a more efficient representation of the displays. We find that observers can successfully remember 5 colors on these displays, significantly higher than the 3 colors remembered when the displays were changed to be uniformly distributed in the final block of the experiment. We suggest that quantifying capacity in terms of number of objects remembered fails to capture factors such as object complexity or statistical redundancy, and that information theoretic measures are better suited to characterizing the capacity of VSTM. We use Huffman coding to model our data, and demonstrate that the data are consistent with a fixed VSTM capacity in bits rather than in terms of number of objects.
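As a rough illustration of why a capacity fixed in bits lets observers hold more items when colors covary, the sketch below builds a Huffman code over color pairs and compares the average bits per pair for a uniform and a skewed distribution. The pair names, probabilities, and the 6-bit budget are hypothetical values chosen for the example, not figures from the paper.

```python
import heapq
from itertools import count


def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node gains one bit of code length.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), syms1 + syms2))
    return lengths


def expected_bits(probs):
    """Average number of bits needed to encode one symbol drawn from probs."""
    lengths = huffman_code_lengths(probs)
    return sum(p * lengths[sym] for sym, p in probs.items())


# Hypothetical color-pair distributions (illustrative only).
uniform = {"pair_a": 0.25, "pair_b": 0.25, "pair_c": 0.25, "pair_d": 0.25}
skewed = {"pair_a": 0.5, "pair_b": 0.25, "pair_c": 0.125, "pair_d": 0.125}

budget_bits = 6.0  # assumed fixed capacity, in bits
for name, dist in (("uniform", uniform), ("skewed", skewed)):
    bits = expected_bits(dist)
    print(f"{name}: {bits:.2f} bits per pair, {budget_bits / bits:.2f} pairs in budget")
```

Under these assumed numbers the skewed distribution costs fewer bits per pair, so the same bit budget covers more pairs, which is the qualitative pattern the abstract describes.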
Detecting changes in real-world objects: The relationship between visual long-term memory and change blindness
A large body of literature has shown that observers often fail to notice significant changes in visual scenes, even when these changes happen right in front of their eyes. For instance, people often fail to notice if their conversation partner is switched to another person, or if large background objects suddenly disappear [1,2]. These 'change blindness' studies have led to the inference that the amount of information we remember about each item in a visual scene may be quite low [1]. However, in recent work we have demonstrated that long-term memory is capable of storing a massive number of visual objects with significant detail about each item [3]. In the present paper we attempt to reconcile these findings by demonstrating that observers do not experience 'change blindness' with the real world objects used in our previous experiment if they are given sufficient time to encode each item. Our results (see also refs. 4 and 5) suggest that one of the major causes of change blindness for real-world objects is a lack of encoding time or attention to each object.
Entrainment to music requires vocal mimicry: Evidence from non-human animals
The human music capacity consists of certain core phenomena, including the tendency to entrain, or align movement, to an external auditory pulse [1-3]. This ability, fundamental both for music production and for coordinated dance, is repeatedly highlighted as uniquely human [4-11]. However, it has recently been hypothesized that entrainment evolved as the byproduct of vocal mimicry, generating the strong prediction that only vocal mimicking animals may be able to entrain [12,13]. We provide comparative data demonstrating the existence of two proficient vocal mimicking non-human animals (parrots) that entrain to music, spontaneously producing synchronized movements resembling human dance. We also provide an extensive comparative dataset from a global video database, systematically analyzed for evidence of entrainment in hundreds of species both capable and incapable of vocal mimicry. Despite the higher representation of vocal non-mimics in the database and comparable exposure of mimics and non-mimics to humans and music, only vocal mimics showed evidence of entrainment. We conclude that entrainment is not unique to humans, and that the distribution of entrainment across species supports the hypothesis that entrainment evolved as a byproduct of selection for vocal mimicry.
Compression in visual short-term memory: using statistical regularities to form more efficient memory representations
The information we can hold in working memory is quite limited, but this capacity has typically been studied using simple objects or letter strings with no associations between them. However, in the real world there are strong associations and regularities in the input. In an information theoretic sense, regularities introduce redundancies that make the input more compressible. Here we show that observers can take advantage of these redundancies, enabling them to remember more items in working memory. In two experiments, we introduced covariance between colors in a display so that over trials some color pairs were more likely than other color pairs. Observers remembered more items from these displays than when the colors were paired randomly. The improved memory performance cannot be explained by simply guessing the high probability color pair, suggesting that observers formed more efficient representations to remember more items. Further, as observers learned the regularities their working memory performance improved in a way that is quantitatively predicted by a Bayesian learning model and optimal encoding scheme. We therefore suggest that the underlying capacity of their working memory is unchanged, but the information they have to remember can be encoded in a more compressed fashion.
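One way to picture how learning the regularities could shrink the cost of encoding a frequent color pair is sketched below. It assumes a simple Dirichlet-multinomial learner and an ideal code length of -log2(p) bits; the abstract does not specify the form of the Bayesian model, so both choices, along with the pair names and counts, are assumptions made for illustration.

```python
import math


def predictive_prob(counts, symbol, alpha=1.0):
    """Posterior predictive probability of `symbol` under a symmetric
    Dirichlet prior with pseudo-count `alpha` (an assumed prior)."""
    total = sum(counts.values())
    return (counts[symbol] + alpha) / (total + alpha * len(counts))


def code_length_bits(p):
    """Ideal code length, in bits, for an event of probability p."""
    return -math.log2(p)


# Hypothetical color pairs and trial counts.
before = {"red-green": 0, "blue-yellow": 0, "red-blue": 0, "green-yellow": 0}
after = {"red-green": 9, "blue-yellow": 1, "red-blue": 1, "green-yellow": 1}

for label, counts in (("before learning", before), ("after 12 trials", after)):
    p = predictive_prob(counts, "red-green")
    print(f"{label}: p = {p:.3f}, cost = {code_length_bits(p):.2f} bits")
```

With no observations every pair costs 2 bits; once the learner has seen the frequent pair repeatedly, its cost drops below 1 bit, so a fixed budget stretches over more items.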
Ensemble statistics of a display influence the representation of items in visual working memory
Influential models of visual working memory treat each item to be recalled as an independent unit and assume there are no interactions between items. However, in the real world the displays themselves have structure, providing constraints on the items to be remembered. To examine this scenario, we looked at the influence of an ensemble statistic (the mean size of a set of items) on visual working memory. We find evidence that the remembered size of each individual item is biased toward the mean size of the set. This suggests items in visual working memory are not recalled merely as independent units.
Conceptual distinctiveness supports detailed visual long-term memory for real-world objects
Humans have a massive capacity to store detailed information in visual long-term memory. The present studies explored the fidelity of these visual long-term memory representations, and examined how conceptual and perceptual features of object categories support this capacity. Observers viewed 2800 object images with a different number of exemplars presented from each category. At test, observers indicated which of two exemplars they had previously studied. Memory performance was high and remained quite high (82% accuracy) with 16 exemplars from a category in memory, demonstrating a large memory capacity for object exemplars. However, memory performance decreased as more exemplars were held in memory, implying systematic categorical interference. Object categories with conceptually distinctive exemplars showed less interference in memory as the number of exemplars increased. Interference in memory was not predicted by the perceptual distinctiveness of exemplars from an object category, though these perceptual measures predicted visual search rates for an object target among exemplars. These data provide evidence that observers’ capacity to remember visual information in long-term memory depends more on conceptual structure than perceptual distinctiveness.
Encoding higher-order structure in visual working memory: A probabilistic model
When encoding a scene into memory, people store both the overall gist of the scene and detailed information about a few specific objects. Moreover, they use the gist to guide their choice of which specific objects to remember. However, formal models of change detection, like those used to estimate visual working memory capacity, generally assume people represent no higher-order structure about the display and choose which items to encode at random. We present a probabilistic model of change detection that attempts to bridge this gap by formalizing the encoding of both specific items and higher-order information about simple working memory displays. We show that this model successfully predicts change detection performance for individual displays of patterned dots. More generally, we show that it is necessary for the model to encode higher-order structure in order to accurately predict human performance in the change detection task. This work thus confirms and formalizes the role of higher-order structure in visual working memory.
Scene memory is more detailed than you think: the role of categories in visual long-term memory
Observers can store thousands of object images in visual long-term memory with high fidelity, but the fidelity of scene representations in long-term memory is not known. Here we probed this fidelity by varying the number of studied exemplars in each scene category and testing memory using exemplar-level foils. Observers viewed thousands of scenes over 5.5 hours, and were tested with a series of forced-choice tests. Memory performance was high, even with up to 64 scenes from the same category in memory. Moreover, there was only a 2% decrease in accuracy for each doubling of the number of studied scene exemplars. Surprisingly, this degree of categorical interference was similar to that previously demonstrated in object memory. Thus, while scenes have often been defined as a superset of objects, our results suggest that scenes and objects may be best treated as entities at a similar level of abstraction in visual long-term memory.
Hierarchical encoding in visual working memory: ensemble statistics bias memory for individual items
Influential models of visual working memory treat each item to be stored as an independent unit and assume there are no interactions between items. However, real-world displays have structure, providing higher-order constraints on the items to be remembered. Even displays with simple colored circles contain statistics, like the mean circle size, that can be computed by observers to provide an overall summary of the display. Here we examine the influence of such an ensemble statistic on visual working memory. We find evidence that the remembered size of each individual item is biased toward the mean size of the set of items in the same color, and the mean size of all items in the display. This suggests that visual working memory is constructive, encoding the display at multiple levels of abstraction and integrating across these levels rather than maintaining a veridical representation of each item independently.
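The bias toward the mean can be pictured as precision-weighted shrinkage: a noisy memory of each item is combined with the ensemble mean, and the noisier the item memory, the stronger the pull. The sketch below is a single-level simplification under assumed Gaussian noise (the abstract describes two levels, the same-color mean and the whole-display mean); the sizes and noise variance are hypothetical.

```python
from statistics import mean, pvariance


def shrink_toward_mean(sizes, noise_var):
    """Combine each remembered size with the set mean.

    The weight on the item itself is spread / (spread + noise_var), where
    spread is the variance of the set; noise_var is an assumed memory-noise
    variance, not a value taken from the paper.
    """
    set_mean = mean(sizes)
    spread = pvariance(sizes)
    weight = spread / (spread + noise_var)
    return [weight * s + (1 - weight) * set_mean for s in sizes]


sizes = [10.0, 20.0, 30.0]          # hypothetical circle sizes
reported = shrink_toward_mean(sizes, noise_var=50.0)
print([round(r, 2) for r in reported])
```

The smallest item is reported as larger and the largest as smaller than shown, while an item at the mean is unaffected.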
Disentangling scene content from spatial boundary: Complementary roles for the PPA and LOC in representing real-world scenes
Behavioral and computational studies suggest that visual scene analysis rapidly produces a rich description of both the objects and the spatial layout of surfaces in a scene. However, there is still a large gap in our understanding of how the human brain accomplishes these diverse functions of scene understanding. Here we probe the nature of real-world scene representations using multi-voxel fMRI pattern analysis. We show that natural scenes are analyzed in a distributed and complementary manner by the parahippocampal place area (PPA) and the lateral occipital complex (LOC) in particular, as well as other regions in the ventral stream. Specifically, we study the classification performance of different scene-selective regions using images that vary in spatial boundary and naturalness content. We discover that whereas both the PPA and LOC can accurately classify scenes, they make different errors: the PPA more often confuses scenes that have the same spatial boundaries, whereas the LOC more often confuses scenes that have the same content. By demonstrating that visual scene analysis recruits distinct and complementary high-level representations, our results testify to distinct neural pathways for representing the spatial boundaries and content of a visual scene.
A review of visual memory capacity: Beyond individual items and towards structured representations
Traditional memory research has focused on identifying separate memory systems and exploring different stages of memory processing. This approach has been valuable for establishing a taxonomy of memory systems and characterizing their function, but has been less informative about the nature of stored memory representations. Recent research on visual memory has shifted towards a representation-based emphasis, focusing on the contents of memory, and attempting to determine the format and structure of remembered information. The main thesis of this review will be that one cannot fully understand memory systems or memory processes without also determining the nature of memory representations. Nowhere is this connection more obvious than in research that attempts to measure the capacity of visual memory. We will review research on the capacity of visual working memory and visual long-term memory, highlighting recent work that emphasizes the contents of memory. This focus impacts not only how we estimate the capacity of the system - going beyond quantifying how many items can be remembered, and moving towards structured representations - but how we model memory systems and memory processes.
Real-world objects are not represented as bound units: Independent forgetting of different object details from visual memory.
Are real-world objects represented as bound units? While a great deal of research has examined binding between the feature dimensions of simple shapes, little work has examined whether the featural properties of real-world objects are stored in a single unitary object representation. In a first experiment, we find that information about an object's color is forgotten more rapidly than the information about an object's state (e.g. open, closed), suggesting that observers do not forget objects as entirely bound units. In a second and third experiment, we examine whether state and exemplar information are forgotten separately or together. If these properties are forgotten separately, then the probability of getting one feature correct should be independent of whether the other feature was correct. We find that after a short delay, observers frequently remember both state and exemplar information about the same objects, but after a longer delay, memory for the two properties becomes independent. This indicates that information about object state and exemplar are forgotten separately over time. We thus conclude that real-world objects are not represented in a single unitary representation in visual memory.
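The independence logic can be made concrete: if two properties are forgotten separately, the probability of getting both right should equal the product of the two marginal probabilities. The sketch below computes the gap between the observed joint probability and that product. The trial data are made up, and the sketch ignores guessing in forced-choice tests, which a real analysis would need to handle.

```python
def dependence_gap(trials):
    """Return P(both correct) - P(state correct) * P(exemplar correct).

    `trials` is a list of (state_correct, exemplar_correct) booleans, one
    pair per object. A gap near zero is what independent forgetting predicts;
    a positive gap means the two properties tend to survive together.
    """
    n = len(trials)
    p_state = sum(s for s, _ in trials) / n
    p_exemplar = sum(e for _, e in trials) / n
    p_both = sum(s and e for s, e in trials) / n
    return p_both - p_state * p_exemplar


# Hypothetical outcomes for four objects in each condition.
bound = [(True, True), (True, True), (False, False), (False, False)]
independent = [(True, True), (True, False), (False, True), (False, False)]

print(dependence_gap(bound))        # properties remembered or lost together
print(dependence_gap(independent))  # outcomes unrelated across properties
```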
A probabilistic model of visual working memory: Incorporating higher-order regularities into working memory capacity estimates.
When remembering a real-world scene, people encode both detailed information about specific objects and higher-order information like the overall gist of the scene. However, formal models of change detection, like those used to estimate visual working memory capacity, assume observers encode only a simple memory representation which includes no higher-order structure and treats items independently from each other. We present a probabilistic model of change detection that attempts to bridge this gap by formalizing the role of perceptual organization and allowing for richer, more structured memory representations. Using either standard visual working memory displays or displays in which the items are purposefully arranged in patterns, we find that models which take into account perceptual grouping between items and the encoding of higher-order summary information are necessary to account for human change detection performance. Considering the higher-order structure of items in visual working memory will be critical for models to make useful predictions about observers' memory capacity and change detection abilities in simple displays as well as in more natural scenes.
Visual long-term memory has the same limit on fidelity as visual working memory.
Visual long-term memory can store thousands of objects with surprising visual detail, but just how detailed are these representations and how can we quantify this fidelity? Using the property of color as a case study, we estimated the precision of visual information in long-term memory, and compared this to the precision of the same information in working memory. Observers were shown real-world objects in a random color, and then asked to recall the color after a delay. We quantified two parameters of performance: the variability of internal representations of color (fidelity) and the probability of forgetting an object's color altogether. Surprisingly, the data show that the fidelity of color information in long-term memory was comparable to the asymptotic precision of working memory. These results suggest that a common limit may constrain both long-term memory and working memory, such as a bound on the fidelity required to retrieve memory representations.
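A common way to separate these two parameters in continuous-report tasks is a mixture model: response errors on the color wheel come either from a von Mises distribution centered on the true color (its concentration reflects fidelity) or from a uniform distribution (the color was forgotten and the response is a guess). The sketch below assumes that form; whether it matches the exact model fitted in the paper is not stated in the abstract, and the example errors are hypothetical.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2) ** 2 / (k * k)
        total += term
    return total


def mixture_pdf(error, guess_rate, kappa):
    """Density of a response error (radians, in [-pi, pi]).

    guess_rate: probability the color was forgotten (uniform guessing).
    kappa: von Mises concentration; higher means more precise memory.
    """
    von_mises = math.exp(kappa * math.cos(error)) / (2 * math.pi * bessel_i0(kappa))
    uniform = 1 / (2 * math.pi)
    return guess_rate * uniform + (1 - guess_rate) * von_mises


def log_likelihood(errors, guess_rate, kappa):
    """Summed log density of a list of response errors."""
    return sum(math.log(mixture_pdf(e, guess_rate, kappa)) for e in errors)


# Hypothetical errors clustered near zero favor low guessing and high precision.
errors = [0.05, -0.10, 0.20, 0.00]
print(log_likelihood(errors, guess_rate=0.1, kappa=8.0))
print(log_likelihood(errors, guess_rate=1.0, kappa=8.0))
```

Maximizing `log_likelihood` over `guess_rate` and `kappa` (for example with a grid search) would give the two estimates for one condition, which can then be compared across working memory and long-term memory.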
Spatial frequency integration during active perception: perceptual hysteresis when an object recedes.
As we move through the world, information about objects moves to different spatial frequencies. How the visual system successfully integrates information across these changes to form a coherent percept is thus an important open question. Here we investigate such integration using hybrid faces, which contain different images in low and high spatial frequencies. Observers judged how similar a hybrid was to each of its component images while walking toward or away from it or having the stimulus moved toward or away from them. We find that when the stimulus is approaching, observers act as if they are integrating across spatial frequency separately at each moment. However, when the stimulus is receding, observers show a perceptual hysteresis effect, holding on to details that are imperceptible in a static stimulus condition. Thus, observers appear to make optimal inferences by sticking with their previous interpretation when losing information but constantly reinterpreting their input when gaining new information.
Modeling visual working memory with the MemToolbox.
The MemToolbox is a collection of MATLAB functions for modeling visual working memory. In support of its goal to provide a full suite of data analysis tools, the toolbox includes implementations of popular models of visual working memory, real and simulated data sets, Bayesian and maximum likelihood estimation procedures for fitting models to data, visualizations of data and fit, validation routines, model comparison metrics, and experiment scripts. The MemToolbox is released under the permissive BSD license and is available at memtoolbox.org.
Terms of the debate on the format and structure of visual memory.
Our ability to actively maintain information in visual memory is strikingly limited. There is considerable debate about why this is so. As with many questions in psychology, the debate is framed dichotomously: is visual working memory limited because it is supported by only a small handful of discrete "slots" into which visual representations are placed, or is it because there is an insufficient supply of a limited "resource" that is flexibly shared among visual representations? Here, we argue that this dichotomous framing obscures a set of at least eight underlying questions. Separately considering each question reveals a rich hypothesis space that will be useful for building a comprehensive model of visual working memory. The questions regard (1) an upper limit on the number of represented items; (2) the quantization of the memory commodity; (3) the relationship between how many items are stored and how well they are stored; (4) whether the number of stored items completely determines the fidelity of a representation (versus fidelity being stochastic or variable); (5) the flexibility with which the memory commodity can be assigned or reassigned to items; (6) the format of the memory representation; (7) how working memories are formed; and (8) how memory representations are used to make responses in behavioral tasks. We reframe the debate in terms of these eight underlying questions, placing slot and resource models as poles in a larger, more expansive theoretical space.
No evidence for a fixed object limit in working memory: Spatial ensemble representations inflate estimates of working memory capacity for complex objects.
A central question for models of visual working memory is whether the number of objects people can remember depends on object complexity. Some influential 'slot' models of working memory capacity suggest that people always represent 3-4 objects and only the fidelity with which these objects are represented is affected by object complexity. The primary evidence supporting this claim is the finding that people can detect large changes to complex objects (consistent with remembering at least 4 individual objects), but that small changes cannot be detected (consistent with low-resolution representations). Here we show that change detection with large changes greatly overestimates individual item capacity when people can use global representations of the display to detect such changes. When the ability to use such global ensemble or texture representations is reduced, people remember individual information about only 1-2 complex objects. This finding challenges models that propose people always remember a fixed number of objects, regardless of complexity, and supports a more flexible model with an important role for spatial ensemble representations.
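For context on how item capacity is usually read off change detection data, the sketch below applies Cowan's K, a widely used estimate for single-probe change detection: K = N × (hit rate − false alarm rate). The abstract does not say which estimator the paper used, and the set size and rates here are hypothetical. The point of the abstract is that estimates of this kind can be inflated when global ensemble or texture information also reveals the change.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Cowan's K: estimated number of items held, assuming each item is
    either fully stored or not stored at all and that observers guess
    when the probed item is not in memory."""
    return set_size * (hit_rate - false_alarm_rate)


# Hypothetical performance at set size 6.
print(round(cowan_k(6, hit_rate=0.80, false_alarm_rate=0.10), 2))  # large changes
print(round(cowan_k(6, hit_rate=0.35, false_alarm_rate=0.10), 2))  # ensemble cues reduced
```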
Individual differences in ensemble perception reveal multiple, independent levels of ensemble representation.
Ensemble perception, including the ability to 'see the average' from a group of items, operates in numerous feature domains (size, orientation, speed, facial expression, etc.). While the ubiquity of ensemble representations is well established, the large-scale cognitive architecture of this process remains poorly defined. We address this using an individual differences approach. In a series of experiments, observers saw groups of objects and reported either a single item from the group or the average of the entire group. High-level ensemble representations (e.g., average facial expression) showed complete independence from low-level ensemble representations (e.g., average orientation). In contrast, low-level ensemble representations (e.g., orientation and color) were correlated with each other, but not with high-level ensemble representations (e.g., facial expression and person identity). These results suggest that there is not a single domain-general ensemble mechanism, and that the relationship among various ensemble representations depends on how proximal they are in representational space.
Contextual effects in visual working memory reveal hierarchically structured memory representations.
Influential slot and resource models of visual working memory make the assumption that items are stored in memory as independent units, and that there are no interactions between them. Consequently, these models predict that the number of items to be remembered (the set size) is the primary determinant of working memory performance, and therefore these models quantify memory capacity in terms of the number and quality of individual items that can be stored. Here we demonstrate that there is substantial variance in display difficulty within a single set size, suggesting that limits based on the number of individual items alone cannot explain working memory storage. We asked hundreds of participants to remember the same sets of displays, and discovered that participants were highly consistent in terms of which items and displays were hardest or easiest to remember. Although a simple grouping or chunking strategy could not explain this individual-display variability, a model with multiple, interacting levels of representation could explain some of the display-by-display differences. Specifically, a model that includes a hierarchical representation of items plus the mean and variance of sets of the colors on the display successfully accounts for some of the variability across displays. We conclude that working memory representations are composed only in part of individual, independent object representations, and that a major factor in how many items are remembered on a particular display is interitem representations such as perceptual grouping, ensemble, and texture representations.
Working memory is not fixed capacity: More active storage capacity for real-world objects than simple stimuli.
Visual working memory is the cognitive system that holds visual information active to make it resistant to interference from new perceptual input. Information about simple stimuli - colors, orientations - is encoded into working memory rapidly: in under 100ms, working memory 'fills up', revealing a stark capacity limit. However, for real-world objects, the same behavioral limits do not hold: with increasing encoding time, people store more real-world objects and do so with more detail. This boost in performance for real-world objects is generally assumed to reflect the use of a separate episodic long-term memory system, rather than working memory. Here we show that this behavioral increase in capacity with real-world objects is not solely due to the use of separate episodic long-term memory systems. In particular, we show that this increase is a result of active storage in working memory, as shown by directly measuring neural activity during the delay period of a working memory task using EEG. These data challenge fixed capacity working memory models, and demonstrate that working memory and its capacity limitations are dependent upon our existing knowledge.
The effects of local context in visual search
Visual context information constrains what to expect and where to look, facilitating search for objects embedded in complex displays. The original contextual cueing paradigm (Chun & Jiang, 1998) showed that observers implicitly learn the global configuration of targets in artificial visual search tasks and that this context can serve to cue the target location and facilitate search performance in subsequent encounters.

I propose a computational model of this type of learning, and suggest that the majority of spatial contextual cueing effects can be accounted for using a model with only two major constraints: pair-wise learning of target-distractor relations, and a set of spatial constraints.

I then present a series of experiments designed to test the extent of the spatial constraints necessary for modeling contextual cueing, and examine how such a model of contextual cueing can account for the major results in the contextual cueing literature. I next look at the predictions such a model makes about learning in visual search more generally, and present evidence from an interrupted visual search task suggesting that similar constraints control learning during a normal visual search trial.

Automatic and implicit encoding of scene gist.
One of the primary goals of the visual system is to extract statistical regularities from the environment to build a robust representation of the world. Recent research on visual statistical learning (VSL) has demonstrated that human observers can implicitly extract joint probabilities between objects during streams of visual stimuli. In the real world, temporal predictability between scenes and places exists at both exemplar and categorical levels: whatever office you are in, the probability that you will step out in a zoo is much lower than the probability that you will enter a corridor. In a series of experiments, we tested to what extent people are sensitive to the learning of categorical temporal regularities based on the gist or semantic understanding of natural scenes. Our results suggest that the gist of a scene is automatically and implicitly extracted even when it is not task-relevant, and that implicit statistical learning can occur at a level as abstract as the conceptual gist representation.
Perceptual Organization Across Spatial Scales In Natural Images.
Visual perception and recognition are problems of induction - problems where we are given ambiguous input and must decide which of many possible interpretations to take. In middle-level vision, this is typically referred to as the problem of perceptual organization: how we take the bits and pieces of visual information that are available in the retinal image and structure them into larger units like objects.

In this talk I will present data on perceptual organization across different spatial frequencies - how the blurry, low spatial frequency of an image and the fine, high spatial frequency details in that image interact to form our eventual percept. Our visual system is thought to break down images by spatial frequency early on in the visual pathway, so examining perceptual organization across spatial frequencies should allow us to get a better idea of the types of integration the visual system has to deal with as it builds a representation of the world.

To examine this process of integration across spatial scales, we will use hybrid images made up of different images at different spatial frequencies. In the first part of the talk, I will discuss basic properties of integration across spatial scale: how the amount of information from each of the images affects the way we see the hybrid, and how this interacts with our contrast sensitivity function. In the second part I will present data on the effects of top-down knowledge on our organization of images across spatial frequencies, and will suggest that our visual system is designed to integrate across spatial frequencies in a way that provides us with the most accurate interpretation of the world as we move through it.

Tracking Statistical Regularities to Form More Efficient Memory Representations.
A central task of the visual system is to take the information available in the retinal image and compress it to form a more efficient representation of the world. Such compression requires sensitivity to the statistical distribution that stimuli are drawn from, in order to detect redundancy and eliminate it.

In the first part of the talk I will discuss work on statistical learning mechanisms (e.g., Saffran et al. 1996) that suggests how people might track such distributions of stimuli in the real world. I'll present several experiments that use sequences of natural images to demonstrate that such statistical learning mechanisms operate at multiple levels of abstraction, including the level of semantic categories. I'll discuss how learning at this abstract level allows us to minimize redundancy by not relearning the same regularities over and over again.

In the second part of the talk I will suggest another potential benefit to such statistical learning mechanisms - the ability to remember more items in visual short-term memory. I'll present several experiments where we show that observers can take advantage of relationships between colors in VSTM displays, eliminating redundant information to form more efficient representations of the displays. I'll then present a model of this data using Huffman coding, a compression algorithm, to demonstrate that quantifying VSTM in terms of the bits of information remembered is more useful than measuring the number of objects remembered (the most common metric).

Remembering Thousands of Natural Images With High Fidelity.
The human visual system has been extensively trained to deal with objects and natural images, giving it the opportunity to develop robust strategies to quickly identify categories and exemplars. Although it is known that the memory capacity for images is massive (Standing, 1973), the fidelity with which human memory can represent such a large number of images is an outstanding question. We conducted three large-scale memory experiments to determine the details remembered per item, by systematically varying the amount of detail required to succeed in subsequent memory tests. Our results show that contrary to the commonly accepted view that long-term memory representations contain only the gist of what was seen, long-term memory can store thousands of items with a large amount of detail per item. Further, item analyses reveal that memory for an object or a natural image depends on the extent to which it is conceptually distinct from other items in the memory set, and not necessarily on the featural distinctiveness along shape or color dimensions. These findings suggest a "conceptual hook" is necessary for maintaining a large number of high-fidelity representations. Altogether, the results present a great challenge to models of object and natural scene recognition, which must be able to account for such a large and detailed storage capacity.
Tracking Statistical Regularities to Form More Efficient Memory Representations
A central task of the visual system is to take the information available in the retinal image and compress it to form a more efficient representation of the world. Such compression requires sensitivity to the statistical distribution that stimuli are drawn from, in order to detect redundancy and eliminate it (Shannon, 1948). But how sensitive are people to complicated and abstract statistical regularities? And how robustly do we use these regularities to form compressed representations?

In the first part of the talk I will discuss work on statistical learning mechanisms (e.g., Saffran et al. 1996) that demonstrates how people track the distribution of images over time in the real world. I'll present several experiments that use sequences of natural images to demonstrate that such statistical learning mechanisms operate at multiple levels of abstraction, including the level of semantic categories. I'll discuss how learning at this abstract level allows us to minimize redundancy by not relearning the same regularities over and over again.

In the second part of the talk I will suggest another potential benefit to such statistical learning mechanisms - the ability to remember more items in visual short-term memory (VSTM). To demonstrate how observers can take advantage of learned regularities for the purpose of compression, I'll present several experiments where we show that observers can take advantage of relationships between colors in VSTM displays to eliminate redundant information and form more efficient representations. I'll then present a model of this data using Huffman coding, a compression algorithm, to demonstrate that quantifying VSTM in terms of the bits of information remembered is more accurate than measuring the number of objects remembered (the most common metric).
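
To illustrate the general idea behind Huffman coding (not the specific model presented in the talk), here is a minimal Python sketch. The symbol names and probabilities are hypothetical, chosen only to show that frequent items receive shorter codes, so the expected number of bits per item falls below the cost of a fixed-length code.

```python
import heapq

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    if len(probs) == 1:
        return {next(iter(probs)): 1}
    # Heap entries: (probability, tie_breaker, {symbol: depth in the code tree}).
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every symbol inside moves one level deeper.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical frequencies for four color pairs (illustrative values only).
pair_probs = {"pair_A": 0.5, "pair_B": 0.25, "pair_C": 0.125, "pair_D": 0.125}

lengths = huffman_code_lengths(pair_probs)
expected_bits = sum(pair_probs[s] * lengths[s] for s in pair_probs)
fixed_bits = 2  # a fixed-length code needs 2 bits to distinguish 4 symbols

print(lengths)
print(f"expected bits per pair: {expected_bits} (fixed-length: {fixed_bits})")
```

With these assumed frequencies the most common pair gets a 1-bit code and the rarest pairs get 3-bit codes, which is the sense in which learned regularities allow more items to fit in the same number of bits.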

Perceptual organization across spatial scales in natural images: Seeing more high spatial frequency than meets the eye
One of the most robust statistical properties of natural images is that contours are correlated across spatial frequency bands. However, the rules of perceptual grouping across spatial scales might be different as the observer approaches an object (adding HSF), or steps away from it (losing HSF). We manipulated contiguity across spatial scales by using hybrid images that combined the LSF and HSF of two different images. Some hybrids perceptually grouped well (e.g. two faces), and others did not (e.g. a highway and bedroom). In Experiment 1, observers performed a 2-AFC task while walking towards or away from the hybrids, judging how similar the hybrid was to each of its component images. In Experiment 2, conditions of an object moving towards or away were simulated by having images zooming in and out. Results in all experiments showed that when the observer and object are approaching each other, observers represent object SF content as predicted by their contrast sensitivity function: they add HSF to their representation at the appropriate rate. However, when observers or objects are receding from each other, observers show a perceptual hysteresis, hanging on to more of the high spatial frequency image than they can see (23% real vs. 50% perceived). This hysteresis effect is predicted by the strength of perceptual grouping between scale spaces. As we move through the world and attend to objects, we are constantly adding and losing information from different spatial scales. Our results suggest different mechanisms of on-line object representation: we tend to stick with our first grouping interpretation if we are losing information, and tend to constantly reinterpret the representation if we are gaining information.
Remembering Thousands of Objects with High Fidelity
Although people can remember a massive number of pictures (e.g., 10,000 in Standing, 1973), the fidelity with which human memory can represent such a large number of items has not been tested. Most researchers in visual cognition have assumed that in such studies, only the gist of images was remembered and the details were forgotten. We conducted two large-scale memory experiments to determine the details remembered per item, by systematically varying the amount of detail required to succeed in subsequent memory tests. In the first study, 2500 conceptually distinct objects were presented for 3 seconds each. Afterwards, observers reported with remarkable accuracy which of two items they had seen when the foil was a categorically-novel item (92%), an item of the same basic level category (87%), or the same item in a different state or pose (87%). In the second study, 2560 items were presented and the number of exemplars presented from each category varied from 1 to 16. Observers reported which exemplar they had seen for categories with 1 previously viewed exemplar (87%) and maintained high accuracy even for categories with 16 previously viewed exemplars (80%). Thus, contrary to the commonly accepted view that long-term memory representations contain only the gist of what was seen, we demonstrate that long-term memory can store thousands of items with a large amount of detail per item. Further, item analyses reveal that memory for an object depends on the extent to which it is conceptually distinct from other items in the memory set, and not on the featural distinctiveness along shape or color dimensions. These findings suggest a "conceptual hook" is necessary for maintaining the large number of high-fidelity memory representations, and imply that the precision of visual content in long-term memory is determined by conceptual and not perceptual structure.
Compression in Visual Short-term Memory: Using Statistical Regularities to Form More Efficient Memory Representations
It is widely accepted that our visual systems are tuned to the statistics of input from the natural world, which suggests that our visual short-term memory may also take advantage of statistical regularities through efficient coding schemes. Previous work on VSTM capacity has typically used patches of color or simple features which are drawn from a uniform distribution, and estimated the capacity of VSTM for simple color patches to be ~4 items (Luck & Vogel, 1997), and even fewer for more complex objects (Alvarez & Cavanagh, 2004). Here, we introduce covariance information between colors, and ask if VSTM can take advantage of the shared statistics to form a more efficient representation of the displays.

We presented observers with displays of eight objects, presented in pairs around the fixation point, and then probed a single object in an eight-alternative forced-choice test. The displays were constructed so that each of the eight possible colors appeared in every display, but the color they were next to was not random -- each color had a high probability pair (e.g. red appeared with green 80% of the time). In information theoretic terms, the displays with statistical regularities have lower entropy compared with uniform displays, and thus require less information to encode. We found that observers could successfully remember 5.5 colors on these displays, significantly higher than the 3.5 colors remembered when the displays were changed to be uniformly distributed in the last block. These results show that capacity estimates, measured in number of objects, actually increased when the displays had some statistical regularities, and that VSTM capacity is not a fixed number of items. We suggest that quantifying capacity in number of objects fails to capture factors such as object complexity or statistical information, and that information theoretic measures are better suited to characterizing VSTM.
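
The entropy comparison can be sketched numerically. The Python snippet below is a simplified illustration rather than the analysis used in the study: it considers only the uncertainty about one color's neighbor, assumes the remaining 20% of pairings are spread evenly over the six other colors, and ignores the constraint that every color appears exactly once per display.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

N_COLORS = 8   # eight possible colors, as in the abstract
P_HIGH = 0.8   # probability of the high-probability partner, as in the abstract

n_partners = N_COLORS - 1  # a color cannot be paired with itself

# Assumption: the leftover probability is split evenly among the other six colors.
patterned = [P_HIGH] + [(1 - P_HIGH) / (n_partners - 1)] * (n_partners - 1)
# Uniform displays: every other color is an equally likely neighbor.
uniform = [1 / n_partners] * n_partners

h_patterned = entropy_bits(patterned)
h_uniform = entropy_bits(uniform)

print(f"bits to encode a neighbor, patterned displays: {h_patterned:.2f}")
print(f"bits to encode a neighbor, uniform displays:   {h_uniform:.2f}")
```

Under these assumptions, the second color of a pair costs well under half as many bits in the patterned displays as in the uniform ones, which is the sense in which regular displays "require less information to encode".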

Spontaneous entrainment to auditory rhythms in vocal-learning bird species
Musical behavior consists of certain core phenomena, including the human tendency to entrain movement to an external auditory pulse. This ability is fundamental both for music production and for coordinated dance. The current literature claims that entrainment is a uniquely human capacity; here we correct this misconception, and report spontaneous motor entrainment to complex auditory stimuli in multiple avian species. In response to novel, natural human music, and in the absence of human movement, multiple avian subjects display significant rhythmic movement at the period and phase of the music across a range of different tempos. Since these animals entrain to music but do not do so in their natural behavioral repertoire, entrainment must have evolved as a byproduct of other cognitive mechanisms. All of our experimental subjects are proficient vocal learners, and thus these data support the idea that vocal learning mechanisms provide the necessary substrate for the entrainment capacity to emerge (Patel, 2006). To further investigate this hypothesis, we examined a corpus of videos on a large video database, and found numerous videos of birds moving at the period and phase of the music across numerous tempos and species. Critically, 100% of these videos involved vocal-learning birds; we found no such videos for vocal non-learning species. We also tested 16 cotton-top tamarin monkeys for spontaneous entrainment using the same stimuli that induced entrainment in avian subjects, as well as simple click-tracks, and found no evidence of entrainment. Together, these data strongly support the idea that entrainment is not unique to humans, and can emerge as a byproduct of other mechanisms, particularly the mechanism for vocal learning.
How big is visual long-term memory? Evidence for massive and high fidelity storage
Although people can remember a massive number of pictures (Standing, 1973), the fidelity with which human memory can represent such a large number of items has not been tested. We conducted three large-scale memory experiments (2500+ objects or scenes) and systematically varied the amount of detail required to succeed in subsequent memory tests. Contrary to the commonly accepted view that natural image representations contain only the gist of what was seen, our results show that human memory is able to store an incredibly large number of visual images with a large amount of visual detail per item: for instance, observers remembered 87% of images with enough detail to distinguish an object they had viewed from the same object in a different state or pose (for example, a coffee cup that was half empty versus the same cup full). These results present a challenge to neural and computational models of object and natural image recognition, which must be able to account for such a large and detailed storage capacity.
Trial-by-trial variance in visual working memory capacity estimates as a window into the architecture of working memory.
Nearly all models of visual working memory have focused on fitting average memory performance across many simple displays of colored squares. Unfortunately, fitting average performance can mask an underlying failure of the model to account for performance by particular observers in particular displays. We had N=475 observers perform exactly the same set of continuous color report trials (e.g., Zhang & Luck, 2008), and found a significant amount of observer-to-observer consistency in which displays are hardest and easiest to remember items from. While existing models of visual working memory fail to explain this variance, we present an expanded model that accounts for perceptual grouping, the encoding of summary statistics, and attentional selection, and show that such a model considerably improves the fit to the data. We suggest that in order to understand the architecture and capacity of visual working memory, models must take into account differences in the information observers encode about each particular display.
Learning statistical regularities speeds the encoding of information into working memory.
Observers automatically learn statistical regularities in their environment, and use these regularities to form more efficient working memory representations (Brady, Konkle, & Alvarez, 2009). For instance, when colors are more likely to appear in certain pairs (e.g., red with blue), observers learn these regularities and over the course of learning are able to remember nearly twice as many colors. Here we investigated whether the benefits of learning hold only at the level of memory storage, or whether perceptual encoding of learned pairs becomes more efficient as well. During the learning phase, 8 colors were presented (four bi-colored objects). After a delay, one location was cued with a thick black outline, and observers reported the color that was presented at the cued location. The colors were paired such that 80% of the time certain colors co-occurred (e.g., red with blue). Over the course of 10 blocks of 60 trials, the number of colors observers could remember doubled from 3 to 6, indicating that observers learned the regularities and formed more efficient memory representations. Next, participants completed a rapid perception task. On each trial, a single color pair was briefly presented, followed by a mask, and then participants reported both colors. At brief presentation times performance was near chance. As time increased there was a reliable advantage for high probability color pairs over low probability color pairs (~15% accuracy difference at 67ms, p<.05). This difference cannot be explained by differences in storage capacity for high and low probability pairs, because only 2 colors had to be remembered, and there was no difference between conditions at the longest presentation times. Such an encoding-time advantage for high-probability color pairs suggests that participants can actually perceive high probability color pairs more rapidly, and that the compression of learned regularities influences low levels of perceptual processing.
Are real-world objects represented as bound units? Independent decay of object details from short-term to long-term memory.
Are all properties of a real-world object stored together in one bound representation, or are different properties stored independently? Object information appears to be bound in short-term memory, but this could be because multiple properties are concurrently encoded from the same objects. Here we use short-term and long-term memory paradigms to examine the independence of different object properties. If object properties decay independently from short-term to long-term memory, then we can infer that object information is not stored in a single integrated representation. In Experiment 1, we showed observers 120 real-world objects that were arbitrarily colored and categorically-distinct. Following presentation of the objects, we used a 2AFC to probe either the objects' color or object state (e.g., open vs. closed) after either a short-delay or long-delay. We found that despite being matched in short-delay performance, arbitrary color information decayed much more rapidly than the more meaningful state changes after a long delay (7% for state versus 13% for color, p<0.05). In Experiment 2, we showed observers a set of categorically-distinct objects that varied in two meaningful dimensions (object exemplar and state) which observers remember equally well on average. This was followed by a 4AFC consisting of two exemplars (one familiar, one novel) each in two states (one familiar, one novel) after either a short-delay or long-delay. After a short delay, observers frequently remember both properties about the same objects, but after a long delay they are more independent (18% decrease in boundedness over time, p<0.05). These data indicate that observers do not form a single bound object representation in memory: instead, conceptually meaningful object properties persist while arbitrary properties like color are forgotten. Even for different object properties that are forgotten at about the same rate, observers tend to forget these properties independently of each other for individual objects.
Hierarchical encoding in visual working memory.
When remembering a real-world scene, people encode both detailed information about specific objects and higher-order information like the overall gist of the scene. However, existing formal models of visual working memory capacity (e.g., Cowan’s K) generally assume that people encode individual items but do not represent the higher-order structure of the display. We present a probabilistic model of VWM that generalizes Cowan’s K to encode not only specific items from a display, but also higher-order information. While higher-order information can take many forms, we begin with a simple summary representation: how likely neighboring items are to be the same color. In Experiment 1, we test this model on displays of randomly chosen colored dots (Luck & Vogel, 1997). In Experiment 2, we generalize the model to displays where the dots are purposefully arranged in patterns. In both experiments, 75 observers detected changes in each individual display, which allowed us to calculate d' for a particular change in a particular display (range: d'=0.8-3.8). Results show that observers are highly consistent about which changes are easy or difficult to detect, even in standard colored dot displays (split-half correlations=0.60-0.76). Furthermore, the correlation between the observers' d' and the model's d' is r=0.45 (p<0.01) in the randomly generated displays and r=0.72 (p<0.001) in the patterned displays, suggesting the model’s simple summary representation captures which changes people are likely to detect. By contrast, the simpler model of change detection typically used in calculations of VWM capacity does not predict any reliable differences in difficulty between displays. We conclude that even in simple VWM displays items are not represented independently, and that models of VWM need to be expanded to take into account this non-independence between items before we can usefully make predictions about observers' memory capacity in real-world scenes.
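
For reference, the two summary measures mentioned above can be computed from hit and false-alarm rates with the standard textbook formulas. The Python sketch below shows only these baseline measures, not the probabilistic model described in the abstract, and the rates and set size in the usage example are made-up values.

```python
from statistics import NormalDist

def cowans_k(hit_rate, false_alarm_rate, set_size):
    """Cowan's K capacity estimate: K = N * (H - FA)."""
    return set_size * (hit_rate - false_alarm_rate)

def d_prime(hit_rate, false_alarm_rate):
    """Signal-detection sensitivity: d' = z(H) - z(FA)."""
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical change-detection performance for a single display.
hits, false_alarms, n_items = 0.8, 0.2, 6

k = cowans_k(hits, false_alarms, n_items)
dp = d_prime(hits, false_alarms)

print(f"Cowan's K = {k:.2f} items, d' = {dp:.2f}")
```

Note that K depends only on the hit rate, false-alarm rate, and set size, so on its own it cannot distinguish two displays of the same size that differ in their arrangement; that gap is what motivates a per-display measure such as d'.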
Ensemble statistics influence the representation of items in visual working memory.
Influential models of visual working memory treat each item as an independent unit and assume there are no interactions between items. However, even in displays with simple colored circles there are higher-order ensemble statistics that observers can compute quickly and accurately (e.g., Ariely, 2001). An optimal encoding strategy would take these higher-order regularities into account. We examined how a specific ensemble statistic (the mean size of a set of items) influences visual working memory. Observers were presented with 400 individual displays consisting of three red, three blue, and three green circles of varying size. The task was to remember the size of all of the red and blue circles, but to ignore the green circles (we assume that ignoring the green circles requires the target items to be selected by color; Huang, Treisman, & Pashler, 2007; Halberda, Sires, & Feigenson, 2006). Each display was briefly presented, then disappeared, and then a single circle reappeared in black at the location that a red or blue circle had occupied. Observers used the mouse to resize this new black circle to the size of the red or blue circle they had previously seen. We find evidence that the remembered size of each individual item is biased toward the mean size of the circles of the same color. In Experiment 2, the irrelevant green circles were removed, making it possible to select the red and blue items as a single group, and no bias towards the mean of the color set was observed. Combined, these results suggest that items in visual working memory are not represented in isolation. Instead, observers use constraints from the higher-order ensemble statistics of the set to reduce uncertainty about the size of individual items and thereby encode the items more efficiently.
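
One simple way to see why a bias toward the mean can reduce uncertainty is the standard Gaussian combination of a noisy observation with a prior. The Python sketch below is an illustrative assumption, not the analysis reported in the abstract: it treats the same-color mean as a Gaussian prior over item size, and all numbers are hypothetical.

```python
def shrink_toward_mean(observed, group_mean, noise_var, group_var):
    """Combine a noisy size observation with a Gaussian prior centered on the group mean.

    Returns (estimate, posterior_variance) under the usual conjugate-Gaussian result.
    """
    # Weight on the observation: high when memory noise is small relative to group spread.
    w = group_var / (group_var + noise_var)
    estimate = w * observed + (1 - w) * group_mean
    posterior_var = (group_var * noise_var) / (group_var + noise_var)
    return estimate, posterior_var

# Hypothetical values: a circle remembered as size 30 in a color set whose mean size is 20.
estimate, posterior_var = shrink_toward_mean(
    observed=30.0, group_mean=20.0, noise_var=16.0, group_var=16.0
)

print(f"reported size: {estimate}, variance: {posterior_var} (memory noise alone: 16.0)")
```

In this toy case the estimate lands halfway between the noisy memory and the set mean, and its variance is half that of the memory trace alone, so the bias buys a more precise report on average.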
Encoding higher-order structure in visual working memory: A probabilistic model.
When encoding a scene into memory, people store both the overall gist of the scene and detailed information about a few specific objects. Moreover, they use the gist to guide their choice of which specific objects to remember. However, formal models of change detection, like those used to estimate visual working memory capacity, generally assume people represent no higher-order structure about the display and choose which items to encode at random. We present a probabilistic model of change detection that attempts to bridge this gap by formalizing the encoding of both specific items and higher-order information about simple working memory displays. We show that this model successfully predicts change detection performance for individual displays of patterned dots. More generally, we show that it is necessary for the model to encode higher-order structure in order to accurately predict human performance in the change detection task. This work thus confirms and formalizes the role of higher-order structure in visual working memory.
Uncovering Human Nature: Evolution of the Mind and Brain
How have music, language, perception and thought been shaped by evolution? Are we really so different from our closest animal relatives? Why do we age, and can anything live forever? We'll take an evolutionary perspective on human nature, asking why people think and live the way they do. We'll start with the basics of evolution, and then explore the implications for life in the modern world, discussing everything from fossils and genes to brains and the origins of language.
Subliminal influences on your decisions
Why do people named Penelope prefer Pepsi but people named Chris prefer Coke? Could the temperature of my cup of coffee really affect how much I like you later? In this class we will discuss surprising evidence that people's decisions and desires are affected by very subtle changes in the environment around them, and will ultimately discuss whether the decisions we make are really our own.
How The Mind Works: A Tour of Awesome Findings in Psychology
Why are our memories sometimes false, and our perceptions fooled by illusions? When should you prefer not to get paid for your work? Could the temperature of my cup of coffee really affect how much I like you later? How could your decisions, your desires, and even your test scores be affected by subtle changes in your environment? In this class we will discuss surprising and important psychology findings that will change the way you think about experiences in your everyday life.
Genetic and environmental influences on the mind and brain
What makes us the people we are? For thousands of years, the "nature versus nurture" debate has raged on. We'll see how we can actually measure the effects of heritability and environment, and begin to address how our genes interact with experience to produce mind, brain and behavior. We'll look at cool studies comparing identical and fraternal twins, talk about the evolution of human cognition, ask what innateness and heritability really mean, and discuss the philosophical questions that these issues bring up.
What do babies know?
What kinds of knowledge are we born with, and what do we learn as young babies? In this class we will discuss everything from why peek-a-boo works so well to the origins of morality. Do babies know the laws of physics -- and how could we tell if they did? Do babies know that some people are nicer than others? Can they do basic math at only 6 months of age?