Why Short-Term Memory Matters
The first post in a series exploring memory and the Simple Model of Teaching
This post is the first in a short series about the Simple Model of Teaching.

As we prepare to launch Steplab across our school this autumn, we’ve been spending time with its Simple Model of Teaching (Figure 1): a clear, structured framework that supports planning, coaching and professional development. It’s designed to help teachers focus on what matters most and it’s a helpful tool. But like any model, it works best when we bring our own knowledge to it…not just of learning, but of learners.
This series is an attempt to do just that. Writing, for me, is a way of thinking aloud and an opportunity to reflect on how best to implement this work before we roll it out. I am grateful that the summer holidays, in between time with friends, family…and Expedition 33 (!), offer space for that kind of reflection.
So, this series aims to be a companion to the Simple Model of Teaching. A way of thinking alongside it and of filling it out.
We begin with short-term memory: a step that doesn’t appear in the diagram, or the one it’s adapted from, but one that underpins the rest.
It’s where deliberate learning begins.
What is short-term memory?
Short-term memory (STM) refers to the brief, temporary holding of a small amount of information, typically for just a few seconds. It’s limited in both capacity and duration, and unless the information is rehearsed, it tends to fade quickly.
Some models treat STM as a separate system; others see it as overlapping with working memory, which I’ll explore in my next post. Either way, the distinction matters. Unlike working memory, which involves actively manipulating information, short-term memory simply holds it in mind. It doesn’t solve the problem; it just keeps the pieces available.
STM is what allows a child to:
remember the first part of your sentence while you’re still speaking the last
copy a diagram from the board
hold the digits of an equation just long enough to write them down
It’s often overlooked, as it has been in the model we’re discussing, but if information isn’t held, it can’t be worked with.
Everyone stores the world differently
When we look out at the world, we don’t see the world.
We perceive a model of the world.
And we all have our own models shaped by experience, expectation, and what we notice.
That’s helpful to remember when you’re trying to get 30 different kiddos, with 30 different models of the world, to understand the same one model of tectonic plates.
Because short-term memory doesn’t store objective reality. It stores what was attended to, understood, or rehearsed. In that sense, it’s as much about perception as it is about memory.
And perception isn’t neutral.
Two children hear the same instruction. One acts. One hesitates. It’s not always about motivation or comprehension. It might be about what was stored, and how.
And that varies.
What if the classroom feels unsafe?
There’s an image I keep coming back to: a comparison of eye-tracking scans between a non-artist and an artist viewing the same scene. The difference is striking. The artist’s gaze moves widely, taking in the periphery, the shape of negative space. The non-artist’s gaze clusters tightly around the central subject. See Figure 2, below.

Now imagine that same idea in the classroom.
Some children, particularly those who’ve experienced trauma, may not scan the room to take in information. Instead, without even realising, they scan for threat.
Eye-tracking research shows that people with trauma and anxiety are drawn rapidly and automatically toward threat, and once their attention lands, it struggles to shift. See Figure 3 below. The system isn’t faulty. It’s protective.

And if attention is pulled toward risk, then short-term memory follows. In cognitive theory, STM typically stores only what has been perceived, and perception follows attention. It’s not defiance, or carelessness, or poor listening, or poor memory. It’s the brain doing exactly what it learned to do: watch for danger.
The system’s not failing.
It’s protecting.
STM isn’t measured in a vacuum
In research and assessment contexts, short-term memory is often measured using tasks such as digit span, where a pupil is asked to repeat a string of numbers in order. It’s a simple way to estimate how much verbal information can be held briefly in mind.
It’s widely used, too: the WISC-V (Wechsler Intelligence Scale for Children, Fifth Edition) uses digit span to help measure STM.
But like all assessments, digit span isn’t neutral. It reflects not just memory capacity, but also language familiarity, cultural experience, and strategies shaped by context.
This matters, especially when we’re working with pupils from diverse linguistic backgrounds.
It’s well established that digit span varies across languages, and that this variation is partly explained by the spoken length of number words. In some languages, such as Mandarin, number words are shorter and quicker to articulate than in others, such as Welsh, which makes them easier to rehearse in the mind before they fade. The faster we can repeat something to ourselves, the more of it we can keep hold of.
In cognitive terms, that silent repetition is known as subvocal rehearsal, or more informally, the “inner voice”. It’s part of the phonological loop, and it’s one of the main strategies we use to maintain information in verbal short-term memory. It’s probably not very effective as a strategy, but that’s beyond the scope of this post.
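To make the link between speed and span concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, as a simplification, that we can hold roughly as much as we can say to ourselves in about two seconds. The two-second window and the speaking rates are illustrative assumptions of mine, not figures from the research cited in this post.

```python
# Illustrative sketch only: the 2-second window and the articulation rates
# below are assumed round numbers, not measurements from any study.

REHEARSAL_WINDOW_SECONDS = 2.0  # assumed time before an unrehearsed item fades


def estimated_span(digits_per_second: float,
                   window_seconds: float = REHEARSAL_WINDOW_SECONDS) -> float:
    """Estimate how many digits can be silently repeated before they fade."""
    return digits_per_second * window_seconds


# Hypothetical articulation rates for two unnamed languages.
faster_language = estimated_span(digits_per_second=4.0)
slower_language = estimated_span(digits_per_second=3.0)

print(f"Faster number words: about {faster_language:.0f} digits")
print(f"Slower number words: about {slower_language:.0f} digits")
```

The numbers themselves don’t matter much. What matters is the shape of the relationship: with the same rehearsal window, a pupil whose number words take longer to say can refresh fewer of them before they fade.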
But it’s not just about speed.
Research suggests that some language groups also draw on different memory strategies (Baddeley, Xu, Ho, & Hitch, 2023). Speakers of Mandarin and Cantonese tend to have a substantially greater immediate verbal memory span than speakers of other languages. They also continue to show a phonological similarity effect (the tendency to confuse similar-sounding words) even when silent rehearsal is blocked. This effect typically disappears under articulatory suppression, so its persistence here suggests that they may be drawing on more than just an “inner voice.” They appear to use an “inner ear” as well, maintaining sounds through auditory imagery as well as articulation.
This dual coding (drawing on both articulation and auditory imagery) may reflect the demands of learning a complex writing system. But it also reminds us that verbal memory strategies are not universal. They’re shaped by language, culture, and the ways we’ve learned to process sound and meaning.
For pupils learning English as an Additional Language (EAL), especially those still developing fluency, there is often an additional processing demand at the point of input. Before verbal information can be stored, it first has to be decoded, and when the language of instruction is unfamiliar or still being mapped, that decoding process draws on valuable cognitive resources.
In practice, this means working memory is being engaged earlier, just to access the input, before short-term memory can even begin to hold it.
It also means that tasks which assume instant access to language, whether in assessment or classroom dialogue, can underestimate how much effort is already being expended just to hear, segment, and interpret what’s being said.
Supporting children at this stage means recognising that rehearsal and retention are not always immediate. Clear, paced input and space for repetition aren't just helpful: they make memory (and therefore learning) possible.
Where short-term memory fits
The Simple Model of Teaching doesn’t show short-term memory.
And yet, in practice, it’s everywhere. It sits inside every arrow, every step, every well-chosen question. Before a pupil can work with an idea, they must first hold it, even if just briefly. And that brief holding depends on more than just teaching technique, as this post has begun to explore. There’s even more to it…but that’s for another time.
If we were to make space for short-term memory in the model, I wouldn’t wedge it in as a separate box. I’d thread it through, perhaps between what the model labels as “securing attention” and “optimising communication”.
It’s like the hinge between perception and manipulation.
And, referring to the diagram, when we reduce that moment to a question like “Has the teacher got students’ attention?” we risk flattening it. That phrase makes attention sound simple, as if it is something the teacher either captures or doesn’t. It implies control. Compliance. That everyone is starting from the same place.
But attention is not a switch. It’s variable, fragile, and contingent on emotion, language, sensory access, fatigue, and context. And it certainly doesn’t always move through the eyes, despite what diagrams (or practices such as SLANT) might suggest.
A better question might be:
Has attention been invited, supported, and sustained?
Or:
Are conditions in place for attention to be possible?
These kinds of questions shift the gaze from control to co-regulation.
The same goes for curriculum. “Has the teacher selected the right ideas to teach?” risks sounding hierarchical, as if the job is simply to choose, rather than to connect.
We might ask instead:
Do the ideas connect to prior knowledge, lived experience, or future learning?
These are more than semantic tweaks. They shape how we interpret what we see. And they remind us that models are starting points, not prescriptions.
Short-term memory isn’t always something we can observe directly.
But when we think with it in mind, we tend to notice more.
And that often changes what we do next.
If you'd like to keep reading...
The next post will look at working memory.
Thank you for reading.
References
Armstrong, T., & Olatunji, B. O. (2012). Eye tracking of attention in the affective disorders: A meta-analytic review and synthesis. Clinical Psychology Review, 32(8), 704–723. https://doi.org/10.1016/j.cpr.2012.09.004
Baddeley, A. D., Xu, Z., Ho, S. T., & Hitch, G. J. (2023). On verbal memory span in Chinese speakers: Evidence for employment of an articulation-resistant phonological component. Journal of Memory and Language, 129, 104389. https://doi.org/10.1016/j.jml.2022.104389
Mccrea, P. (2024, January 17). A simple model of teaching [Tweet]. Twitter/X. https://twitter.com/PepsMccrea/status/1747685287633086862 Image adapted from original by Oli Cav.
Vogt, S., & Magnussen, S. (2007). Expertise in pictorial perception: Eye-movement patterns and visual memory in artists and laymen. Perception, 36(1), 91–100. https://doi.org/10.1068/p5262

