When you learned about language sample analysis in grad school, you probably heard that sample length may vary but that you should aim for about fifty utterances. In the “olden days” we were shooting for 15 minutes or 100 utterances, whichever came first. I’ll gloat: our own research was key in establishing the guideline of 50 complete, intelligible, and verbal utterances. So, where does that number come from? When is it OK for you to bend the line a bit? And what should you do with those fifty utterances?
We are balancing clinical feasibility against robustness of results. Research suggests that short samples, when properly elicited, yield robust results. So, you could think of fifty utterances as your minimum, depending upon the sample context. Honestly, though, there are a lot of factors that influence language sample elicitation, and sample length isn’t going to compensate for loose elicitation. Fifty utterances will only provide robust results if, during that time, you successfully elicit the relevant elements of the target speaker’s language production.
First, the impact of sample length varies by sampling context. For our purposes here, we can break this down into two categories: narrative-dependent and open-ended.
SALT incorporates four narrative-dependent contexts: Speaker Selects the Story, Narrative Story Retell, Persuasion, and Expository. All four require the speaker to tell a complete story, whether that is a story from a picture book, or the full set of rules on how to play a game or sport. It is important that the speaker goes through the entire narrative, and that the entire narrative is included in the analysis, particularly if you are planning on using a SALT reference database for comparison. So, the narrative should be at least fifty utterances, but, depending on your speaker, it may have to be quite a bit longer. Again, it is important that you do not stop mid-task.
In contrast, play and conversation contexts are open-ended; there is no structured beginning, middle, or end. As such, they are more flexible. With play and conversation samples, you should focus on trying to elicit both:
- a sample that reflects the speaker’s typical language production, and
- a sample that challenges the speaker’s abilities and draws out the issues of concern.
Fifty utterances is your minimum target here, as well. But with these open-ended contexts, once you have hit both your language targets and the minimum length, feel free to end the sample.
If, however, the speaker’s language is improving (or changing) during the language sample – perhaps due to improving rapport, or a topic that really sparked production – then, by all means, keep eliciting. In SALT, you always have the option of using the Transcript Cut (T-Cut) feature to analyze the sample. The T-Cut function determines the section of the transcript to include in analysis. For example, maybe the beginning of the sample does not reflect what the speaker can, and typically does, do. With the transcript cut option, you could remove (essentially block) that section of the transcript from the analysis. There are lots of good reasons to play around with the T-Cut. I digress; more on that another day.
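If it helps to picture what a transcript cut does, here is a minimal sketch in Python. It is not SALT's implementation; the `cut_transcript` helper and the sample utterances are made up for illustration, and the only idea it borrows from above is that a section of the transcript is set aside while the rest goes into the analysis.

```python
def cut_transcript(utterances, start=0, end=None):
    """Return the section of the transcript kept for analysis.

    Hypothetical helper: `start` and `end` are utterance positions
    (0-based, end-exclusive), not SALT settings.
    """
    return utterances[start:end]


# Made-up sample: the first two utterances are warm-up talk that does not
# reflect what the speaker typically does.
sample = [
    "hi",
    "um I dunno",
    "we went to the park",
    "my dog chased a squirrel",
    "then he got stuck in the mud",
]

# Block the warm-up section and analyze the rest.
analysis_set = cut_transcript(sample, start=2)
print(len(analysis_set), "utterances kept for analysis")
```

The original transcript stays intact; only the set of utterances handed to the analysis changes, which is why it is safe to experiment with different cut points.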
A word of caution about these open-ended contexts, though: they are particularly vulnerable to examiner influence. And influence can be interpreted as “interference,” “intrusion,” or “impact,” all things to avoid when taking a language sample. Check back here next month for our guidelines on successful sample elicitation!
Measure of Concern
Second, the impact of sample length varies by measure. A short language sample, for example, may not provide a complete inventory of the speaker’s language repertoire. Some measures are more affected by length, including but not limited to:
- Number of different words produced (vocabulary diversity);
- Number of errors produced;
- Number of omissions produced (part word or full word);
- Number of repetitions, revisions, and/or filled pauses produced (mazes).
If you suspect that the speaker’s problem areas are reflected in one or more of these measures, try to collect a longer or more comprehensive sample. The longer or more complete the sample, the more opportunity for these linguistic features to show and be captured for analysis and interpretation.
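To see why a count-based measure is sensitive to length, here is a rough sketch that tracks the number of different words as more utterances are added. It is a simplification under stated assumptions: words are split on letters and apostrophes and compared as whole lowercase forms, which is not how SALT identifies words, and the function names and sample utterances are hypothetical.

```python
import re


def number_of_different_words(utterances):
    """Count distinct word forms across utterances (simplified tokenizer)."""
    words = set()
    for utterance in utterances:
        words.update(re.findall(r"[a-z']+", utterance.lower()))
    return len(words)


def ndw_by_length(utterances, step=2):
    """Return (utterance count, different words) pairs for growing samples."""
    return [
        (n, number_of_different_words(utterances[:n]))
        for n in range(step, len(utterances) + 1, step)
    ]


# Made-up utterances for illustration only.
sample = [
    "the dog ran",
    "the dog ran away",
    "then the boy looked for the dog",
    "he looked in the woods",
]

for n, ndw in ndw_by_length(sample, step=2):
    print(f"{n} utterances: {ndw} different words")
```

The count can only stay flat or rise as utterances are added, so two speakers with the same vocabulary can look different simply because one sample is longer.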
If you aren’t sure what the problem is, if you speculate that there might be multiple issues, or if you are unsure whether there is a problem at all, try to collect a longer and more comprehensive sample. In all of these cases, getting a fuller picture of the speaker’s repertoire is necessary.
So, what does sample length mean for comparing your sample to a SALT database? Again, it depends on the elicitation context! In general, it is best to compare your target speaker’s transcript against transcripts of similar length. This is because, as discussed above, some measures are highly affected by transcript length. SALT has built-in defaults that make the decision a little easier.
With narrative-dependent sampling contexts (narrative, expository, and persuasion), SALT’s default is to base analysis reports on the “entire transcript” since these sampling contexts follow a structured format with an obvious beginning, middle, and end. In these elicitation contexts, speakers use a book or a planning sheet to structure their language. The structure of these sampling contexts is more important than the length. For that reason, you will need to elicit and transcribe the whole sample. However, it’s really worth the time! There are additional analyses that you can apply to your transcript (e.g., subordination index, scoring schemes) that generate so much meaningful data.
With open-ended play and conversation samples, it is best to compare the sample to the database based on length. When using the SALT reference database, you have to choose how to equate your transcript to the database transcripts and what the subsequent reports are based on (e.g., length or entire transcript). If your transcript is lengthy, you may want to implement a transcript cut (T-Cut, as mentioned earlier).
If your open-ended sample is short on utterances, we recommend that you equate on the same number of utterances, and make a note of the short sample length in your documentation.
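As a rough illustration of what equating on the same number of utterances means, the sketch below trims each comparison transcript to the length of the target sample. This is an assumption-laden toy, not SALT's procedure: taking the first N utterances and skipping comparison transcripts that are shorter than the target are both choices made here for simplicity, and `equate_by_utterances` is a hypothetical name.

```python
def equate_by_utterances(target, references):
    """Trim comparison transcripts to the target's utterance count.

    Assumptions for this sketch: the first N utterances are used, and
    comparison transcripts with fewer than N utterances are skipped.
    """
    n = len(target)
    return [ref[:n] for ref in references if len(ref) >= n]


# Made-up data: a short target sample and three comparison transcripts.
target = ["t-1", "t-2", "t-3"]
references = [
    ["r1-1", "r1-2", "r1-3", "r1-4"],
    ["r2-1", "r2-2"],
    ["r3-1", "r3-2", "r3-3"],
]

matched = equate_by_utterances(target, references)
print(len(matched), "comparison transcripts, each", len(target), "utterances long")
```

Once every transcript covers the same number of utterances, length-sensitive measures are compared on equal footing.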
The Bottom Line
Would a 100-utterance sample be better than a 50-utterance sample? Well, “better” implies a value judgement. One hundred utterances would likely give you more robust results. But they would also take longer to elicit and transcribe, making that sample less feasible for many clinicians. Fifty utterances, properly elicited, will give sufficient information. But that “properly elicited” caveat is really important.
In the end, we are trading off robustness against clinical feasibility. And so, we are right back where we often end up: you need to use your clinical judgement. Did the language sample give you the answers to your clinical questions? Did you elicit enough language to get a full narrative and challenge the speaker’s language system? Did you minimize examiner influence during the sample? If you can answer yes to these questions, then you can feel confident in a shorter sample.
John Heilmann, Ann Nockerts, and Jon F. Miller. “Language Sampling: Does the Length of the Transcript Matter?” Language, Speech, and Hearing Services in Schools, vol. 41, iss. 4, October 2010. https://pubs.asha.org/doi/10.1044/0161-1461%282009/09-0023%29
Mollee J. Pezold, Caitlin M. Imgrund, and Holly L. Storkel. “Using Computer Programs for Language Sample Analysis,” Language, Speech, and Hearing Services in Schools, vol. 51, iss. 1, January 2020. https://pubs.asha.org/doi/pdf/10.1044/2019_LSHSS-18-0148