When the System Usability Scale Actually Works
Using SUS to evaluate a retail POS system taught me that the right metric at the right time gives you leverage, especially when users have a shared baseline for comparison.
I don't reach for the System Usability Scale often, but when I do, it's because the conditions are right. Back in 2015, I was leading the design of a new POS system for retail stores, and SUS gave us exactly what we needed: a way to track usability at scale while the team learned the new system.
What made this work
The System Usability Scale is a ten-question survey that gives you a score out of 100 for perceived usability. It's quick, it's validated, and it's held up remarkably well since John Brooke created it at Digital Equipment Corporation in 1986. You can read the full methodology on usability.gov or dig into Jeff Sauro's research on interpreting SUS scores.
The ten questions alternate between positive and negative statements, all rated on a five-point scale from "Strongly Disagree" to "Strongly Agree":
| # | Question |
|---|---|
| 1 | I think that I would like to use this system frequently |
| 2 | I found the system unnecessarily complex |
| 3 | I thought the system was easy to use |
| 4 | I think that I would need the support of a technical person to be able to use this system |
| 5 | I found the various functions in this system were well integrated |
| 6 | I thought there was too much inconsistency in this system |
| 7 | I would imagine that most people would learn to use this system very quickly |
| 8 | I found the system very cumbersome to use |
| 9 | I felt very confident using the system |
| 10 | I needed to learn a lot of things before I could get going with this system |
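Scoring is easy to get wrong: odd-numbered (positive) items contribute their response minus one, even-numbered (negative) items contribute five minus their response, and the sum is multiplied by 2.5 to land on the 0–100 scale. Here's a minimal sketch in Python; the function name and validation are mine, the arithmetic is Brooke's:

```python
def sus_score(responses):
    """Compute a SUS score from ten responses on a 1-5 scale.

    Items 1, 3, 5, 7, 9 are positively worded; items 2, 4, 6, 8, 10
    are negatively worded, so their contributions are reversed.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    contributions = [
        r - 1 if i % 2 == 0 else 5 - r  # even index = odd-numbered item
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Example: a respondent who mostly agrees with the positive items
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 2]))  # 82.5
```

Note that the result is a score out of 100, not a percentage; Sauro's research puts the average across studies at around 68, which is a better anchor for interpreting a raw score.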
But knowing what SUS is doesn't tell you when to use it. I think it worked for us because we had something most research projects don't: a closed comparison.
The old system was a dated green-and-black screen terminal with no visual content. The kind of interface where you memorise codes and keyboard shortcuts because there's nothing else to guide you. Our teams had used it for years. They knew exactly what frustrated them about it, and they were motivated to help us improve things.
When they filled out the SUS questionnaire, they weren't comparing our new system to some abstract ideal. They were comparing it to what they used every day. That context made the scores meaningful in a way I haven't seen since.
What we learned from the curve
We ran SUS surveys with 20 to 30 POS users after training sessions throughout the project. People filled them out individually rather than as a group, which gave us per-person scores we could track round over round as the team learned the new workflows.
The scores went up over time, which you'd expect. But the pattern of outliers told us more than the averages. When someone scored the system significantly lower than their peers, we could follow up directly. Those conversations surfaced insights we wouldn't have found otherwise.
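We did those follow-ups by eye rather than by formula, but the idea is easy to sketch: flag anyone whose score sits well below their peers in a given survey round. A minimal version in Python, using a standard Tukey fence (the 1.5-IQR threshold is a common convention, not something we formally applied):

```python
import statistics

def flag_low_outliers(scores_by_person, k=1.5):
    """Flag respondents whose SUS score falls more than k interquartile
    ranges below the first quartile (the lower Tukey fence)."""
    scores = list(scores_by_person.values())
    q1, _, q3 = statistics.quantiles(scores, n=4)
    fence = q1 - k * (q3 - q1)
    return {name: s for name, s in scores_by_person.items() if s < fence}

# Example round: one respondent well below an otherwise happy group
round_scores = {"A": 72.5, "B": 75.0, "C": 77.5, "D": 80.0,
                "E": 82.5, "F": 85.0, "G": 45.0}
print(flag_low_outliers(round_scores))  # {'G': 45.0}
```

With only 20 to 30 respondents per round, any statistical fence is rough; the point is a consistent trigger for a follow-up conversation, not significance.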
One thing that emerged clearly: power users needed keyboard shortcuts. POS work is repetitive, and speed matters. The visual interface helped new users get oriented, but experienced staff wanted to fly through transactions without touching the mouse. We hadn't prioritised shortcuts early enough, and the SUS outliers flagged that gap before we rolled the system out to stores.
When I'd use SUS again
This project had specific conditions that made SUS valuable. We had a consistent user group with shared context. We were tracking usability over time as people learned the system. And we had the capacity to follow up on outliers, which turned quantitative scores into actionable insights.
I'd reach for SUS again if I'm running research at scale where I expect varied responses. The standardised format makes it easy to spot patterns and outliers across a larger sample. But I wouldn't use it as a standalone metric. The score is useful, but the conversations it prompts are what actually improve the design.
If you're working with a small group, or your users don't have a shared baseline for comparison, I think you're better off with qualitative methods. SUS shines when you need to make sense of feedback from dozens of people, not when you're having detailed conversations with five.
The real value
What I took from this project wasn't that SUS is a magic tool. It's that the right metric at the right time gives you leverage. We caught adoption issues before launch because the survey created a feedback loop we could act on. The teams felt heard, we improved the system, and everyone was better prepared when we rolled it out.
That closed ecosystem—users with a shared baseline, clear motivation to improve things, and a timeline that let us iterate—made SUS work. Without those conditions, I'd probably choose something else.