Abstract
This study examined whether the average usability score for a series of tasks matched the usability score obtained when usability was measured only once, after all the tasks had been completed. Fifty participants completed a set of tasks for five websites and fourteen mock voting ballots. Subjective usability was assessed with the System Usability Scale (SUS). Participants completed the SUS either after each task (five or fourteen SUS administrations, respectively) or after completing the entire set of tasks (one SUS administration). The results show that the average SUS scores for the task-level assessments were significantly higher than the SUS scores for the test-level assessments, and results were similar for the website and ballot conditions. Task-level SUS scores for the websites (M = 65.5) were significantly higher than the test-level SUS scores (M = 42.8), p < 0.0001. Similar results were observed in the ballot condition, where task-level usability assessments were higher (M = 59.5) than test-level assessments (M = 38.5), p < 0.0001. Practitioners and those interpreting SUS scores need to be aware of how these methodological differences can lead to different assessment metrics.
