It's hard to avoid acquiescence bias when evaluating an interface: people will say they like something even after failing to complete basic tasks. Rating design is subjective, and people frequently feel they are being put under pressure, rightly or wrongly, to rate aesthetics. Tags that capture emotive responses, or feedback that is task-centric, let us better measure actual success. Read on for some recommendations that have been around for a while and may help with this.
I do feel like this area of usability, especially in the enterprise, is a work in progress. We have likely over-rotated on aesthetics due to mass frustration with archaic interfaces, and may need to re-balance priority against business process, perhaps taking some cues from social activity streams, rather than just focusing on 'small d' design as a way to solve adoption.
Most usability tests culminate in a short questionnaire that asks the participant to rate, usually on a 5- or 7-point scale, various characteristics of the system. Experience shows that participants are reluctant to be critical of a system, no matter how difficult they found the tasks. This article describes a guided interview technique, based on a word list of over 100 adjectives, that overcomes this problem. — David Travis, March 3, 2008, updated 22 July 2009.
These dimensions of usability come from the International Standard, ISO 9241-11, which defines usability as:
"Extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use."
The ISO definition of usability makes it clear that user satisfaction is just one important dimension of usability. People may be well disposed to a system but fail to complete business-critical tasks with it, or do so in a roundabout way. The three measures of usability — effectiveness, efficiency and satisfaction — are independent, and you need to measure all three to get a rounded measure of usability.
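To make the independence of the three measures concrete, here is a minimal sketch (not from the article) of how they are often computed from session data. The metric definitions are common conventions, and the data is entirely hypothetical: effectiveness as task completion rate, efficiency as mean time on successfully completed tasks, satisfaction as the mean post-test rating.

```python
from statistics import mean

# Hypothetical per-task records for one participant: (completed, seconds_taken)
tasks = [(True, 95), (False, 240), (True, 130), (True, 110), (False, 300)]

# Hypothetical post-test questionnaire responses on a 7-point scale.
# Note how high these ratings are despite two failed tasks above —
# exactly the dissociation the article describes.
post_test_ratings = [6, 5, 7]

effectiveness = sum(completed for completed, _ in tasks) / len(tasks)
efficiency = mean(secs for completed, secs in tasks if completed)
satisfaction = mean(post_test_ratings)

print(f"effectiveness: {effectiveness:.0%}")  # share of tasks completed
print(f"efficiency: {efficiency:.0f}s mean time on successful tasks")
print(f"satisfaction: {satisfaction:.1f} / 7")
```

A participant set like this one scores 60% on effectiveness yet 6.0/7 on satisfaction, which is why reporting satisfaction alone can mask real task failure.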
In our studies, we notice that participants tend to rate an interface highly on a post-test questionnaire even when they fail to complete many of the tasks. I've spoken to enough of my colleagues at conferences and meetings to know that this problem is commonplace. Is this because we are about to give the participant £75 for taking part in a test session or is there something else at work? For example, one group of researchers makes this point:
"In studies such as this one, we have found subjects reluctant to be critical of designs when they are asked to assign a rating to the design. In our usability tests, we see the same phenomenon even when we encourage subjects to be critical. We speculate that the test subjects feel that giving a low rating to a product gives the impression that they are "negative" people, that the ratings reflect negatively on their ability to use computer-based technology, that some of the blame for a product's poor performance falls on them, or that they don't want to hurt the feelings of the person conducting the test." - Wiklund et al (1992).
Once you ask participants to assign a number to their experience, their experience suddenly becomes better than it actually was. We need some way of controlling this tendency.

Read more at www.userfocus.co.uk
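The adjective word-list approach can be operationalised simply: each participant picks a handful of words after the session, and the researcher tallies how often each word, and each valence, was chosen. A minimal sketch, using an invented mini word list and hypothetical selections (the real technique uses over 100 adjectives):

```python
from collections import Counter

# Hypothetical selections: each inner list is one participant's chosen words.
selections = [
    ["efficient", "confusing", "professional"],
    ["confusing", "slow", "professional"],
    ["efficient", "frustrating", "confusing"],
]

# Invented valence labels for the illustrative words.
negative = {"confusing", "slow", "frustrating"}

# Tally word frequencies across all participants.
counts = Counter(word for chosen in selections for word in chosen)

# Share of all selections that were negatively valenced.
neg_share = sum(n for w, n in counts.items() if w in negative) / sum(counts.values())

print(counts.most_common(3))
print(f"negative selections: {neg_share:.0%}")
```

Because participants choose words rather than assign scores, critical signal ("confusing" chosen by every participant here) surfaces without anyone feeling they have given the product a bad mark.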