The term "usability" can mean a variety of things. At GitLab, we use the following definition of usability when we conduct user research and design our product:
To be usable, an interactive system should be useful, efficient, effective, satisfying, and learnable.
Usefulness is the degree to which our product enables users to achieve their goals.
Efficiency is how quickly users can accomplish their goals accurately and completely.
Effectiveness refers to the extent to which the interactive system behaves in the way that users expect it to and the ease with which users can use it to do what they intend.
Learnability is a part of effectiveness and refers to the user’s ability to operate the interactive system to some defined level of competence after a predetermined amount and period of training (which may be no time at all). It can also refer to the ability of infrequent users to relearn the system after periods of inactivity.
Satisfaction refers to the user’s perceptions, feelings, and opinions of the product.
Note: This definition is based on information from the Handbook of Usability Testing and the International Usability and UX Qualification Board curriculum.
Usability testing is the process of evaluating a product experience with representative users. The aim is to observe how users complete a set of tasks and to understand any problems they encounter. Since users often perform tasks differently than expected, this qualitative method helps to uncover why users perform tasks the way that they do, including understanding their motivations and needs. At GitLab, usability testing is part of solution validation.
We also conduct regular Usability Benchmarking studies at GitLab. These are also focused on usability and are used to set performance and UX benchmarks for specific tasks and workflows across GitLab. As such, they are much more rigorous and time-consuming than a typical usability test.
Generally speaking, we can differentiate between:
Moderated versus unmoderated usability testing
Moderated tests have a moderator present who guides participants through the tasks. This allows the moderator and participant to have a conversation about the experience, which helps to find answers to “Why?” questions.
Conversely, users complete unmoderated usability tests on their own without the presence of a moderator. This is helpful when you have a very direct question.
| Moderated usability testing | Unmoderated usability testing |
|---|---|
| Complex questions/“Why?” questions, such as: “Why do users experience a problem when using a feature?” “Why are users not successful in using a feature?” “How do users go about using a feature?” | Direct questions, such as: “Do users find the entry point to complete their task?” “Do users recognize the new UI element?” “Which design option do users prefer?” |
Formative versus summative usability testing
Formative usability tests are continuous evaluations to discover usability problems. They tend to have a smaller scope, such as focusing on one specific feature of a product or just parts of it. Oftentimes, prototypes are used for evaluation.
Summative usability tests tend to be larger in scope and are run with the live product. They are useful if you lack details on why users are experiencing a particular problem, or if you want to validate how easy the product is to use in the first place.
To learn more about these factors, see our definition of usability.
Usability testing happens before you conduct a Category Maturity (CM) Scorecard. Usability testing helps to identify the majority of problems users encounter, the reasons for these issues, and their impact when using GitLab, so we know what to improve.
If you run a CM Scorecard without prior usability testing, you will likely identify some of the usability problems users experience. However, the effort and rigor that the Category Maturity Scorecard requires to objectively measure the current maturity of the product don’t justify skipping usability testing. In addition, during usability testing you have opportunities to engage with participants directly and dive deeper into understanding their behaviors and problems. The Category Maturity Scorecard process does not allow for such interactions because it’s designed to capture unbiased metrics and self-reported user sentiment.
Why should I not ask in a usability test whether a participant completed a task successfully? It’s a self-reported measure that may or may not be accurate. Even if a participant indicates they completed a task successfully, you still need to verify that they actually did, based on how you defined success. It’s important to capture only necessary data from users, and because you need to double-check their response against actual behavior anyway, we suggest leaving the question out in the first place.
Why should I not use UserTesting.com’s native task difficulty metric to assess Efficiency? Isn’t it the same? UserTesting.com’s task difficulty metric is similar, but it uses a different rating scale than the Single Ease Question. Currently, there is no option to change the scale labels in UserTesting.com to align with ours.
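To illustrate why the scale matters, here is a minimal sketch of how Single Ease Question responses are typically summarized. It assumes the standard 7-point SEQ scale (1 = Very Difficult, 7 = Very Easy); the task names and ratings are hypothetical and are not part of our scorecard tooling.

```python
# Minimal sketch: summarizing Single Ease Question (SEQ) responses per task.
# Assumes the standard 7-point SEQ scale (1 = Very Difficult, 7 = Very Easy).
# Task names and ratings below are hypothetical, for illustration only.
from statistics import mean

seq_ratings = {
    "Create a merge request": [6, 7, 5, 6, 7],
    "Configure a pipeline": [4, 3, 5, 4, 2],
}

for task, ratings in seq_ratings.items():
    # The mean SEQ score is the usual summary statistic for perceived ease of a task.
    print(f"{task}: mean SEQ = {mean(ratings):.1f} (n={len(ratings)})")
```

Because a metric collected on a different rating scale can’t be averaged or compared directly with SEQ scores like these, mixing the two would make benchmarks inconsistent.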