Usability Evaluation Methods

Abstract

Usability is becoming a decisive factor in the quality of finished software products (Abran et al., 2003). The growing number of websites on the internet has raised awareness of the usability of web applications. According to Bevan (n.d), an organization meets its own needs only when its website meets the needs of its intended users. However, designers more often than not take end users for granted and assume that a software product is easy to use as long as they and their colleagues can use it, which is rarely the case. Sharp et al. (2007) note that users want interactive products that are effective, efficient, safe, satisfying to use and easy to learn. It is therefore imperative for designers to perform usability evaluation to ensure that the software products they develop are usable by their intended users and meet their needs.

Introduction

This paper consists of two main tasks. The first begins with a brief introduction to the concept of usability. The process of evaluating the usability of software products is then discussed and its main approaches are identified. Each approach is examined together with the criteria for selecting methods. Finally, one evaluation method from each approach is selected and its advantages and disadvantages are elaborated upon.

The second task involves performing a heuristic usability evaluation of an interactive website using Jakob Nielsen’s heuristics. The process identifies usability problems on the website based on the heuristics it does not conform to. An analysis of the results is then provided and a conclusion is drawn about the overall usability of the website with regard to the perceived severity of the problems identified.

Part I: Analytical and Empirical Methods for Usability Evaluation

Usability Concept

Usability can be defined as the degree of effectiveness, efficiency and satisfaction with which designated users attain specified goals when using a product in a particular context (ISO, in Te'eni et al., 2007). In effect, ISO portrays usability as a measure of how easy product interfaces are to use. From Jakob Nielsen’s point of view, usability is a quality attribute defined by five components, namely learnability, efficiency, memorability, errors and satisfaction (Nielsen, n.d). Nielsen’s elaboration implies that the ease of using a product can be judged by assessing the product against these five components. The entire process of assessing a product for ease of use is what is referred to as usability evaluation. According to Te'eni et al. (2007), the foremost criterion evaluators look for during usability evaluation is whether the product meets the users’ needs. Then comes the check for simplicity and for how pleased the users of the product are. This exercise helps developers learn where the product falls short so that they can fix its faults before release and, in other situations, identify areas for improvement.

Evaluating Usability

Considering the role of usability in today’s software development process, several methods have been developed to assist its evaluation. These methods can be categorized under two approaches, though other authors such as Baecker et al. (1995) summarize them into four groups, namely field strategies, experimental strategies, theoretical strategies and responsive strategies. The classification of Baecker et al. (1995) can, however, be narrowed down into the two main approaches identified by Faulkner (1998): the analytical and the empirical approach.

Analytical approach

The analytical approach is an umbrella classification for a collection of evaluation methods performed by experts who formally evaluate the tasks and goals of a software product (Sharp et al., 2007). Examples include heuristic evaluation, usability inspections such as the cognitive walkthrough and the pluralistic walkthrough, and predictive models such as the Task Semantic-Syntactic-Lexical (TSSL) model and user model-based analysis (Te'eni et al., 2007).

Empirical Approach

The empirical approach describes a collection of evaluation methods that require users’ participation during the evaluation process. It consists of analyzing user performance through the collection of data and facts while the user interacts with the system (Te'eni et al., 2007). Data collected from such methods are either quantitative, in the case of surveys and questionnaires, or qualitative, as in lab experiments and field studies. Examples include usability testing, experimental testing and field studies.

Choosing Among Methods

A number of factors influence the choice of usability evaluation method. According to Dillon (2001) and Arh and Blazic (2008), the information expected and the stage in the product’s lifecycle at which evaluation occurs play a leading role in the selection of the method. For example, the method chosen to evaluate the usability of a prototype before release will differ from that used to evaluate usability after release for upgrade purposes. This is because, when evaluating a prototype, the evaluators, who may also be part of the design team, look to identify whether user requirements have been correctly interpreted and incorporated into the design. Where the product is to be upgraded, on the other hand, only specific parts such as navigation attract focus, and the scope is therefore smaller than when evaluating a new product. According to Sharp et al. (2007), the examples above are classified as formative evaluation, which is aimed at ascertaining that the product continues to meet users’ requirements.

Methods can, however, be combined to obtain different perspectives. According to Sharp et al. (2007), this gives a broader picture of how well a product’s design meets the usability needs and user experience goals that were identified. As specified by Te'eni et al. (2007), evaluation is a continuous process, since products need to be evaluated throughout their lifecycle. Careful consideration is therefore needed when choosing a method for usability evaluation. The pros and cons of each candidate method have to be weighed against the resources available, so as to select a method or methods that provide the truest estimate and yet suit the evaluators’ standards and resources (Dillon, 2001). The following sections cover one method from each of the two approaches and weigh their advantages and disadvantages.

Usability Testing

Usability testing is a method of evaluating the usability of a product by testing it on its intended users (Faulkner, 1998). The most reliable way to estimate an application’s usability is to measure users’ performance on a set of pre-defined tasks (Mitchell, 2005). This allows examination of how adequately the product supports the intended users in their work. Sharp et al. (2007) also note that using this method at the concluding stages of design ensures consistency in navigation structure, in the use of terms and in how the system responds to the user. Usability testing can be broken down into two basic approaches: comparative usability testing, which tests a system’s design or performance by comparing it with an existing similar system, and absolute or explorative usability testing, where a new product is tested in isolation, usually before release (Faulkner, 1998). Usability testing is usually carried out in a laboratory where users are isolated from external interruptions such as phone calls or conversations with colleagues (Mitchell, 2005). The users are asked to perform a set of tasks with the system being evaluated. The number and kind of errors made are recorded, along with the speed of completion. The entire process may be recorded on video and deductions made thereafter. Besides performing the tasks, users are also interviewed or presented with a satisfaction questionnaire to gather their views of the system. In some cases, as stated by Dillon (2001), users are asked to view the recorded video and evaluate their own performance, describing their perceptions and actions in more detail.
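
The measurements described above (completion, speed and errors per task) can be summarised with a few lines of code. The following is a minimal sketch in Python; the `TaskResult` record, its field names and the sample numbers are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """One participant's attempt at one pre-defined task (hypothetical record)."""
    participant: str
    task: str
    completed: bool
    seconds: float
    errors: int


def summarise(results):
    """Return completion rate, mean time of completed attempts and mean error count."""
    if not results:
        raise ValueError("no results to summarise")
    completed = [r for r in results if r.completed]
    return {
        "completion_rate": len(completed) / len(results),
        # Time is averaged over successful attempts only; None if nobody finished.
        "mean_seconds": mean(r.seconds for r in completed) if completed else None,
        "mean_errors": mean(r.errors for r in results),
    }


# Illustrative data only, not measurements from a real session.
session = [
    TaskResult("P1", "find flight schedule", True, 40.0, 1),
    TaskResult("P2", "find flight schedule", True, 60.0, 0),
    TaskResult("P3", "find flight schedule", False, 120.0, 5),
    TaskResult("P4", "find flight schedule", True, 50.0, 2),
]
summary = summarise(session)
print(summary)
```

Averaging time over completed attempts only is one possible design choice; an evaluator might equally report times for failed attempts separately.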

Ideally, a usability test might involve a large number of users with the intention of uncovering all defects, but this is likely to waste resources, as identified by Nielsen (2000). Nielsen instead recommends no more than five users at a time during usability testing. Based on his study (Nielsen, 2000), tests uncover more errors, and allow more to be fixed, when run iteratively with small samples. Turner et al. (2006) add to Nielsen’s study by arguing that an adequate sample size depends strongly on the type of errors to be uncovered and their likely rate of occurrence. Sauro (2010) sums up these findings and, with further studies, concludes that the five-user rule applies only to discovering problems and is therefore not suitable for comparing interfaces or for estimating task completion time or completion rate. Applying usability testing to a product throughout its lifecycle has many advantages, some of which are identified below.
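
Nielsen’s small-sample argument rests on a simple problem-discovery model: if each test user independently reveals a given problem with probability L, then n users are expected to reveal a share 1 - (1 - L)^n of the problems. The sketch below evaluates that formula. The default L = 0.31 is the average rate Nielsen reports and is used here only as an assumption, since, as Turner et al. point out, the real rate depends on the kind of problem. The function name is illustrative.

```python
def share_of_problems_found(n_users, discovery_rate=0.31):
    """Expected share of usability problems found by n_users: 1 - (1 - L)**n."""
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= discovery_rate <= 1.0:
        raise ValueError("discovery_rate must be between 0 and 1")
    return 1.0 - (1.0 - discovery_rate) ** n_users


# Show how quickly the returns diminish as more users are added.
for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} users -> {share_of_problems_found(n):.0%} of problems")
```

With the assumed rate, five users are expected to reveal about 84 percent of the problems, whereas a lower rate of 0.10 would bring that down to about 41 percent, which illustrates why the adequate sample size depends on how often a problem occurs.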

Strengths

Generates accurate feedback of problems

The quality of the finished product depends on those who are going to use it. Once it satisfies their usability needs, the evaluators can rest assured that the job is well done. The involvement of users is the source of the richness of feedback in usability testing. It allows evaluators to gather insightful feedback on the spot and informs them of the actual problems users will encounter when using the product. Hence, usability testing is a trusted technique for gathering quality feedback that helps evaluators improve users’ interactive experience (Blast, n.d).

Estimates usability in a more realistic manner

Unlike evaluation methods that rely on experts to identify usability issues, usability testing is a real-life scenario of users actually using the product. This estimates usability more realistically than experts who assume user roles. The realistic nature of the technique measures users’ behavior, thereby enabling evaluators to understand and better support users’ motivations and goals (Blast, n.d).

Can be modified to fit other types of testing

Usability testing does not only look for interface design issues, as is usually perceived. As Nielsen defines it, a usable system must be easy to learn, allow productive work with minimal wasted resources, be easy to remember, be free from errors and be satisfying to users at an acceptable level (Nielsen, n.d). Applying usability testing to evaluate a product’s ease of use does not only measure its interface structure; it can be modified to cover function testing, system integration testing, unit testing, smoke testing and so on. The objective of usability testing is, however, kept in mind to ensure that every aspect of usability is evaluated (Parekh, n.d).

Easy to apply since it reveals what real tasks the users embark upon

Usability testing does not require experts to test the product, although expert testers would produce outstanding results by identifying not just problems but solutions as well. Selecting users for usability testing does not involve any intricate task, since any user who falls within the population of intended users qualifies. This widens the choice of users for evaluation without compromising quality and makes the method very easy to apply. Being able to choose ordinary testers rather than expert testers can also prove cost effective.

Capable of highlighting difficulties in real life usage

Developers are not perfect. A product is unlikely to be developed without difficulties the first time, which is why there is a need to test and be certain. Usability testing with users’ involvement is capable of uncovering potential bugs and product flaws that have escaped developers (Parekh, n.d). For example, what developers consider a flashy or creative design might turn out to be a problem for users.

Weaknesses

Despite its numerous advantages, a few disadvantages can be noted with this method, some of which include:

Testing does not fully comply with real life scenarios

For example, the environmental factors present in the testing lab may not be the same as those users would have in their own surroundings. As McGregor puts it, a mother will not have her two children running around as she would at home (McGregor, 2008). Furthermore, the test procedure may not cover all types of user groups.

Costly

Usability testing is carried out in laboratories, so a testing session requires the environment to be prepared. The cost involved in setting up the environment can be considerable. Though very effective, usability testing is rated as one of the most expensive methods of usability evaluation (Kern, 2008). A study that compared two evaluation methods on several systems found usability testing to be more costly than heuristic evaluation (Milszus, 1999).

The presence of observers may affect the users’ behavior

Unless users are expert testers, the presence of evaluators observing them while they test the product is likely to make them uneasy, which consequently affects their behavior. For example, users are likely to commit errors when rushing to complete tasks with the intention of impressing the evaluators. Such inconsistent behavior is likely to affect the data collected.

Usually requires a prototype when performed on a new product

Unlike heuristic evaluation, which can be employed throughout the development lifecycle of a new product, usability testing can only be carried out once a working prototype exists. For example, evaluating the analysis and design of a product at the elaboration phase with usability testing requires a working prototype to be developed, whereas an expert using heuristic evaluation can identify usability flaws without one.

Time consuming

One of the factors that influence the choice of evaluation method is how long it takes for results to be gathered. The laboratory nature of usability testing requires sessions to be run one after the other, and this sequential form of evaluation takes a long time to complete. The results then need to be analyzed, adding more time to the total evaluation process. A comparison of methods by Jeffries et al. (1991), cited in Dumas and Redish (1999), pointed out that a heuristic evaluation session consisting of four experts took 20 hours to complete, whereas usability testing on the same system took 200 hours. Wolfgang Milszus also stresses this point in his research, in which an evaluation that took 19 days with the heuristic method lasted 56 days with usability testing (Milszus, 1999). The evidence thus indicates that usability testing is more time consuming than other methods such as heuristic evaluation.

Heuristic Evaluation

It is sometimes difficult to find users fit for usability testing, not to mention the cost and time involved. In such cases, experts are the best available option for producing feedback on the usability of a product. Experts are people with strong background experience in HCI (Sharp et al., 2007). They inspect interactive systems against a set of guidelines they are provided with and identify problems users would have when using the system, sometimes playing the role of the users to do so. This method of inspection is referred to as heuristic evaluation, and the guidelines used are the heuristics (Sharp et al., 2007, Mitchell, 2005).

According to Te'eni et al. (2007), heuristic evaluation has become the most widely accepted usability inspection method owing to its ease of application, low cost, applicability in the early stages of development and ability to generate effective evaluations without the need for professional evaluators. The technique covers the evaluation of user interface elements such as dialog boxes, menus and navigation structure. When conducting the evaluation, a set of heuristics relevant to the product being evaluated is adopted. The experts are then briefed on what to do and provided with prepared scripts as a guide. During evaluation, each expert spends time inspecting the product according to the heuristics provided. Several experts take turns evaluating the system and sometimes have to assume the users’ role, depending on the product and the stage in the development process (Te'eni et al., 2007). A debriefing session is then conducted in which the experts discuss the problems discovered, prioritize them and come up with solutions. There is a variety of heuristics to choose from, for example Nielsen’s ten usability heuristics (Nielsen, 1994b), Norman’s rules from Design of Everyday Things (Norman, 1998), Tognazzini’s sixteen principles (Tognazzini, 1992) and Shneiderman’s eight golden rules (Shneiderman and Hochheiser, 2001). The choice of heuristics varies among products; most of these sets share similar views but differ in how they are organized (Miller, 2010). Jakob Nielsen and his colleagues pioneered the heuristic evaluation method, which was first developed in 1990 (Sharp et al., 2007). A revised version of the heuristic list can be found at the Useit.com website (Nielsen, 2005).

According to Sharp et al. (2007), an evaluation session using this method should involve about three to five evaluators, as a single evaluator is unlikely to identify all problems. Evidence provided by Nielsen (1994b) indicates that five evaluators are enough to identify about 75 percent of usability problems (Te'eni et al., 2007, Sharp et al., 2007). Some advantages of heuristic evaluation are listed below.

Strengths

Very flexible

One of the prime advantages of heuristic evaluation is that it can be applied at early stages in the design of a product. According to Nielsen and Molich (1990), heuristic usability evaluation can be used at the specification stage of a product lifecycle to help determine which design approaches to adopt. Evaluation may occur at any stage of a product’s lifecycle, e.g. mock-up, prototype or final product. Applying heuristic evaluation as the first phase of a two-phase usability effort can greatly complement the subsequent usability testing: it uncovers obvious errors, allowing the latter to dig out more intricate usability problems and hence yield effective results (Blandford et al., 2008).

Does not require expert evaluators

A study conducted by Nielsen and Molich (1990), which involved four experiments with some non-expert evaluators, found that some evaluators performed better than others. According to Nielsen, this could lead one to conclude that expert evaluators outperform non-experts. However, results from the study showed that non-expert evaluators were capable of finding hard problems, whereas experts in some cases overlooked easy problems (Nielsen and Molich, 1990). This supports the view that heuristic evaluation does not require formal usability training of evaluators.

More Cost Efficient

Unlike usability testing, heuristic evaluation requires neither a lab setting nor users, thereby saving the cost of acquiring those resources. According to Nielsen, the cost-benefit of heuristic evaluation can be assessed along two dimensions: first, the cost in terms of time spent conducting the evaluation and second, the benefit in terms of lower development cost for redesign (Nielsen, 1994a). Dumas and Redish (1999) also report that a cost-benefit analysis of four methods found heuristic evaluation to yield the highest payoff. Being so efficient at gathering evaluation results compared with other methods makes heuristic evaluation a better choice when time and resources are lacking (Kantner and Rosenbaum, 1997).

More Time Efficient

According to Kantner and Rosenbaum (1997), heuristic evaluation is of most value when there is no time to spare. Its time efficiency comes from the process itself. Unlike usability testing, where each user has to be monitored while testing the system and the collected data analyzed afterwards, heuristic evaluation only requires experts to review and analyze the system in one pass, without additional tasks. Given the number of users who must be observed one at a time in usability testing, compared with the single expert who assumes the users’ role, heuristic evaluation is by far more time efficient than usability testing and other evaluation methods (Mack and Nielsen, 1994).

Identifies More Problems

Experts who conduct heuristic evaluation follow a set of accepted heuristics to assess the usability of the software product. These evaluators have gained substantial knowledge and experience from other usability studies and are capable of assuming the role of users in identifying problems as well as recommending solutions. A study by Jeffries et al. (1991) comparing the effectiveness of several evaluation methods found heuristic evaluation to be the most effective at uncovering problems, having identified 105 problems compared with walkthrough and usability testing, which uncovered 35 and 31 problems respectively (Dumas and Redish, 1999).
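
The problem counts above can be set against the 20 and 200 hours quoted earlier for the same comparison to give a rough yield per hour. The sketch below does that arithmetic in Python. Pairing the hours with the counts is an assumption made here for illustration (both are attributed in this paper to Jeffries et al., 1991), and the walkthrough is left out because no time figure is given for it.

```python
# Figures as quoted in this paper from Jeffries et al. (1991) via Dumas and Redish (1999).
# Pairing the hours with the problem counts is an assumption made for illustration.
methods = {
    "heuristic evaluation": {"problems": 105, "hours": 20},
    "usability testing": {"problems": 31, "hours": 200},
}


def problems_per_hour(problems, hours):
    """Number of problems uncovered per hour of evaluation effort."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return problems / hours


rates = {name: problems_per_hour(**m) for name, m in methods.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} problems per hour")
```

Such a ratio says nothing about the severity or validity of the problems found, which is where the weaknesses of the method come in.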

Weaknesses

Lacks actual user feedback

Because heuristic evaluation critiques the product against a set of guidelines, its results lack actual user responses. This is due to the absence of real users, since evaluators instead assume the role of the real user. As a result, some of the problems identified can hardly be regarded as usability problems.

Likely to report false problems

Owing to the lack of actual users in the evaluation, problems identified may be falsely presented as usability issues. This is because what the expert considers a flaw may not relate to the tasks users actually carry out when using the system.

Lacks direct suggestions to solving problems

Usability experts may not be able to provide direct solutions to the usability problems identified, since the lack of user involvement in the evaluation omits the behavioral traits that help in fashioning a solution.

Suggested solutions may not cover usability issues

Even though evaluators may suggest solutions to some usability problems, these solutions may seldom resolve the underlying usability issues. This reflects the fact that evaluators cannot fully assume the role of users when proposing solutions to the problems.

Non-experts may not identify as many problems as expert evaluators

Even though heuristic evaluation does not require any formal usability training of evaluators, the use of non-experts can reduce the quality of the faults identified. Lack of experience and substantial knowledge on the part of non-experts is likely to lead to only a partial evaluation, since some problems may not be identified.

Part II: Heuristic Evaluation of the Usability of a Website

Case Study: ghanaairports.com.gh

Ghanaairports.com.gh is a knowledge-base website that offers its users a rich source of information about flight-related services at Ghana’s airports. It serves as an information base that users can interact with to access travel and airline information, flight schedules, and hotel, cargo and ground services, including dining areas and shopping centers. The service is targeted at tourists who are looking to discover Ghana in its many forms.

This study evaluates the usability of the ghanaairports.com.gh website using the heuristic evaluation method presented earlier in this paper. The evaluation follows a set of steps and guidelines that model the evaluation procedure and keep it on track. These steps are outlined below:

The heuristics for evaluating the website are first identified and adapted to fit the scenario.
The evaluator, drawing on his experience, assesses the website and identifies functions and features of the website that violate the set heuristics.
The functions and features that violate the heuristics are recorded and grouped under problem elements.
Each problem is given a brief description covering its severity, the particular heuristic it violates and a brief explanation of why it does not comply with the heuristic.
Finally, the problems identified are summarized into an evaluation outcome and the overall usability of the website is discussed with respect to the perceived severity of the problems reported.
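
The recording and grouping steps above can be sketched as a small data structure. This is only an illustration in Python: the `Finding` record, its field names and the sample entries are hypothetical placeholders and do not reproduce the findings of this evaluation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One recorded violation (field names are illustrative)."""
    element: str    # problem element the feature is grouped under
    heuristic: str  # the heuristic that is violated
    severity: str   # "High", "Medium" or "Low"
    reason: str     # why the feature does not comply


def group_by_element(findings):
    """Group findings under their problem element, preserving input order."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[finding.element].append(finding)
    return dict(grouped)


# Placeholder entries, not results of the evaluation.
findings = [
    Finding("navigation", "Consistency and standards", "Medium", "example reason A"),
    Finding("error page", "Help users recognize, diagnose, and recover from errors",
            "High", "example reason B"),
    Finding("navigation", "User control and freedom", "Low", "example reason C"),
]
grouped = group_by_element(findings)
for element, items in grouped.items():
    print(element, [f.heuristic for f in items])
```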

Findings

The evaluation of the Ghana airports website using Nielsen’s heuristics produced some notable results. Problems identified were classified according to severity: a severe problem is likely to make users leave the website and not want to return, whereas less severe problems are issues that users could put up with. The findings from the evaluation are elaborated in the tables below.

Selected Heuristics

The heuristics applied are adopted from Jakob Nielsen’s ten usability heuristics retrieved from useit.com, accessed November 2010. A summary of the applied heuristics is presented in the appendix.

Problems description

Problem severity, non conformed heuristic and reason for nonconformity

The severity of each problem is classified as High, Medium or Low. High severity means that the associated problem seriously impairs the usability of the website. Medium severity means that the problem causes annoyance in using the website, whereas low severity denotes minor issues such as coloring and layout. The heuristic that the problem pertains to is identified and rated in terms of severity. A reason why the problem is judged not to conform to the heuristic is also provided.
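
The three-level scale lends itself to a simple tally and ordering of problems. The following Python sketch assumes a hypothetical `Severity` enumeration and placeholder problems; it is not the procedure or the data used in this evaluation.

```python
from collections import Counter
from enum import IntEnum


class Severity(IntEnum):
    """Three-level scale from the text; the numeric values are only used for ordering."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def tally(severities):
    """Count problems per severity level, including levels with no problems."""
    counts = Counter(severities)
    return {level: counts.get(level, 0) for level in Severity}


def fix_order(problems):
    """Sort (description, severity) pairs so the most severe problems come first."""
    return sorted(problems, key=lambda p: p[1], reverse=True)


# Placeholder problems for illustration; not the findings of this evaluation.
problems = [
    ("low-contrast text", Severity.LOW),
    ("unhelpful error page", Severity.HIGH),
    ("inconsistent menu labels", Severity.MEDIUM),
    ("misaligned layout", Severity.LOW),
]
counts = tally(sev for _, sev in problems)
ordered = fix_order(problems)
print(counts)
print([name for name, _ in ordered])
```

Sorting by severity gives evaluators a natural order in which to address problems, with those most likely to drive users away handled first.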

Conclusion

The heuristic evaluation of the ghanaairports.com.gh website found that very few problems were of major concern, whereas an unacceptably large number were minor issues. It can therefore be concluded that the overall usability of the website is of average quality. The use of a standard layout, with the navigation menu on the left, the organization’s information at the top and content in the middle, makes the website easy to navigate and its information easy to access. The use of short-wavelength colors such as blue and green also makes the website appealing, though it masks text in some cases. However, this is no reason for complacency, as the number of minor issues detected shows that there is a lot of room for improvement. Issues ranked as high severity, though very few, have the tendency to make users leave the website and not return. It is advised that high-ranking issues be dealt with as soon as possible to prevent loss of traffic to the website.

References

Appendix

Applied Heuristics

The applied heuristics are adopted and summarized from Jakob Nielsen’s heuristics.

Visibility of system status:

Inform users through appropriate feedback as to what is happening and within reasonable time.

Match between system and real world:

Provide information in a natural and logical order, in simple language that users are familiar with and can understand.

User control and freedom:

Provide users with a clearly marked option to leave any unwanted state without further complications.

Consistency and standards:

Use words, situations, or actions that are not ambiguous, so that users are not left wondering or guessing. The website should follow standard conventions.

Error prevention:

Present users with a confirmation option before they commit to any action. Errors should be prevented by checking users’ actions to prevent a problem before it occurs. In an event of an error, users should be informed through good error messages.

Recognition rather than recall:

Make actions, options and objects visible to reduce users’ memory load. Instructions for using features should be visible or easy to retrieve when needed.

Flexibility and efficiency of use:

Be flexible enough to support activities of the experienced user and the novice user. It should be efficient at supporting the experienced user with accelerators and allow users to customize frequent actions.

Aesthetic and minimalist design:

Ensure every aspect of information provided in dialogues is relevant in that particular context.

Help users recognize, diagnose, and recover from errors:

Ensure error messages indicate the problem precisely, are displayed in plain language rather than codes and suggest a useful solution.

Help and documentation:

Ensure the website provides some form of help and documentation that is easy to find and precise in the steps it gives users.

Snapshots of website

Figure 1: Ghana Airports homepage, retrieved from (http://www.ghanaairports.com.gh/) on November 20

Figure 2: Snapshot of Ghana airlines page with problems identified

Figure 3: Error page on Ghana airways website

Figure 4: Snapshot of results from GTmetrix's analysis of Ghana airways page load time retrieved from on November 20.
