IETE Technical Review


 
ARTICLE
Year: 2009 | Volume: 26 | Issue: 6 | Page: 402-406

Technical Review: Current Issues of Usability Testing


School of Computing Sciences, University of East Anglia, Norwich, United Kingdom

Date of Web Publication: 21-Nov-2009

Correspondence Address:
Majed Alshamari
School of Computing Sciences, University of East Anglia, Norwich
United Kingdom

DOI: 10.4103/0256-4602.57825


   Abstract 

System usability can be measured through various methods. One of the more important and widely employed techniques is 'usability testing', in which tasks, number of users, evaluators and other factors are the main elements. This paper reviews usability testing together with current issues that can influence usability testing results, both negatively and positively. It also reviews web usability testing. In addition, it considers the future of usability testing so that improvements may be made.

Keywords: Evaluator, Number of users, Tasks, Web usability, Usability testing, Usability in the future.


How to cite this article:
Alshamari M, Mayhew P. Technical Review: Current Issues of Usability Testing. IETE Tech Rev 2009;26:402-6

How to cite this URL:
Alshamari M, Mayhew P. Technical Review: Current Issues of Usability Testing. IETE Tech Rev [serial online] 2009 [cited 2013 May 22];26:402-6. Available from: http://tr.ietejournals.org/text.asp?2009/26/6/402/57825


   1. Introduction


Usability is one of the most important success factors in system quality, in particular for websites. Usability testing requires a number of users to perform a set of pre-identified tasks, usually in a usability laboratory. During this testing, evaluators observe how the users interact with the system and identify its usability issues. This paper reviews usability testing and the related issues that influence its efficiency. It explores the impact of the evaluator's role, number of users, tasks, usability problem report, test environment, usability testing measurement tools and other factors. The review is based on the current literature on usability testing and its elements. It concludes by reviewing web usability testing and its future.


   2. Usability Overview


Usability is a multi-dimensional concept with a number of definitions. One of the primary definitions of usability was proposed by Miller, who argued that usability is 'the ease of use' [1]. The definition was then further developed by the International Organization for Standardization (ISO), which defines usability as 'the extent to which the product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use'. In addition, there are usability attributes, which are collated from a number of sources in [Table 1]. Most of these definitions emphasize efficiency, effectiveness and user satisfaction. The variations between them depend on the system characteristics and attributes.

2.1 Usability Evaluation Methods

Usability evaluations should establish whether the software being assessed is effective, efficient and satisfying to use, and several methods should be used to examine these qualities. Some usability evaluation methods need real users and others do not [3]. [Table 2] presents these methods.

2.2 Current Issues of Usability Testing

There are various factors affecting usability testing and its results, such as usability measures, evaluator's role, number of users, tasks, usability problem report, test environment and other factors. These factors are shown in [Figure 1].

2.2.1 Evaluator's Role

The term 'evaluator effect' refers to a limitation that should be reported alongside the usability issues identified while analysing a user test, and that evaluators should minimize as far as possible [6]. Hertzum and Jacobsen explain it as the apparent differences among evaluators in the number of usability problems found or in the assessment of those problems [7]. The evaluator's role is a critical issue in usability testing, and several studies have shown that problem detection varies noticeably between evaluators. Hertzum and Jacobsen described the evaluator's role as a potential threat when they conducted a study that involved four evaluators whose analyses were individually videotaped. Surprisingly, only 20% of the 93 detected problems were detected by all evaluators [7]. They argued that the main reason for these differences was the evaluators' interpretations. It has also been claimed that evaluators appear to seek out and confirm problems they have already discovered [8]. Evaluators were also criticized for not analysing the data methodically immediately after the test, while the test results remain fresh [8]. Norgaard and Hornbaek suggested three strategies for reducing the evaluator effect. The first is to conduct a detailed data analysis. The second is to discuss with other evaluators the specific problems about which an evaluator is unsure. The third is to have the data analysed by different evaluators [8],[9].
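The degree of overlap between evaluators can be quantified in a few lines. The sketch below is illustrative only: the evaluator names and problem identifiers are invented, and the function names are assumptions. It computes the share of problems reported by every evaluator (the kind of figure quoted above) and the any-two agreement measure associated with Hertzum and Jacobsen's work [7], that is, the average over all pairs of evaluators of the problems both found divided by the problems either found.

```python
from itertools import combinations


def detected_by_all(problem_sets):
    """Share of all distinct problems that every evaluator reported."""
    sets = list(problem_sets.values())
    if not sets:
        return 0.0
    union = set().union(*sets)
    common = set.intersection(*sets)
    return len(common) / len(union) if union else 0.0


def any_two_agreement(problem_sets):
    """Mean overlap (shared / combined problems) across all evaluator pairs."""
    pairs = list(combinations(problem_sets.values(), 2))
    if not pairs:
        raise ValueError("need at least two evaluators")
    # Pairs in which neither evaluator found anything are skipped.
    scores = [len(a & b) / len(a | b) for a, b in pairs if a | b]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical problem reports from three evaluators (invented data).
reports = {
    "evaluator_a": {"P1", "P2", "P3"},
    "evaluator_b": {"P1", "P3", "P4"},
    "evaluator_c": {"P1", "P5"},
}

print(f"detected by all: {detected_by_all(reports):.0%}")
print(f"any-two agreement: {any_two_agreement(reports):.2f}")
```

Reporting both figures may be useful because the share detected by all evaluators tends to shrink as more evaluators are added, whereas the pairwise measure does not depend on the size of the evaluator group in the same way.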

2.2.2 Users

The number of users has been discussed in a number of studies. Nielsen [10] has suggested that five users are enough to discover 85% of usability problems. Turner and others confirm that the first five users can detect most of the usability problems and that each additional user is unlikely to uncover new ones [11]. On the other hand, Lindgaard and Chattratichart [3] reported one study in which only 35% of all usability problems had been detected with five users. They also reported another study that had discovered only 55% of the usability problems [3],[12], again with five users. A comparative study was carried out by Rolf Molich, in which nine teams were formed to evaluate the Hotmail website. The top team found only 75% of all of the usability problems that had been found by all the teams together [13].

The five-user recommendation was criticized in [14] and [11] because it does not take individual differences into consideration. In this regard, usability tests should classify users in terms of their level of systems experience. Previous studies, such as Nielsen and others [15], have shown that a single user will not come across all problems in a user interface. Furthermore, if the website has different types of users, it is vital to consider user numbers and their characteristics seriously.
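The 'five users' figure follows from a simple cumulative-discovery model in which every user independently reveals the same proportion of the problems. The sketch below assumes the average per-user detection rate of 0.31 reported by Nielsen [10]; the function names are illustrative, and the lower rate of 0.10 is an invented value included only to show how sensitive the estimate is to that assumption.

```python
import math


def proportion_found(n_users, detection_rate=0.31):
    """Expected share of problems found by n_users: 1 - (1 - rate)^n."""
    return 1.0 - (1.0 - detection_rate) ** n_users


def users_needed(target, detection_rate=0.31):
    """Smallest number of users whose expected coverage reaches target."""
    if not 0.0 < target < 1.0 or not 0.0 < detection_rate < 1.0:
        raise ValueError("target and detection_rate must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - detection_rate))


for rate in (0.31, 0.10):  # 0.10 is a hypothetical, harder-to-test system
    print(f"rate={rate:.2f}: five users find about {proportion_found(5, rate):.0%}")
```

Under this model, a lower per-user detection rate, for example because users differ widely in experience, yields coverage far below 85% with five users, which is consistent with the lower figures reported above.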

2.2.3 Tasks

Usability testing tasks simply describe what users do to achieve a goal, but they are an important issue and can heavily influence a usability evaluation. Wilson describes selecting tasks as a critical activity in usability testing [16]. In one instance, choosing the wrong set of tasks led to hundreds of complaints because the tasks had been chosen to test only the appearance of the website [16]. Wilson also highlighted a different failure in task selection, in which a usability group designed tasks around real user observations. No critical usability problems were detected during the usability tests. The software was then installed, and shortly afterwards customers found that it was not usable because of performance problems. The group realized that the usability testing tasks had been based on 50,000 rows of data, whereas many customers had been trying to use 10 million rows of data in their own databases [16].

In this regard, Lindgaard and Chattratichart [3] suggested moving the focus from the users to the tasks. They found a significant correlation between the number of user tasks performed by each usability team and the number of new problems found. They suggested that task design, number of tasks and task coverage deserve more research because of their role in usability tests. Hertzum and Jacobsen [7] claimed that there is no guidance for selecting tasks, and that this can influence the evaluator's role in terms of problem detection, and therefore the usability problem results. A further study showed that different types of task can seriously influence usability testing [5]. It found that different types of task design reveal different types of problems; for example, problem-solving tasks uncovered major usability problems, whereas structured tasks seemed to reveal minor and superficial ones [5].

There are a number of criteria for selecting and designing tasks, such as task frequency and criticality. The former refers to tasks that are performed regularly by users, whereas the latter refers to the impact of tasks on the success of system activity [16]. Other factors also affect task selection, such as task generality, first impressions, tasks involving new features, edge-case tasks and tasks that the client or product team is worried about. Task generality refers to tasks that are common; they help to generalize the usability findings after the usability test is finished. First impressions refers to tasks that can measure user feelings at the first moment of use. Tasks involving new features help to measure the impact of any new features in a system. Edge-case tasks can indicate usability issues with large databases and other aspects of system usability [16].

2.2.4 Usability Problem Report

Conducting a usability test should generate a usability problem report that can effectively help designers and developers to make their decision with regard to the redesign stage. Hornbaek and Frokjaer [17] mentioned that producing a list of usability problems may not effectively help in practical systems development. They also asserted that problem descriptions should be brief and should also describe how to deal with and treat certain problems.

2.2.5 Test Environment

Usability testing normally takes place in a controlled laboratory. Wolf and others [18] criticized conducting HCI experiments in a laboratory for two main reasons. The first is that little design guidance is offered because of the cost implications of the laboratory, a reason that Tullis and others confirmed [19]. The second is that conducting tests in a laboratory casts doubt on how far the experimental results can be generalized. They claimed that the laboratory experiment is only one of several methods for collecting empirical data related to usability. In fact, users surf and perform tasks under a variety of everyday circumstances such as workplace conditions, children's noise and other natural factors. There is a significant lack of research on the influence of the test environment upon usability testing, although Wichansky [20] did suggest that usability testing should be conducted in more natural places.

2.2.6 Usability Measures and Prioritizing Problems

2.2.6.1 Usability measure

Prior to conducting a usability test, testers should be aware of what to test and measure. The ISO definition identifies three major criteria for measuring usability: efficiency, effectiveness and user satisfaction. This model has been criticized for being too abstract [2]. McCall's model [2], meanwhile, broke usability down into three criteria: operability, training and effectiveness. Nevertheless, there remains the difficulty of applying usability standards to a system and deciding exactly how to measure the usability of a particular application.

Hornbaek [21] reviewed current practice in how usability is measured and how problems are related to usability measurements. His review included 180 studies published in core HCI journals and proceedings. He found that some studies are weakened by the difficulty of choosing how to measure a system's usability, which elements should be measured and which ways of measuring are the most appropriate. He described the usability measures individually: effectiveness, efficiency and satisfaction. Effectiveness can be measured through binary task completion, accuracy, completeness, quality of outcome and other factors. Efficiency can be measured through input rates, task completion time, mental effort, learning time, use frequency and other factors. He also summarized satisfaction measures, including standard questionnaires, preferences, satisfaction with the interface and others. However, the review was based on a broad conception of usability, and it neither described nor suggested what to measure for specific systems such as web-based systems, although it did conclude that there is a need for more valid and complete usability measures. Sauro and Kindlund [22] proposed a method for standardizing usability metrics into a single usability metric. Their method proposes a quantitative model for usability [Figure 2]. They show that the usability aspects are correlated and equally weighted.
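The general idea of combining several measures into one score can be sketched as follows. This is not the published procedure of Sauro and Kindlund [22]; it is a minimal illustration, under the assumption that each measure is standardized to a z-score across participants and the standardized values are then averaged with equal weights. The participant data and all function names are invented.

```python
from statistics import mean, pstdev


def z_scores(values, higher_is_better=True):
    """Standardize values; flip the sign when lower raw values are better."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    sign = 1.0 if higher_is_better else -1.0
    return [sign * (v - mu) / sigma for v in values]


def combined_scores(completion, time_s, satisfaction):
    """Equal-weight average of standardized measures, one score per participant."""
    columns = [
        z_scores(completion),                      # effectiveness
        z_scores(time_s, higher_is_better=False),  # efficiency: less time is better
        z_scores(satisfaction),                    # satisfaction rating
    ]
    return [mean(row) for row in zip(*columns)]


# Hypothetical results for three participants (invented data).
scores = combined_scores(
    completion=[1.0, 0.5, 0.0],
    time_s=[60, 90, 120],
    satisfaction=[5, 3, 1],
)
print([round(s, 2) for s in scores])
```

Equal weighting is the simplest possible choice; a team with evidence that one aspect matters more for its product could replace the plain mean with a weighted one.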

Nielsen [23] recommended using a simplified usability measure, success rate, which he defined as the percentage of tasks that users complete successfully. He divides task completion into three groups: success, partial success and failure [23]. Choosing which usability measures to use is a difficult task, especially as recent studies have offered more than 54 usability measures [24]. This affirms the importance of studying the relationships between usability measures, as well as what to measure for Internet websites and how. Recent research, however, seems to combine usability measures [21].
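A success rate with three outcome groups can be computed as below. The half-credit weight given to partial success is an assumption of this sketch, as are the outcome labels, the function name and the sample data; a team may prefer a different weight or may report the three groups separately.

```python
from collections import Counter

# Assumed credit per outcome; the 0.5 for partial success is a hypothetical choice.
WEIGHTS = {"success": 1.0, "partial": 0.5, "fail": 0.0}


def success_rate(outcomes, weights=WEIGHTS):
    """Weighted share of task attempts that succeeded."""
    if not outcomes:
        raise ValueError("no task attempts recorded")
    counts = Counter(outcomes)
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    credit = sum(weights[label] * n for label, n in counts.items())
    return credit / len(outcomes)


# Hypothetical test session: 10 task attempts (invented data).
attempts = ["success"] * 6 + ["partial"] * 2 + ["fail"] * 2
print(f"success rate: {success_rate(attempts):.0%}")
```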

2.2.6.2 Prioritizing problems

Prior to judging a problem, a definition of a usability problem should be established. Each issue that prevents or thwarts users from completing a task can be defined as a usability issue, for example a hidden log-in link, visual noise, a dead end or an invisible button [25]. A severity assessment then takes place after all the data needed for analysis have been collated. Three factors play vital roles in prioritizing usability problems and in judging their severity: impact, which refers to the amount of trouble a problem causes the users; persistence, which indicates how many times a problem is encountered by the users; and frequency, which means the number of users who face a problem [26].
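The three factors can be recorded per problem and used to order a problem list. The sketch below is only one possible arrangement: the 1 to 4 impact scale, the rule that multiplies impact by the share of affected users and by the average number of encounters per affected user, and the sample problems are all assumptions made for illustration, not a scheme taken from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    name: str
    impact: int       # assumed scale: 1 (cosmetic) to 4 (blocks the task)
    persistence: int  # total number of times the problem was encountered
    frequency: int    # number of users who faced the problem


def severity(problem, n_users):
    """Hypothetical score: impact x share of users x encounters per affected user."""
    if n_users <= 0:
        raise ValueError("n_users must be positive")
    if problem.frequency == 0:
        return 0.0
    share = problem.frequency / n_users
    repeats = problem.persistence / problem.frequency
    return problem.impact * share * repeats


def prioritize(problems, n_users):
    """Return problems ordered from most to least severe."""
    return sorted(problems, key=lambda p: severity(p, n_users), reverse=True)


# Hypothetical findings from a five-user test (invented data).
found = [
    Problem("dead end", impact=3, persistence=3, frequency=3),
    Problem("hidden log-in link", impact=4, persistence=10, frequency=5),
    Problem("visual noise", impact=1, persistence=12, frequency=4),
]
for p in prioritize(found, n_users=5):
    print(f"{p.name}: {severity(p, 5):.1f}")
```

Keeping the three factors as separate fields, rather than storing only the combined score, lets a team change the combination rule later without re-coding the observations.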

2.2.7 Other Usability Testing Issues

Hornbaek and Stage [24] discussed four challenges that can improve the interaction between usability evaluation and the design stage. The first challenge is related to whether the software type is a prototype or a running system. They stated that prototype systems may misrepresent real systems' functionality. The second challenge is that insufficient effort is usually allocated to describing or choosing tasks. The third challenge is that of usability problem reports. The fourth challenge is that usability problem reports neither recommend nor suggest problem priority or severity.


   3. Web Usability Testing


Lucca and Fasolino [27] reported that testing web usability appears to be more difficult than testing traditional systems. There are two main reasons: first, web users are located all over the world yet access the web concurrently, and second, different types of hardware and software are used to access the web [27]. The usability of web-based systems has a great impact on these users on a daily basis. Users are unlikely to revisit a website if they encounter difficulties in using it, in particular where alternative websites are available [28]. According to [29], fifty percent of potential sales were lost because of poor usability of web-based systems, and difficulty of use was the reason given for the failure of 40% of shop transactions. Therefore, the usability of web-based systems is critical in determining the success of those systems. Many organizations have now realized the importance of website usability after having ignored it because they did not have objective criteria for website usability [28]. Levi and Conrad [30] pointed out a fundamental challenge, which is how to recognize a website's limitations prior to releasing it; doing so could reduce maintenance costs.

Dicks raised four different aspects as the main limitations of usability testing; the first is that testing is always an artificial situation, which lacks realistic circumstances; the second is the inability of the test results to verify that a system works; the third is that participants do not fully represent the targeted website audiences; and the fourth is that testing is not always the best technique to apply [31] .


   4. Usability Testing in the Future


Wichansky [20] concluded that 'quick and clean' usability testing methods are needed, and that such methods should offer more valid and reliable data. Wichansky then suggested that usability testing should be conducted in more naturalistic environments such as simulated homes or classrooms. The same author also suggested that both usability problem reports and testing methods specifically tailored for industry should receive more research because of their importance, as should the testing of mobile phones and handheld devices. In this regard, there is a growing demand for conducting usability tests in a short time, with few resources and on a low budget [32]. Therefore, inspection methods are usually excluded, as they cost a great deal because of the need to hire experts. Discount Usability Engineering is often more desirable because of its lower cost and shorter duration, and also because it is based on three techniques: scenarios, simplified thinking-aloud and heuristic evaluation.

Usability tests suffer from a number of drawbacks, as mentioned above. Work on improving usability test conditions, such as tasks, number of users, test environment and evaluator role, is important and can contribute effectively to the area of usability evaluation methods.


   5. Conclusions


This paper shows how usability factors can influence usability testing results. Although the evaluator's role, tasks, number of users and usability measures have been researched, more research is still needed in order to improve usability testing results. More research is also needed to investigate the role of tasks in dynamic websites; the literature offers a number of interesting results. The number of users is still a controversial area, in which the literature supports both sides of the argument on the magic number of "five users". The usability testing environment seems not to have been examined enough and should receive more research.


   Authors



Majed Alshamari was born in Saudi Arabia. He received the B.Sc. and M.Sc. degrees in Information Systems from King Faisal University in Saudi Arabia and the University of East Anglia in the UK, respectively. He worked as an instructor at King Faisal University, Saudi Arabia. He has authored a number of conference papers and posters. His recent publications concentrate on the usability testing process and its issues, with the aim of improving its efficiency. He is also a member of several professional organizations such as ACM, IEEE, BCS, UPA and SCS. He is now studying for a PhD in the field of usability testing.


Pam Mayhew was born in the UK. She received her PhD in Information Systems from the University of East Anglia, UK. Her current work is focused on three areas: the usability of web interfaces, the successful introduction of IT, and global software outsourcing.

Building on work already carried out and published in the areas of e-readiness, e-government and website accessibility, her current work in this area is focusing on the usability testing process itself. The contention is that these tests typically overstate the actual quality of web-based interfaces.

One of her long standing interests has been in assuring the successful introduction of IT into businesses. This builds on work carried out previously in the areas of stakeholder evaluation, critical success factors, senior management influences and quality management practices within IT organizations. The intention is to provide guidance as to the best methods with which to achieve IT's desired benefits. The last of her current areas of research is global software outsourcing. Whilst a very topical and emotive subject it lacks detailed study. The main focus has been on the UK and India, but she is attempting to broaden it to include China with some success.



 
   References

1. X. Faulkner. Usability Engineering, Macmillan Press Ltd, 2000.
2. A. Seffah, M. Donyaee, R. Kline, and H. Padda. "Usability measurement and metrics: A consolidated model". Software Quality Vol. 14, pp. 159-178, 2006.
3. G. Lindgaard, and J. Chattratichart. "Usability Testing: What Have We Overlooked?", CHI 2007 Proceedings. San Jose, CA, USA: ACM Press, pp. 1415-24, 2007.
4. P. Zaphiris, and S. Kurniawan. Human Computer Interaction Research in Web Design and Evaluation, Idea Group Publishing, 2006.
5. M. Alshamari, and P. Mayhew. "Task Design: Its Impact on Usability Testing", The Third International Conference on Internet and Web Applications and Services, IEEE, Athens, Greece, pp. 583-9, 2008.
6. A. Vermeeren, I. Kesteren, and M. Bekker. "Managing the 'Evaluator Effect' in User Testing", INTERACT, IOS Press, pp. 647-654, 2003.
7. M. Hertzum, and N.E. Jacobsen. "The Evaluator Effect: A Chilling Fact About Usability Evaluation Methods". International Journal of Human-Computer Interaction Vol. 15, pp. 183-204, 2003.
8. M. Nørgaard, and K. Hornbæk. "What do usability evaluators do in practice?: an explorative study of think-aloud testing", Symposium on Designing Interactive Systems, pp. 209-218, 2006.
9. W. Barendregt, and M. Bekker. "Managing the evaluator effect in the analysis of video data of children's computer games", Proceedings of the BCS-HCI, People and Computers XVIII - Design for Life, Leeds, UK, 2004.
10. J. Nielsen. Why You Only Need to Test with 5 Users, Available from: http://www.useit.com, 2000.
11. C. Turner, J. Nielsen, and J. Lewis. "Determining Usability Test Sample Size". International Encyclopedia of Ergonomics and Human Factors Vol. 3, pp. 3084-8, 2006.
12. L. Faulkner. "Beyond the five-user assumption: Benefits of increased sample sizes in usability testing", Behavior Research Methods, Instruments, and Computers, Psychonomic Society Publications, pp. 379-83, 2003.
13. R. Molich, M. Ede, K. Kaasgaard, and B. Karyukin. "Comparative Usability Evaluation". Behaviour and Information Technology Vol. 23, pp. 65-74, 2004.
14. A. Woolrych, and G. Cockton. "Why and when five test users aren't enough", IHM-HCI 2001 Conference, Toulouse, France, pp. 105-8, 2001.
15. J. Nielsen, M. Hertzum, and B. John. "The evaluator effect in usability studies: Problem detection and severity judgements", HFES, pp. 1336-40, 1998.
16. C. Wilson. "Taking usability practitioners to task". Interactions Vol. 14, pp. 48-49, 2007.
17. K. Hornbaek, and E. Frokjaer. "Comparing Usability Problems and Redesign Proposals as Input to Practical Systems Development". CHI 2005, 2005.
18. C. Wolf, J. Carroll, T. Landauer, B. John, and J. Whiteside. "The role of laboratory experiments in HCI: help, hindrance, or ho-hum?", ACM SIGCHI Bulletin, ACM New York, NY, USA, pp. 265-8, 1989.
19. T. Tullis, S. Fleischman, M. McNulty, C. Cianchette, and M. Bergel. "An Empirical Comparison of Lab and Remote Usability Testing of Web Sites", Usability Professionals Association Conference, 2002.
20. A. Wichansky. "Usability testing in 2000 and beyond". Ergonomics Vol. 43, pp. 998-1006, 2000.
21. K. Hornbæk. "Current practice in measuring usability: Challenges to usability studies and research". Int. J. Hum.-Comput. Stud. Vol. 64, pp. 79-102, 2006.
22. J. Sauro, and E. Kindlund. "A Method to Standardize Usability Metrics Into a Single Score", Conference on Human Factors in Computing Systems, Portland, Oregon, USA, pp. 401-409, 2005.
23. J. Nielsen. Success Rate: The Simplest Usability Metric, Available from: http://www.useit.com, 2001.
24. K. Hornbaek, and J. Stage. "The Interplay Between Usability Evaluation and User Interaction Design". International Journal of Human-Computer Interaction Vol. 21, pp. 117-123, 2006.
25. J. Artim. Usability Problem Severity Ratings, Available from: http://primaryview.org, 2003.
26. M. Hertzum. "Problem Prioritization in Usability Evaluation: From severity assessments toward impact on design". International Journal of Human-Computer Interaction Vol. 21, pp. 125-146, 2006.
27. G. Di Lucca, and A. Fasolino. "Testing Web-based applications: The state of the art and future trends". Information and Software Technology Vol. 48, pp. 1172-86, 2006.
28. C. Osterbauer, M. Köhle, T. Grechenig, and M. Tscheligi. "Web Usability Testing: A case study of usability testing of chosen sites (banks, daily newspapers, insurances)", the Sixth Australian World Wide Web Conference, 2000.
29. R. Sherry and Y. Chen. "The assessment of usability of electronic shopping: A heuristic evaluation". International Journal of Information Management Vol. 25, pp. 516-32, 2005.
30. M. Levi, and F. Conrad. Usability testing of World Wide Web sites: a CHI 97 workshop. SIGCHI Bull Vol. 29, pp. 40-3, 1997.
31. R. Stanley Dicks. "Mis-usability: on the uses and misuses of usability testing", ACM Special Interest Group for Design of Communication, ACM Press, Toronto, Ontario, Canada, pp. 26-30, 2002.
32. A. Anandhan, S. Dhandapani, H. Reza, and K. Namasivayam. "Web usability testing - CARE methodology", the Third conference on Information Technology: New Generations (ITNG'06), IEEE, pp. 450-500.


    Figures

  [Figure 1], [Figure 2]
 
 
    Tables

  [Table 1], [Table 2]





 

 