Usability testing



 
 








Usability testing is a technique used to evaluate a product by testing it on users. This can be seen as an irreplaceable usability practice, since it gives direct input on how real users use the system. This is in contrast with usability inspection methods, where experts use different methods to evaluate a user interface without involving users.

Usability testing focuses on measuring a human-made product's capacity to meet its intended purpose. Examples of products that commonly benefit from usability testing are web sites or web applications, computer interfaces, documents, or devices. Usability testing measures the usability, or ease of use, of a specific object or set of objects, whereas general human-computer interaction studies attempt to formulate universal principles.

History of usability testing

A Xerox Palo Alto Research Center (PARC) employee wrote that PARC used extensive usability testing in creating the Xerox Star, introduced in 1981. Only about 25,000 were sold, leading many to consider the Xerox Star a commercial failure.

The Google Book Search preview of the book Inside Intuit says (page 22, 1984), "... in the first instance of the Usability Testing that later became standard industry practice, LeFevre recruited people off the streets... and timed their Kwik-Chek (Quicken) usage with a stopwatch. After every test... programmers worked to improve the program." Scott Cook, Intuit co-founder, said, "... we did usability testing in 1984, five years before anyone else... there's a very big difference between doing it and having marketing people doing it as part of their... design... a very big difference between doing it and having it be the core of what engineers focus on."

Cook may not have known of the PARC work; his remark reads more as though he regarded earlier usability testing as part of marketing design, as opposed to engineering and re-engineering decisions based on direct user input. In any event, at the time of this writing a Google search turns up no usability testing projects between the PARC work and Quicken, but many after Quicken became a top commercial seller.

Goals of usability testing

Usability testing is a black-box testing technique. The aim is to observe people using the product to discover errors and areas for improvement. Usability testing generally involves measuring how well test subjects respond in four areas: performance, accuracy, recall, and emotional response. The results of the first test can be treated as a baseline or control measurement; all subsequent tests can then be compared to the baseline to indicate improvement.

  • Performance -- How much time, and how many steps, are required for people to complete basic tasks? (For example, find something to buy, create a new account, and order the item.)
  • Accuracy -- How many mistakes did people make? (And were they fatal or recoverable with the right information?)
  • Recall -- How much does the person remember afterwards or after periods of non-use?
  • Emotional response -- How does the person feel about the tasks completed? Is the person confident or stressed? Would the user recommend this system to a friend?
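
The comparison against a baseline can be sketched in a few lines of Python. This is only an illustration: the `Session` record, its field names, and the 1-to-5 satisfaction rating are assumptions made for the example, not part of any standard instrument.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One participant's results for one task (hypothetical record layout)."""
    seconds: float       # performance: time needed to complete the task
    steps: int           # performance: number of steps taken
    errors: int          # accuracy: mistakes made along the way
    recall: float        # recall: fraction of steps remembered later (0 to 1)
    satisfaction: int    # emotional response: assumed 1-5 self-rating


def summarize(sessions):
    """Average each measure over all sessions of one test round."""
    return {
        "seconds": mean(s.seconds for s in sessions),
        "steps": mean(s.steps for s in sessions),
        "errors": mean(s.errors for s in sessions),
        "recall": mean(s.recall for s in sessions),
        "satisfaction": mean(s.satisfaction for s in sessions),
    }


def change_from_baseline(baseline, current):
    """Difference per measure; negative seconds/steps/errors mean improvement."""
    return {name: current[name] - baseline[name] for name in baseline}
```

A first round of sessions would be passed to `summarize` once and kept as the baseline; every later round is summarized the same way and handed to `change_from_baseline`.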


What usability testing is not

Simply gathering opinions on an object or document is market research rather than usability testing. Usability testing usually involves systematic observation under controlled conditions to determine how well people can use the product.

Rather than showing users a rough draft and asking, "Do you understand this?", usability testing involves watching people trying to use something for its intended purpose. For example, when testing instructions for assembling a toy, the test subjects should be given the instructions and a box of parts. Instruction phrasing, illustration quality, and the toy's design all affect the assembly process.

Methods

Setting up a usability test involves carefully creating a scenario, or realistic situation, wherein the person performs a list of tasks using the product being tested while observers watch and take notes. Several other test instruments, such as scripted instructions, paper prototypes, and pre- and post-test questionnaires, are also used to gather feedback on the product being tested. For example, to test the attachment function of an e-mail program, a scenario would describe a situation where a person needs to send an e-mail attachment, and ask him or her to undertake this task. The aim is to observe how people function in a realistic manner, so that developers can see problem areas and what people like. Techniques popularly used to gather data during a usability test include think aloud protocol and eye tracking.

Hallway testing


Hallway testing (or hallway usability testing) is a specific methodology of software usability testing. Rather than using an in-house, trained group of testers, just five to six random people, indicative of a cross-section of end users, are brought in to test the software (be it an application, web site, etc.); the name of the technique refers to the fact that the testers should be random people who pass by in the hallway. The theory, as adopted from Jakob Nielsen's research, is that 95% of usability problems can be discovered using this technique.

Remote testing


Remote usability testing (also known as unmoderated or asynchronous usability testing) involves the use of a specially modified online survey. Because it can reach large sample sizes, it allows user testing studies to be quantified. This style of user testing also provides an opportunity to segment feedback by demographic, attitudinal and behavioural type. The tests are carried out in the user's own environment rather than in a lab, which helps simulate real-life use. This approach also makes it easy to solicit feedback from users in remote areas.
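
Segmenting feedback from a large remote sample amounts to grouping responses by some attribute and computing a measure per group. The sketch below assumes each response is a dictionary with a boolean `completed` field plus whatever segment attributes the survey collected; those field names are hypothetical.

```python
from collections import defaultdict


def completion_rate_by_segment(responses, key):
    """Return the task completion rate for each value of the segment `key`.

    `responses` is a list of dicts; each dict must contain `key` and a
    boolean "completed" entry (an assumed layout for this example).
    """
    outcomes = defaultdict(list)
    for response in responses:
        outcomes[response[key]].append(1 if response["completed"] else 0)
    return {segment: sum(vals) / len(vals) for segment, vals in outcomes.items()}


if __name__ == "__main__":
    sample = [
        {"age_group": "18-34", "completed": True},
        {"age_group": "18-34", "completed": False},
        {"age_group": "55+", "completed": True},
    ]
    print(completion_rate_by_segment(sample, "age_group"))
```

The same function works for attitudinal or behavioural segments by passing a different `key`.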

How many users to test?


In the early 1990s, Jakob Nielsen, at that time a researcher at Sun Microsystems, popularized the concept of using numerous small usability tests -- typically with only five test subjects each -- at various stages of the development process. His argument is that, once it is found that two or three people are totally confused by the home page, little is gained by watching more people suffer through the same flawed design. "Elaborate usability tests are a waste of resources. The best results come from testing no more than 5 users and running as many small tests as you can afford." Nielsen subsequently published his research and coined the term heuristic evaluation.

The claim of "Five users is enough" was later described by a mathematical model which states for the proportion of uncovered problems U

$$U = 1 - (1 - p)^n$$

where p is the probability of one subject identifying a specific problem and n the number of subjects (or test sessions). Plotted against n, this model gives a curve that asymptotically approaches the number of real existing problems (see figure below).

[Figure: Virzi's formula]
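
As a rough numerical illustration, the model can be evaluated directly, assuming it takes the commonly cited form U = 1 - (1 - p)^n. The function names below are made up for this sketch, and the value p = 0.31 is only an illustrative choice, not a figure from the text above.

```python
def proportion_found(p: float, n: int) -> float:
    """Proportion of problems uncovered by n subjects, U = 1 - (1 - p)^n."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def subjects_needed(p: float, target: float) -> int:
    """Smallest n for which the model predicts at least `target` coverage."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must lie strictly between 0 and 1")
    n = 0
    while proportion_found(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 3, 5, 10, 15):
        print(n, round(proportion_found(0.31, n), 3))
```

With p = 0.31 the model predicts that five subjects uncover about 84% of the problems, and that nine subjects are needed to pass 95%. The result is very sensitive to p: a lower detection probability flattens the curve considerably.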
In later research Nielsen's claim has been vigorously questioned with both empirical evidence and more advanced mathematical models. Two key challenges to this assertion are: (1) since usability is related to the specific set of users, such a small sample is unlikely to be representative of the total population, so the data are more likely to reflect the sample group than the population it may represent; and (2) not every usability problem is equally easy to detect. Hard-to-detect problems slow the overall process. Under these circumstances the number of problems discovered grows much more slowly than predicted by the Nielsen/Landauer formula.
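
The second challenge can be made concrete with a small sketch. It compares the single-p model, assumed here in the form 1 - (1 - p)^n, against a variant in which each problem has its own detection probability. The probabilities used are invented for illustration and the function names are hypothetical.

```python
def found_uniform(p: float, n: int) -> float:
    """Expected proportion found when every problem shares one probability p."""
    return 1.0 - (1.0 - p) ** n


def found_mixed(probabilities, n: int) -> float:
    """Expected proportion found when problem i has its own probability p_i.

    Each problem is missed by all n subjects with probability (1 - p_i)^n,
    so the expected proportion found is the average of 1 - (1 - p_i)^n.
    """
    missed = [(1.0 - p) ** n for p in probabilities]
    return 1.0 - sum(missed) / len(missed)


if __name__ == "__main__":
    # Invented mix: two easy problems and two hard ones.
    probabilities = [0.9, 0.5, 0.1, 0.05]
    average_p = sum(probabilities) / len(probabilities)
    for n in (5, 10, 20):
        print(n, round(found_uniform(average_p, n), 3),
              round(found_mixed(probabilities, n), 3))
```

For the same average detection probability, the mixed case never finds more than the uniform model predicts, because the easy problems are found almost immediately while the hard ones keep being missed.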

Most researchers and practitioners today agree that, although testing five users is better than not testing at all, a sample size larger than five is required to detect a satisfactory number of usability problems.

See also

  • ISO 9241
  • Software testing
  • Educational technology
  • Universal usability
  • Commercial eye tracking
  • Don't Make Me Think
  • System Usability Scale


External links