Wednesday, March 16, 2016

First contribution to usability testing

This is a guest post from Ciarrai, who is applying for the usability testing internship.

In order to apply for an Outreachy internship, we ask that you make an initial contribution. For usability testing, I suggest running a small usability test with a few testers, then writing a summary about what you learned and what you would do better next time. This small test is a great introduction to usability testing. It exercises the skills you'll grow during the usability testing internship, and demonstrates your interest in the project.

Ciarrai's summary was an excellent example of usability testing, and I am posting it here with permission:

What is usability testing and why is it important?

When writing a piece of code, we often obsess about making our logic elegant and concise, coming up with clever ways to execute tasks and demonstrate ideas. What we sometimes lose sight of is the fact that we aren't just trying to craft well-built software; we also need it to be useful to the (hopefully) many others who will be using it. “Useful” encompasses a complex scope, but usability is a large part of what makes something useful. To further simplify an intricate topic, when we're talking about usability we're basically asking the question, “Can people easily use this thing?”

What if we could just give users the software in question, ask them to perform some of the general tasks associated with the program, and observe the results? Would this not likely be the most effective way to judge a program's usability? We're not asking people to describe an experience; we're actually watching and listening to them as they do it. This is what usability testing is all about, and it is crucial to the creative process of building anything user-based. If people can't use our software, then all the hard work of creating it was in vain.

Usability testing gives us insight into the holes in our development game from the user perspective, but also lets us see what works well about our software. Both types of data are indispensable as we continue to design and modify new programs. We can start to see patterns and refine our approach in a way that hopefully makes the whole process of creating more effective.


For the usability test, I set aside a guest user on my machine, a laptop running the Fedora 23 operating system with the GNOME desktop. There are no modifications to the install that should affect the results of the test. The participants took the test separately from one another using otherwise identical settings.

The scenario tasks I used were taken from previous usability tests. I used the six scenario tasks on Gedit from Jim Hall's blog. I also borrowed the four Nautilus scenario tasks from Gina Dobrescu's blog from a previous usability test internship.

I was given permission to use previous scenario tasks. One reason for using previously tested scenario tasks was that I don't yet have the skills for coming up with the most appropriate scenario tasks for a usability test. Another thought I had, though, is that by reusing these scenarios, I am widening the pool of test subjects for these scenario tasks, which could be useful testing for these GNOME applications. Using these approved scenarios also meant that I could go ahead with conducting a usability test on a shorter timeline.

I introduced the test by letting the volunteers know that they would be testing out two GNOME applications: a file manager and a text editor. I let them know that they were not being judged, but rather I was looking to see how well the software works for them. I told them there was no time limit, and that they should try to complete the tasks in a manner comfortable to them. I also stressed that if they found a task too difficult, it was fine to stop, and that they should only proceed as far as they normally would when trying out a new application. I asked that the participants talk aloud during the test and vocalize all their thoughts about the test to the best of their abilities. I said that I would be there to listen, observe and help with any general issues, but that I was not going to provide assistance in completing any of the tasks.

I sat next to each tester and watched the screen over their shoulders while listening to them describe the process. I reminded testers that they could take their time and that they could abandon a task if it proved too difficult, but I was otherwise silent as they worked.

Directly following each test, I recorded the test results using a heat map and made a few notes to myself for reference when doing this write-up. I didn't write anything down during the test itself because I thought that would make the participants nervous.
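A heat map like this can also be kept as plain text. The following is only a minimal sketch, assuming a grid with one row per scenario task and one column per tester; the four-level difficulty scale, the symbols, the function name, and the example ratings are all hypothetical and are not the results of this test.

```python
# Hypothetical difficulty scale and symbols; the post does not define either.
SYMBOLS = {"easy": ".", "some difficulty": "o", "hard": "O", "abandoned": "X"}


def render_heat_map(results, testers):
    """Return a text grid with one row per task and one column per tester."""
    width = max(len(task) for task in results)
    # Header row: blank space over the task names, then the tester labels.
    lines = [" " * width + "  " + " ".join(testers)]
    for task, ratings in results.items():
        # Pad each symbol to the width of its tester label so columns line up.
        cells = " ".join(SYMBOLS[ratings[t]].ljust(len(t)) for t in testers)
        lines.append(task.ljust(width) + "  " + cells)
    return "\n".join(lines)


# Placeholder ratings for illustration only; NOT the data from this test.
example = {
    "Task A": {"T1": "easy", "T2": "hard"},
    "Task B": {"T1": "abandoned", "T2": "easy"},
}

print(render_heat_map(example, ["T1", "T2"]))
```

Reading across a row shows how every tester fared on one task, which is how shared trouble spots stand out.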


Overall, the testing proved to be quite interesting. Of the three testers, one works on Linux systems for a living and had experience with GNOME before. The other two use computers daily but were not particularly skilled users and were not familiar with the GNOME environment. The level of confidence with which they completed the tasks was strikingly different between the more and less experienced users. What was very interesting, though, was seeing them all struggle on the same tasks. I think I got some real insight into a few things that could enhance the usability of Gedit.

So, what commonly seems intuitive about the software?

Well, no one seemed to have a problem writing, saving and copying a file in Gedit. I think it functions similarly enough to other software on the level of editing a note. One participant mentioned expecting the Save button to be on the upper left instead of right, but had no real trouble finding it.

As for Nautilus, using the search function seemed to be intuitive to all participants. The magnifying glass icon is well understood to be associated with search functionality. Participants seemed to understand what they were looking at and were able to navigate the home file system with ease. I think the takeaway for what went well regarding both applications was that they looked familiar enough and operated in a way that all participants were accustomed to on the basic level.

The difficulties in using both Gedit and Nautilus were similar, though to varying degrees across all participants. With Gedit, it seemed that everyone wanted the user preferences for fonts and colors to be located in the drop-down menu. I watched one participant navigate through each option in the menu drawer at least three times, believing that font changes had to be located there. Two participants never checked the Gedit → Preferences menu for fonts. The third used the Internet to locate the fonts tab for the application. Gedit seems to separate editor text tasks (find and replace, for example) from editor preferences (color themes), but this seemed a non-intuitive divide for users.

With Nautilus, the difficulty seemed to be similar to that of Gedit. Testers had trouble changing preferences for how Nautilus works. I think that navigating to the Nautilus → Preferences menu did not come naturally to most participants. I think that having the tab located above the application is a more unusual setup, and it confuses testers who are used to other desktop environments. It also seemed as though participants wanted the layout preferences to take effect automatically when changed. It took users much longer to change Nautilus to list format because of this. One tester abandoned the task even though he figured out where the preferences were located, because he never tried closing the application.

The feedback from testers was generally that the tasks were either very easy or quite difficult, but that they all felt that they could use these applications to do their daily tasks.

What worked well in the test?

I think the test was successful in getting a small amount of data about the usability of Gedit and Nautilus. Having participants that represented different user groupings was helpful in getting a sense of how a variety of target users would react to GNOME. It was also helpful for me as the moderator to see where difficulty may have arisen through lack of familiarity with the task as opposed to the application.

All test subjects seemed to be mostly comfortable with the test, and were put more at ease by being told that they were not being scrutinized for how well they accomplished the tasks. The testers gave their earnest efforts to complete the tasks. None of the testers took over 45 minutes to complete the test to the best of their abilities.

What were the challenges?

Participants reacted differently to being asked to talk their way through completing the tasks. One tester spoke aloud every action he was doing (e.g., “Okay, now I am pressing Alt + Tab to navigate back to the file system”). Another seemed shy to say anything about what she was doing, especially as it became more difficult. This made each test experience unique in a way that made it harder to judge the tester's experience.

One participant mentioned feeling stupid after not being able to complete a task. I think it's valuable to hear this kind of feedback, though it was discouraging to know that people take it personally when they cannot execute a task they deem simple.

It felt strange to be so closely watching the testers as they worked. I found it difficult to restrain myself from helping the testers when they were having a rough time with a task.

I didn't know how to respond to the question of searching the Internet for answers when someone got stuck on a task. I could see how both allowing and disallowing the Internet were valuable decisions. In the end I decided for this test that participants should go through whatever normal channels they take when trying to complete a task with an application. This meant that one tester used an Internet search to complete a task while the others never did.


What comes to mind when thinking about conducting a test or experiment is accuracy. I think striving for more consistency is the best way to improve on future usability testing. One part of achieving accuracy would be removing known variables. The other would be adding functionality that yields more of the type of results that we want. With that in mind, my recommendations for future tests would be:
  • Give testers examples for how to talk through the execution of their tasks so that similar information can be gathered from all testers.
  • Record the tests so that I am not relying on memory when making comparisons and so that others could hear/view the test and interpret the results.
  • Get as much of a variety of users as possible to perform the test. This would include computer science professionals, different operating system users, those who use computers heavily but not for computer science, people of a variety of ages, etc.
  • Be clear on boundaries such as Internet usage during the test.

These are the scenario tasks used in the usability test:

Gedit (GNOME Text Editor)

1. You need to type up a quick note for yourself, briefly describing an event that you want to remember later. You start the Gedit text editor (this has been done for you).

Please type the following short paragraphs into the text editor:
Note for myself:

Jenny and I decided to throw a surprise party for Jeff, for Jeff's birthday. We'll get together at 5:30 on Friday after work, at Jeff's place. Jenny arranged for Jeff's roommate to keep him away until 7:00.

We need to get the decorations up and music ready to go by 6:30, when people will start to arrive. We asked everyone to be there no later than 6:45.
Save this note as party reminder.txt in the Documents folder.

2. After typing the note, you realize that you got a few details wrong. Please make these edits:
  • In the first paragraph, change Friday to Thursday.
  • Change 5:30 to 5:00.
  • Move the entire sentence Jenny arranged for Jeff's roommate to keep him away until 7:00. to the end of the second paragraph, after no later than 6:45.
When you are done, please save the file. You can use the same filename.

3. Actually, Jeff prefers to go by Geoff, and Jenny prefers Ginny. Please replace every occurrence of Jeff with Geoff, and all instances of Jenny with Ginny.

When you are done, please save the file. You can use the same filename.

4. You'd like to make a copy of the note, using a different name that you can find more easily later. Please save a copy of this note as Geoff surprise party.txt in the Documents folder.

For the purposes of this exercise, you do not need to delete the original file.

5. You decide the font in the editor is difficult to read, and you would prefer to use a different font. Please change the font to DejaVu Sans Mono, 12 point.

6. You decide the black-on-white text is too bright for your eyes, and you would prefer to use different colors. Please change the colors to the Oblivion color scheme.

Nautilus (GNOME File Manager)

1. Yesterday, you re-organized your files and you don’t remember where you saved the copy of one of the articles you were working on. Please search for a file named The Hobbit.

2. Files and folders are usually displayed as icons, but you can display them in other ways too. Change how the file manager displays files and folders, to show them as a list.

3. You don’t have your glasses with you, so you aren’t able to read the names of the files and folders very well. Please make the text bigger, so it is easier to read.

4. Please search for a folder or a file that you recently worked on, maybe this will help you find the lost article.
