Friday, December 9, 2016

Examining User eXperience

When I talk about usability and User eXperience (UX) I often pause to explain the difference between the two concepts.

Usability is really about how easily people can use the software. Some researchers attach definitions to it, like Learnability and Rememberability, but in the end usability is all about real people trying to do real tasks in a reasonable amount of time. If people can easily use the software to do the things they need to do, then the software probably has good usability. If people can't do real tasks, or they can't do so in a reasonable amount of time, then the software probably has bad usability.

User eXperience (UX) is more about the emotional response people have when using the software. Does the software make them feel happy, and do they want to use it again next time? Then the software probably has a positive UX. If the software makes users feel unhappy and they don't want to use it again, then the software probably has a negative UX.

In most cases, usability and UX are strongly aligned. And that makes sense. If you can use the software to get your work done, you probably have a good opinion of the software (good usability, positive UX). And if you can't use the software to do real work, then you probably won't have a great opinion of it (bad usability, negative UX).

But it doesn't always need to be that way. Usability and UX can point in opposite directions; it just doesn't happen that often. For example, there's an open source game that I like to play sometimes (I won't name it here). It's a fun game, the graphics are well done, and the sounds are adorable. When I'm done playing the game, I think I've had a fun time. And days or weeks later, when I remember the game, I look forward to playing it again. But the game is really hard to play. I don't know the controls. The game doesn't show you what to do to move around or to fire the weapons. And for a turn-based game with a time limit, it's important that you know how to move and shoot. Every time I play this game, I end up banging on keys to figure out which key does which action. It's not intuitive to me. Essentially, I have to re-learn how to play the game every time I play it.

That game has bad usability, but a positive UX. I don't know how to play it and I have trouble figuring it out, but once I do, I have a fun time playing it and I look forward to playing it again.

And that means we can't rely on good usability to have a positive UX. We need to examine both.

When I mentored Diana, Renata and Ciarrai this summer in GNOME and Outreachy, Diana performed a UX test on GNOME. This was the first time we'd attempted a UX test on GNOME, and I think we all recognize that it didn't go as well as we'd hoped. But I think we have a good foundation to make the next UX test even better.

We did some research into UX testing, and one method that looked interesting was asking test participants to identify their emotional response with an emoji. Diana looked around and found several examples of emojis, and we decided to move forward with this scale:


The emoji range from "angry" and "sad" on the left to "meh" and "?" in the middle, to "happy" and "love" on the right. It seemed a good scale to ask respondents to indicate their emotion.

For the test, we identified three broad scenario tasks that testers would respond to. Each tester logged into a GNOME workstation with a fresh account, so they would get the "new user" experience. From there, each tester was asked to complete the scenario tasks. We intentionally left the scenario tasks somewhat vague, so testers wouldn't feel "boxed in" by the test. We only wanted testers to exercise GNOME. The scenario tasks represented what new users would likely do when they used GNOME for the first time. We asked testers to access their email (we wiped the machine afterwards), and to pretend the files on a USB flash drive were their files from an old machine and copy them to the new computer wherever they saw fit.

The scenario tasks took about thirty minutes to complete. Afterward, Diana interviewed the testers and asked them to respond using the emoji chart. Specifically, she asked "Thinking back to the first ten minutes or so, what emoji represents what you thought of GNOME?" and "Thinking about the last ten minutes or so, what emoji represents what you thought of GNOME?"

From those questions, we hoped to identify what new users thought of GNOME when they first started using it, and after they'd used it for a little while.

However, when we looked at the results, I saw two problems:

We didn't use enough testers. Diana was only able to find a few testers, and this clearly wasn't enough to find consensus. With iterative usability testing, you usually only need about five testers to get "good enough" results to make improvements. But that assumes usability testing, not UX testing. We need more than five UX testers to understand what users are thinking.

We used too many emoji. Ten emoji turns out to be a lot. If you go through the list of emoji, you may ask what's the difference between #3, #4, and #5. All seem to express "meh." And is there a significant distinction between #8 and #9? Both are "happy." There may be similar overlap among other emoji in this list. Having too many choices makes it that much more difficult to read the emotions of users.

I spoke at the Government IT Symposium this week, and I gave three presentations. One of them was "Usability Testing in Open Source Software" and was about our usability tests this summer with GNOME and Outreachy. Attendees had great comments about Renata's traditional usability test and Ciarrai's paper prototype test. People also liked the heat map method to examine usability test results. When I talked about the UX test, attendees thought the emojis were a good start, but also suggested the method could be improved. I asked for help to make this better next time.

These are a few things we might do differently in the next UX test:

Use the emojis, but use fewer of them. Others agreed that there are too many emojis for testers to choose from. With that many options and little variation between them, testers may ascribe a feeling to an emoji that you don't share. With fewer emoji, we should have better reproducibility. Some people suggested five emoji (similar to a typical Likert scale) or six emoji (more difficult to give a "no feeling" response).

Also ask testers to name their feeling. When we ask testers to identify the emoji that represents their emotional response to part of the test, also ask the testers to name that feeling. "I pick X emoji, which means 'Y'." Then you have another data point to use in describing the UX.

Use a Likert scale. One researcher suggested that UX would be easier to quantify if we asked testers to respond to a traditional five-point or six-point Likert scale, from "hate" to "love."
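To make the "easier to quantify" point concrete, here is a minimal sketch of how five-point responses might be summarized. Only the "hate" and "love" endpoints come from the suggestion; the middle labels, the function name, and the sample responses are all my own placeholders, not data from our test. I use the median because Likert responses are ordinal.

```python
from collections import Counter
from statistics import median

# Hypothetical five-point scale. Only "hate" and "love" come from the
# suggestion; the labels in between are placeholders.
SCALE = {1: "hate", 2: "dislike", 3: "meh", 4: "like", 5: "love"}

def summarize(responses):
    """Return the median score and a count per label for a list of 1-5 scores."""
    for score in responses:
        if score not in SCALE:
            raise ValueError(f"score out of range: {score}")
    counts = Counter(SCALE[score] for score in responses)
    # Include labels nobody picked, so gaps in the scale stay visible.
    per_label = {label: counts.get(label, 0) for label in SCALE.values()}
    return median(responses), per_label

# Made-up responses from seven imaginary testers, for illustration only.
sample = [2, 3, 3, 4, 4, 5, 4]
mid, per_label = summarize(sample)
print(mid, per_label)
```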

Use word association. I had thought about doing this before, so it was good to hear the suggestion from someone else. When asking testers to talk about how they felt during part of the test, ask the testers to pick a few words from a word list. Say, five words. The word list could be created by selecting a range of emotions ("love" and "meh" and "hate," for example) and using a thesaurus to generate alternative words for the same emotion. Sort the list alphabetically so similar words aren't grouped next to each other. We'd need to be careful in creating a word list like this, but it could provide an interesting way to identify emotion. How many "love" words did testers select versus "hate" words, and so on? One way to display the results is a "word cloud" on the base words ("love" and "meh" and "hate," in this example).
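The word list idea could be sketched roughly like this. The alternative words, the function names, and the sample selections are hypothetical; a real list would need the careful thesaurus work described above.

```python
# Hypothetical word list: each base emotion maps to thesaurus-style alternatives.
WORD_LIST = {
    "love": ["adore", "delight", "enjoy", "love"],
    "meh": ["bland", "indifferent", "meh", "okay"],
    "hate": ["annoyed", "dislike", "frustrated", "hate"],
}

# Reverse lookup from each word back to its base emotion.
BASE_OF = {word: base for base, words in WORD_LIST.items() for word in words}

def printed_list():
    """Alphabetical list shown to testers, so similar words are not grouped."""
    return sorted(BASE_OF)

def tally(selections):
    """Count how many chosen words fall under each base emotion.

    `selections` holds one list of chosen words per tester.
    """
    counts = {base: 0 for base in WORD_LIST}
    for words in selections:
        for word in words:
            counts[BASE_OF[word.lower()]] += 1
    return counts

# Made-up selections from two imaginary testers.
picks = [["enjoy", "okay", "frustrated"], ["love", "delight", "bland"]]
print(tally(picks))
```

The per-emotion counts are what a "word cloud" on the base words would be drawn from.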

Mix UX testing with usability testing. In this UX test, we decided to let users experiment with GNOME before we asked them what they thought about GNOME. We provided the testers with a few broad scenario tasks that represented typical "first time user" tasks. After testers had experienced GNOME, we asked them what they thought about it. Others suggested it would be better and more interesting to pause after each scenario task to ask testers what they think. Do the emoji exercise or other UX measurement, then move on to the next task. This would provide a better indicator of how their emotional response changes as they use GNOME. And it could show whether a difficult scenario task lines up with a sudden negative turn in UX.
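If we did record an emoji after each task, spotting a "sudden negative turn" could be as simple as the sketch below. It assumes a reduced five-emoji scale scored 1 to 5 and a drop threshold of two points; the scoring, the threshold, the task names, and the sample responses are all assumptions for illustration.

```python
# Hypothetical five-emoji scale, scored from 1 (angry) to 5 (love).
EMOJI_SCORE = {"angry": 1, "sad": 2, "meh": 3, "happy": 4, "love": 5}

def negative_turns(responses, drop=2):
    """Return tasks where the score fell by at least `drop` from the previous task.

    `responses` is an ordered list of (task, emoji) pairs for one tester.
    """
    flagged = []
    previous = None
    for task, emoji in responses:
        score = EMOJI_SCORE[emoji]
        if previous is not None and previous - score >= drop:
            flagged.append(task)
        previous = score
    return flagged

# Made-up responses from one imaginary tester.
tester = [("email", "happy"), ("copy USB files", "sad"), ("third task", "meh")]
print(negative_turns(tester))
```

Any task flagged this way would be a good candidate to compare against the usability results for the same task.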

Add more UX questions. When I asked for help in UX testing, several people offered other questions they use to examine UX. Some also shared questionnaires they use for usability testing, and there's some crossover here. A few questions that we might consider adding include "How would you describe the overall look and feel?" and "What is your first impression of X?" and "Talk about what you're seeing on this screen and describe what you're thinking." These questions can be useful in usability testing, but they also provide some insight into UX.
