Jared M. Spool is a co-founder of Center Centre and the founder of UIE. If you’ve ever seen Jared speak about user experience (UX) design, you know that he’s probably the most effective and knowledgeable communicator on the subject today. He started working in the field of usability in 1978, before the term “usability” was ever associated with computers.
While he led UIE, the industry research firm he started in 1988, the field of UX design emerged, and Jared helped define what makes UX designers successful all over the world. UIE’s world-class research organization produces conferences and workshops around the globe for companies in every industry. You’ll also find Jared as the conference chair and keynote speaker at the annual UI Conference and UX Immersion Conference, and he manages to squeeze in a fair amount of writing time. He is the author of Web Site Usability: A Designer’s Guide and co-author of Web Anatomy: Interaction Design Frameworks That Work.
IA Summit 2015
Topic(s): analytics and metrics
The world of metrics and analytics has always been at odds with how designers work. Design is a process where we finely tune our gut intuition to create a great user experience. Yet, sometimes, the measures we take indicate a different outcome. Which do we believe? Our gut or what the computers are collecting?
In this presentation, Jared will explore the world of measures, metrics, and KPIs. He’ll share the techniques behind Amazon’s and Netflix’s success. He’ll show how some practices, like the growth-hacking approach to increasing Monthly Active Users (MAUs), have hurt the online experience of Instagram and LinkedIn. Plus, you’ll see some alternatives to satisfaction and Net Promoter Score that give insight into the design process and can help designers better tune their gut intuition.
- What do easily collected analytics like bounce rate and time-on-page actually tell us about our users’ experiences?
- How do we construct true Key Performance Indicators (KPIs) that can predict the future patterns of users?
- Why do advanced techniques, like a money-left-on-the-table analysis and the CE11, show us how much more powerful metrics can be in design?
Jared Spool: Let’s talk about metrics.
Jared: [laughs] Wow. Who would’ve thought? In 2010, an Australian designer named Luke Stevens decided that he was going to make a major change in his life. He was going to shift away from just being a freelance designer to writing a book, and the book was going to be on data-driven design.
As many designers who decide they’re going to write a book do as their very first activity, he decided to redesign his personal website.
But, because this was a book on data-driven design, he decided he would actually use data to decide how to design it. He came up with an A/B test. Variant A was about the book.
It was a complete description of what the book would be about as he imagined it, with an enticement that, if you were interested, you could put in your email address. As soon as he had information about the book, he would contact you.
Variant B just asked, “Are you a designer?” It said he was working on a book for designers and, with just that amount of information, asked you to put in your email address. His expectation was that people who were interested in this topic would be more likely to give their email address if they knew what was in the book.
Actually, what he found when he compared the results was that the first variant, the one that described the book, collected 33 email addresses out of the 600 some odd people who came to the site.
The second variant collected 77 email addresses. In his blog post about this he declared that, “This is why we have to design with data. The data tell us what’s going on. The data tell us that this design is better. It converts more.”
That’s interesting because basically what he said was that, “More addresses are better.” That’s an interesting supposition. The other thing he said was in essence that “All email addresses are equal.”
That’s also an interesting supposition because what we are assuming here is that the people who put in their email addresses, not knowing anything about the book, are just as likely to buy the book as the people who put in the email addresses knowing about the book.
If in fact he sent these people email about the book, which one would generate more sales? That wasn’t what he was measuring. What he was measuring were email addresses. We will never know which one would sell more because he never wrote the book.
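To put rough numbers on that comparison, here is a minimal sketch, assuming variant B saw roughly the same traffic as variant A; the talk only gives the 600-odd visitors for variant A, so that denominator is an assumption.

```python
# A rough sketch of the comparison in the A/B test described above.
# The talk gives 33 signups from roughly 600 visitors for variant A;
# variant B's visitor count isn't stated, so 600 is an assumption here.
signups = {"A (describes the book)": 33, "B (just asks 'Are you a designer?')": 77}
visitors = {"A (describes the book)": 600, "B (just asks 'Are you a designer?')": 600}

for variant, count in signups.items():
    print(f"{variant}: {count} signups, {count / visitors[variant]:.1%} of visitors")

# B collects more addresses per visitor, but nothing in this number says
# which list would have sold more books, which is the point that follows.
```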
Jared: That’s not what I wanted to talk to you about. What I wanted to talk to you about was the email addresses. The email addresses and these two interesting assumptions that he put in his blog posts, that in fact these things are the right things to measure.
Now, these things are structurally different. The first one is what we would call an observation. It’s the actual thing we can observe. The second group are inferences. Those are the things we infer from the observation and from the experiment itself.
It turns out these are two essential elements of what we do as designers. We conduct research and get observations; from those observations, we glean inferences; and from those inferences, we make our design decisions. Based on the inferences he made, he would choose one of those design elements. That’s how we do this.
We can take this apart and draw this out and say in fact that the second variant had more email addresses, therefore more email addresses are better and therefore we should go with the second variant.
That’s the process that we use. Really, what we’re saying here is what did we see and what does it mean? This is a website, a version of the Wells Fargo website, from 2004. You have to have the context of this.
The previous Wells Fargo website had a gray background. That’s how old this time period was, and this was a radical departure for Wells at the time. They were studying it in incredible depth.
We were interested in what they were learning because it was fascinating the way they were looking at it. They were looking at every possible piece of data they could get. One of the pieces of data they decided to dive into was the search log files.
They were fascinated by the search log files and what was fascinating them about the log files was that now they could learn all sorts of things about what search could tell them, what were people looking for.
Before they jumped in, we were hypothesizing about what we would find when we dug into the search log files. We thought we’d see lots of people searching for automatic teller machines, lots of people looking at mortgage rates, and all of these things.
It turned out the most popular thing in the log file, the entry searched for most often, was nothing. Blanks. Search was continually being used with no search query. Now we had to figure out why, and there were a bunch of inferences that came out of this.
One theory was that it had to do with the user name and password field, that people were typing in the user name and the password. They were hitting the “enter” key, and because the focus was set wrong on the page, they were doing a search with an empty search field. That was a good theory.
Another theory was that the user had not entered any search whatsoever. On the previous site, there was no box to search in, and this was still, to some extent, a new pattern for users.
They thought, “Well, maybe there were a lot of users out there who were hitting ‘search’ hoping it would then ask them what to search for.” Another theory was that their users were special, and they were looking for advanced search.
Jared: That what they really wanted was to be able to use those Booleans. That’s actually what a participant in one of our studies called it. “I want to use Booleans.”
Jared: They were looking for the Booleans, and for that, they needed the advanced search. Then there was a small contingent, but a sharp contingent, that had decided that maybe this wasn’t a user issue at all.
Maybe the log file was just not recording properly, and there was something that was actually just putting blanks in the data. So that was it. We have four different theories based on one observation. We can map this out. Our observation is that the file is filled with blanks, and each of our four theories is there.
Now here’s the thing. The design decisions based on each of those four theories are completely different. What we end up doing depends on which of those inferences we believe. The problem we have is that we don’t know which inference to choose.
Interestingly enough, almost every observation falls into this problem, but for years, none of us have noticed, because most of us do the same thing. Whatever the first inference to come to mind is, that’s the one we run with. So whichever was the first of these to come with, that would be the design decision we’d go with.
Of course, that’s wrong, because we’re not considering what it really could be. We have no evidence to suggest that whatever we thought it was at first was in fact the right thing. Whatever we end up doing for a design decision is more likely to be a mistake than an actual improvement to the site.
The best designers don’t stop at the first inference. They keep going. They try to figure out all the different possible things it could be, and then they work with that. The way they work with that is they do more research.
You can take an inference and you can actually conduct an experiment around that inference. Let’s say we watch users. Suddenly we know that users didn’t know how to type a query into the box. That immediately eliminates three out of four inferences.
The tools we have, research, turn inferences into observations, which allows us to make better decisions. I recently collected some data from the home pages of major web properties. This is the data. I want you to guess what it was I collected.
Now, I’ll give you a hint. This is tightly correlated with the valuations of these companies. What do you think the variable I’m measuring here is? Any guesses? Shout it out.
Audience Member: Audience!
Jared: Audience. Yeah, this is a home page measure. So specific to audience, what do you think? Size? Age of audience? maybe.
Audience Member: Number of words.
Jared: Number of words on the page. Maybe. What else?
Audience Member: Ads.
Jared: Number of ads on the page.
Audience Member: How much blue.
Jared: How much blue is on the page. You’re getting close. [laughs] Really? That’s where we’re going with this?
Jared: Wow! You guys are sharp, because it’s actually the number of instances of the letter “E” on the page.
Jared: Snapchat is devoid of Es for the most part. Airbnb has a few, whereas Uber is tops with Es, and Facebook has Es out the wazoo.
Here’s the thing. If I were to show this talk to a bunch of VC people, they’d immediately get on the phone and tell their properties to start increasing the number of Es on the page because it’s tightly correlated with valuation.
Jared: Correlation is not causation. There’s another truth which is counting the letter E is a stupid metric.
Jared: We can clearly see that it’s a stupid metric. It’s almost as stupid as the color blue. Yet, it’s what we do. Not counting Es, but using stupid metrics. Let’s take this word apart, metric. What do we actually mean when we talk about that?
It turns out there’s a difference between a measure and a metric. A measure is something we can count. Something we can measure. Anything can be a measure. Interestingly enough, almost anything could be a metric because a metric is something that we track.
That’s all it is. A metric doesn’t necessarily mean it has any relevance to anything we do. There’s another term we use all the time: analytic. An analytic is a measure that software can track.
There are lots of measures that software can’t track. There are lots of metrics that we can’t get out of our analytics. Analytics are things the software can do.
This is an analytic. It’s time on page. This particular analytic was for a two-month period after I had written a fairly popular article. I want to draw your attention to December 17th. Why is there a four-times jump in time on page on December 17th? Any theories?
Audience Member: Christmas break.
Jared: Christmas break, people are tuned out at work, it’s the pre-Christmas boredom period?
Audience Member: They opened a tab to your article and they were doing their [inaudible 13:40] Christmas shopping in another article. Every time they’re lost [inaudible 13:42] back.
Jared: They opened my article, then went shopping, and my article was the thing they went back to every time the boss walked by.
Jared: You work for the government, I can tell.
Audience Member: [inaudible 13:58] United Airlines variable [inaudible 14:01] .
Jared: Why would being right before travel season cause people to spend more time reading an article about United Airlines, then…
Audience Member: Because of what you said in the article.
Jared: Because of what I said in the article. It had nothing to do with United Airlines. Actually, I don’t usually write about United Airlines because I don’t want to give them the love.
Jared: My theory is that December 17th was International I Have to Go Pee More Often Day. That people were just peeing more often primarily while reading my article.
Jared: That’s the only thing I have. None of these make sense because time on page, by itself when it changes, makes no sense whatsoever. There is no meaning to the metric of time on page.
The only reason we get this metric is because it’s an analytic. It comes in the package of analytics that we get and it’s because computers can measure it easily. Some engineers decided, since we can measure it, we should give it to the users and they’ll figure out what to do with it.
That’s what it is. That’s what almost every analytic gives us. Almost all of them fall into this category of things we don’t know why we have.
Let’s take another popular one. One that’s the darling of SEO people, bounce rate. This is the bounce rate on that article for the same period. You can see that on December 17th, it did not keep any more people even though they had to pee. Nothing about this article did that.
The thing about this metric is we have variation from one day to the next. What did we do differently to cause that variation? For one thing, is a lower bounce rate better, or a higher bounce rate? A bounce means a person comes to the site and then doesn’t do anything else on the site.
It might mean that the article is obviously not engaging, because people are coming to the site and then leaving. It might be that the article is exactly what the person wanted, and they come to the site, they’re done, and they leave.
Which of those is the thing we’re fixing? What’s the inference here? There’s no inference here that means anything, so we cannot do anything with this analytic. By itself, it is useless. It turns out that in combination with any number of other things, it is still useless.
The thing about it is it is data and we can make it tell any story we want. Data like this I refer to as an agenda amplifier. People come to the data with their agenda.
They look at it and go, “Oh my God, that bounce rate is awesome. It’s supporting exactly what I want you to do,” or, “Oh my gosh, we have to fix that bounce rate. It’s supporting exactly what I need you to do.”
It’s an agenda amplifier and we use it for whatever agenda we want. The beauty of it is it will tell any story we want. The reason it will do that is because if we torture data long enough, it will confess to anything we want.
Google Analytics fails us on a regular basis because it can’t tell us the things we need to do our work as designers. It can’t tell us if the content we’re producing is useful. It can’t tell us who the most important user on the site is.
It cannot tell us what those users do differently from other users, so that we could potentially design for them differently. It doesn’t tell us why somebody quits. It can’t tell us why anything is happening on the site. Google Analytics is not the only criminal here. There are other things we use, like conversion rate.
Conversion rate is a particularly nasty one because it’s a ratio. As a ratio, it has two things we can manipulate. The number of people who purchase and the number of people who visited. What we do is we take the number of people who purchase then the number of people who visited.
Let’s say a million visitors come. We divide that into the number of people who actually purchased on the site and we get ourselves a ratio. In this case, one percent of the people who came to the site.
Of course, if we want to get more people to purchase, we would want this number to go up. We can use this to track whether we’re, in fact, getting that, correct? Here’s the problem. It’s a ratio.
One way to get the ratio to go up is to take twice as many people and get them to purchase over the same number of visitors. That would get me a two percent conversion rate. Unfortunately, I can also get a two percent conversion rate by taking the same number of purchasers and dividing by half the visitors.
By actually reducing my marketing spend, I can increase conversion rate. This is the thing. A few months ago, I was sitting in front of a room full of executives of e-commerce sites.
I said, “I know exactly how to raise your conversion rate dramatically tomorrow. Just stop spending on marketing. Reduce the number of visitors. Your conversion rate will skyrocket, because the only people who come to your site are your most loyal customers. They’ll keep coming, because they don’t care about your advertising spend.”
They all went, “Wait, what?”
Jared: I had to draw it out. I had to do the math. I had to say, “Look, 10,000 purchasers times a $100 average purchase price is $1 million. Sure enough, if we get 20,000 people to purchase, that brings it to $2 million, but if we only have 10,000 people it’s only $1 million.” They’re like, “These numbers, they confuse me.”
“Let me make this clearer. I have a two percent conversion rate at $2 million and a two percent conversion rate at $1 million.” “Wait, what?” “Here’s the deal. Behind door number one is conversion rate. Behind door number two is revenue. Which one do you want me to increase, because they are independent of each other?” “Uh, revenue.”
They had to think about it. “Then why are we talking about conversion rate?” Conversion rate is a complete red herring. It tells us nothing.
Trying to optimize for conversion rate is like trying to optimize for the number of emails we collect. It’s only an intermediate step. It’s not the end result. It’s not the objective of the organization.
No one gets big rewards on conversion rates. Something else has to happen. We need to focus on the thing that has to happen. It gets worse, because people don’t even know how to count conversion rate.
If I have the type of product where you come to the site four times before you finally make the purchase, say a car, or an insurance policy, or something like that, do you count those four visits with one purchase as a 25 percent conversion rate?
Or are those four visits with one purchase, with one person buying, a hundred percent conversion rate? It turns out our software can’t tell the difference most of the time, so we don’t know. What are we actually measuring?
We’re not measuring anything useful. If we have this multi-visit, intensive purchase and we optimize only for the last visit, we are screwing our customers. Here’s the thing. We don’t know what to do to fix it based on just conversion rate. We don’t know the why.
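To see the ratio problem in one place, here is a minimal sketch using the same made-up numbers as the example above (a million visitors, a $100 average purchase): two very different outcomes report the identical conversion rate.

```python
# Conversion rate is purchasers / visitors, so it can be moved by either number.
def conversion_rate(purchasers, visitors):
    return purchasers / visitors

def revenue(purchasers, avg_order_value=100):  # $100 average purchase, as in the example above
    return purchasers * avg_order_value

# Baseline: 10,000 purchasers out of 1,000,000 visitors
print(conversion_rate(10_000, 1_000_000), revenue(10_000))  # 0.01  1,000,000

# Double the purchasers: conversion doubles and revenue doubles
print(conversion_rate(20_000, 1_000_000), revenue(20_000))  # 0.02  2,000,000

# Or just halve the visitors by cutting marketing: conversion doubles, revenue doesn't move
print(conversion_rate(10_000, 500_000), revenue(10_000))    # 0.02  1,000,000
```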
Let’s play another game. I’m going to show you a bunch of words. See if you can pick out the one that’s different. We have delightful, amazing, awesome, excellent, remarkable, incredible, and satisfactory.
Does one seem to jump out to you? I’m going to bet it’s satisfactory. Satisfactory is not a particularly good endorsement of something.
It’s the marketing equivalent of a restaurant declaring their food to be “edible.” Nobody ever says, “Oh my god, that meal last night was amazing. It was so edible.” Nobody says that. When we shoot for satisfactory, we are setting our bar incredibly low.
We are not doing anybody any favors by creating these massive satisfaction surveys where we ask people to rate things in terms that they are unlikely even to understand, like this airplane Internet survey that asked me what the ease of connecting to the Gogo Inflight signal was. How do you even begin to break that down?
What would be the difference between very satisfactory and somewhat satisfactory in choosing your network SSID? What would we do to get it from somewhat satisfactory to very satisfactory if, in fact, the average was just that, somewhat satisfactory? This is the problem.
When we construct these scales, we often start with a neutral, and then we say, “We’re going to be satisfied, dissatisfied, and then we’ll add some criteria here that somehow makes a difference.” This is never calibrated with the user, so we don’t actually know what’s going on.
The minimum we could do is take satisfied and make that the neutral point. Then, on either side, put delighted and frustrated and go with that. Hey, either it was really frustrating or it was really delightful. I’m not really sure what the difference between somewhat delighted and extremely delighted is.
Let’s just go with the fact that some people don’t like to choose extremes in surveys, so we’ll add more points. Then we get into this crap, the 10-point scale. Here’s a pro tip. Nothing communicates “I don’t care” like a 10-point scale. [laughs] Here’s another pro tip. 10-point scales make noise feel like science.
What is the difference between a six and a seven in a 10-point scale? What do you do differently? Last week we were getting sixes. Now, we’re getting sevens. We win. What is that? Don’t even get me started on Net Promoter.
Net Promoter, which uses an 11-point scale, is all about trying to figure out whether the people at the top of the scale are dominating the people at the bottom of the scale. That’s really what it’s about, but it’s all fake science. There’s nothing real behind any of this, because none of this can tell us what we do differently in our design.
When our Net Promoter score goes down, we don’t know what we broke. When our Net Promoter score goes up, we don’t know what we fixed. We don’t know. The fact is it changes all by itself when we don’t do anything. These metrics don’t work. What we need are metrics that help us improve our experience.
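For reference, this is the textbook Net Promoter arithmetic, a standard formula rather than anything specific to this talk. Even written out, it only reports how the top of the scale compares with the bottom; nothing in it says what to change in the design.

```python
# Standard Net Promoter calculation on the 0-10 scale: 9-10 are promoters,
# 0-6 are detractors, 7-8 are passives. The score is the percentage of
# promoters minus the percentage of detractors.
def net_promoter_score(ratings):
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

print(net_promoter_score([10, 9, 9, 8, 7, 6, 3, 10]))  # 25.0, but it says nothing about why
```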
The thing is we already have the tools. We already have what we need. We just have to use them to our advantage. The place to start is the journey map. The journey map is a way for us to take the activities that a user does in our designs and actually rate them on a scale of frustrating to delightful.
From there, we can then map out what parts were frustrating and what parts were delightful. This, by itself, has value because we can hone in on the parts of the process to find out what is causing the frustration or what we did that was delightful.
Using plain old observational techniques like usability tests and field studies, we can produce these things and know what we’re supposed to do differently. We can hone in on all the things that we build into our designs that make customers frustrated, and then we know what to do to fix these things. We know how to work on this.
We can dive pretty deep into what these things might be. For instance, we can look at error messages. We can look at the error message that comes up that tells you that you left spaces and dashes in your phone number.
This is the most boggling error message to me, because it actually takes 10 lines of code to produce that error message, whereas it only takes one line of code to actually remove the spaces and dashes from phone numbers.
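That one line really is about one line. For example, in Python (a generic illustration, not the site’s actual code):

```python
# Strip spaces and dashes from a phone number instead of rejecting it.
phone = "415-555 0123"
print(phone.replace(" ", "").replace("-", ""))  # 4155550123
```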
Jared: Or the error message that tells us that even though we put the security code in the last time we pressed submit, it got removed when we got the error about the phone number, and now we have to enter it again, and we didn’t notice that. The most canonical of evil error messages, “Username and password does not match.”
We were asked to work on the checkout process of one of the world’s biggest e-commerce sites. They had made a lot of changes. They had thought they could do better, but nothing was moving their numbers. They came to us and said, “What can you do to help?”
The first thing we did was we said, “What’s the checkout process?” The checkout process is basically a five-step process that is the same on almost every e-commerce site. There was nothing special in theirs. We said, “OK, how do you know that this needs improvement?”
They brought out their page view data which showed a steady decline from the first page of the checkout process all the way through. There’s nothing in this data that told us why this was happening, so we said, “We’ve got to do usability tests.”
One of the things we know about usability testing e-commerce sites is that you can’t have people pretend to buy things. You actually have to have people really buy things.
When they pretend to buy things, they actually behave very differently than when they really buy stuff. If you want to know what’s happening on a site, you have to have people really buy stuff.
We said, “We’re going to run usability tests, but we’re going to have people really buy stuff. Any chance we could get the page view data so we can compare that and we know what we’re working against here?” They said, “Sure.” They gave us the page view data for the points up until pressing the checkout button on the shopping cart, and then the checkout process.
There was this fascinating dip that really surprised us. We said, “You know, this is kind of a big dip. We should worry about this dip, this thing between the checkout button and the first screen of checkout.”
They said, “No, no, you don’t need to worry about that. You don’t need to worry about that, because this is a problem on all e-commerce sites. 75 percent of all e-commerce sites have abandoned shopping cart issues.”
Abandoned shopping cart issues. This has always struck me as odd. I don’t understand why, if this is so common on e-commerce sites, we don’t see it in real life.
I expect to go in the grocery store and see all these abandoned shopping carts as if the rapture had just happened and only the sinners were left shopping. Yet, that does not happen. People do not abandon their shopping cart in the grocery store, so why does it happen online?
They said, “No, everybody does it. This is the way it works. Anyways, you don’t need to worry about this. We have a plan to fix it. Our marketing department’s going to fix it. They’re just going to email everybody who abandons the cart until they purchase.”
Jared: That’s out of scope. Check. Got that. Then, we decided to look at the usability test. What we found in the usability test was that there were things that were happening between review shopping cart and entering the shipping information.
In particular, the big thing that was happening was that people suddenly had to log into an account. They didn’t tell us about this, and we hadn’t noticed it because, frankly, we were avid shoppers on this site and we didn’t have to log in. It was cookied on our machines.
As soon as we were in the test in the lab, we saw this happen, because those machines didn’t have the users’ cookies. Therefore, they had to log in. What was happening then was that they couldn’t figure out what their username and password were, because people change their usernames.
The username was an email address, and people couldn’t remember which email address they had signed up with. They couldn’t remember which password they’d used. Some of these accounts were created years before, so they had to go to the reset password request, then they would have to get an email link, then they would have to put in a new password.
It was all those steps that were happening. We’re in the lab and we’re watching and we’re thinking, “Wow! OK, maybe that explains this big dip,” but we still didn’t think too much about it. We wanted to see what the footprint of this was, so we said, “Can we get the page view data for these things?”
They immediately said, “Yes,” followed by an immediate, “No,” because it turns out no one had ever thought to instrument any of these steps, followed by a not-so-immediate three-week pause, after which they came back and said, “OK, we have the data.”
What the data showed us was that the log-in-to-your-account page actually had three times the number of views as the checkout button had clicks.
That was weird, because the only way you can get to that page is by clicking the checkout button. Why wouldn’t the views just be one to one? Why would it be three times as many? That made no sense.
Then we noticed that the reset password link had this huge dip, followed by another big dip around clicking on the link in the email to reset the password, followed by a smaller dip around putting in a new password, and a larger dip around getting to the shipping information.
It was the usability test that explained why this was happening. It turns out that three-times increase was because they were counting each time the person failed to put in the right username and password. Each of those failures was an error message view.
The password reset was because many people did not remember which email address to put in. The failure to click on the link was because they were not getting the emails.
It wouldn’t give you an error message if you put in a bad account, because the security people had said, “Well, that tells the bad guys who has accounts and who doesn’t.”
They gave no feedback whatsoever. People would sit and wait for emails that never came. Then we asked another question. This turned out to be the million-dollar question. How much money was in the shopping carts of all the people who didn’t click through that process?
When we looked at that, the number came to $300 million. We were wrong. It wasn’t a million dollar question.
Jared: It turned out that within a few months, the team had been able to put in what’s called guest checkout, and that increased revenue by $300 million. Our prediction was accurate.
The way we got to that $300 million was by combining the qualitative usability research with the quantitative custom metrics. Notice that, in all of this, page views were almost useless. Almost, not completely.
The real money was in unrealized shopping cart value. When we went to senior management to get guest checkout approved, it wasn’t page views that won the day. It was unrealized shopping cart value. That was something that the analytics product never collected. We had to write our own code to get that number.
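As a hedged sketch of what that custom number might look like in code: the field names and figures below are invented for illustration, since the talk doesn’t show the team’s actual instrumentation, only that the value came from adding up the carts that never made it through.

```python
# Sum the cart totals of sessions that never made it through checkout.
# Data shape and field names are assumptions made for this illustration.
sessions = [
    {"cart_total": 142.50, "completed_checkout": False},
    {"cart_total": 89.00,  "completed_checkout": True},
    {"cart_total": 310.25, "completed_checkout": False},
]

unrealized = sum(s["cart_total"] for s in sessions if not s["completed_checkout"])
print(f"Unrealized shopping cart value: ${unrealized:,.2f}")  # $452.75
```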
It turns out all the important metrics require that level of effort. The way we succeed at using metrics is by driving the quantitative findings from our qualitative research. They have to work together.
Here’s the deal. The initial inferences that the team had were wrong. They assumed a whole bunch of things and were acting on those inferences. It wasn’t until we actually sat down and compared the qualitative and quantitative data that, sure enough, our experience journey told us right off where the problem was.
We did not have to go much further than that. We kept assuming that what we were seeing in the lab was not accurate with what was happening in the real world. It turns out we were wrong about that. It was completely accurate with what was happening in the real world.
We could look at the frustration in this case, and that told us what was happening in the data, and we could then recover $300 million. Observations trump inferences. The deal here is that people give me excuses for why they can’t do this. The analytics people work in a different department.
That’s the way it’s set up in a lot of places. We own that qualitative user research at the top. The analytics, marketing research, market insights, customer experience stuff is owned by a different team.
That’s got to stop. We need to take ownership of that stuff. They can have it for their market planning. The behavior stuff, we need that. To do that, we need to get smart like this dude. This is DJ Patil. He is the Chief Data Scientist of the United States. He works for the president.
The president has realized that he needs data, so he hired this guy. Data science is now an essential skill for every UX team. I completely agree with that.
Jared: It’s no longer acceptable for us to just say, “Well, I don’t understand these metrics. Somebody else is understanding these things.” If you don’t understand them, it’s probably because they don’t mean anything.
It’s no longer acceptable to say, “I’m not good with numbers.” We don’t need sophisticated numbers. We did not have sophisticated numbers to get to $300 million. The only operation we used was the plus sign.
It turns out we don’t need to do a lot of math. We just need to do some arithmetic. This is key. This is about design and design is the rendering of intent. Twitter created something called Twitter Cards that lets you see pictures and other things in your Twitter stream.
When the guys on the space station tweet images from the space station, you see those images right there on the Twitter stream. You don’t have to do anything to get it. It creates a great experience for the Twitter user.
Some pictures, like those from Instagram, don’t show up. When Mike Monteiro posts something from Instagram, you can’t see it in your Twitter stream. You have to click to see it. Trust me, it’s worth doing.
Jared: Why is it that Instagram doesn’t do this? It turns out Instagram used to do it, up until the day they were acquired by Facebook. Everybody assumed that Twitter had turned it off because Instagram had been acquired by a competitor, but that wasn’t true. Twitter never turned it off. It was Facebook that turned it off.
The reason Facebook turned it off is because of a metric, a metric called monthly active users. The entire Silicon Valley environment now is focused on this metric. It’s the number of users who use your product each month.
Twitter knows who is reading data from the stream, but Instagram does not know who’s reading data from the stream, so it cannot count that as a monthly active user. In order to get a better monthly active user number, which, by the way, correlates strongly to market valuation, they decided to make you click.
If you click, you are now a user. This click and you’re now a user thing is happening everywhere. LinkedIn used to put the text of a reply in the email message, now you have to click. Facebook used to put the post that someone is commenting on in the message, now you have to click to find out what the post is about.
This is what happens when we let design be driven by the metrics. In this case, design is being driven by monthly average users. In order to increase monthly average users, we screw with the user’s experience.
Remember, the medium of design is behavior. If we want users to have great behavior, we have to make sure we’re designing for the right things. We’re seeing some evidence of this. Medium, for example, measures views, but they also, interestingly, measure reads, and they measure recommendations.
They put as much emphasis on reading and recommendation as they do on plain views. In fact, they actually give you a little bit of weight in terms of how many people are reading all the way through the piece.
It’s a simplistic metric. It’s flawed, because their definition of reading is someone who scrolls slowly to the bottom. But it’s an interesting metric in that it’s trying to get at an actual behavior. We have to have our metrics around behavior.
Let’s go with how people feel about us. We know how people feel about Apple when they are the first to line up at four o’clock in the morning to get whatever product is being released that day.
Interestingly enough, these are the same people who are very vocal about how much they thought the product was a disappointment during the keynote when it was announced. Yet, here they are sitting in that line.
We know how people feel about Harley-Davidson when they go so far as to have the Harley-Davidson logo tattooed on their body. This is branding in the most primitive of notions. These are people who are truly branded for Harley-Davidson.
We know how people feel about United. At least I do. When they decided that they’re going to communicate to the world their disappointment in United’s service. We don’t capture that information here. We can’t tell any of that here. This doesn’t work.
There is, in fact, an instrument that does. It was created by the folks at Gallup. It’s written up in this fabulous article called “The Constant Customer,” and it’s called the Gallup CE11. It’s basically an 11-question instrument that tells us exactly how people feel.
For example, one of the questions is whether we agree with the statement that this product’s company always delivers what they promise. We can certainly see that United would not score well on this for Alton Brown.
There’s this question under pride, which says, “I’m always proud to be a customer of this product’s company.” Harley-Davidson customers have definitely shown that they agree with this. There’s this question that says, “I can’t imagine a world without this product’s company.”
All those people lining up for Apple products would probably say that they agree with this statement. There are people out there who would probably agree with this statement if it was from Microsoft too. There just aren’t as many.
The CE11 asks 11 questions this way. It’s interesting. It’s a three-point scale. We can assign a plus one for agree and a minus one for disagree. Sure enough, we get this lovely 11-point positive, 11-point negative scale. What we then see is change.
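A minimal sketch of that scoring, assuming eleven agree/neutral/disagree answers; the responses below are invented, and the actual CE11 questions belong to Gallup.

```python
# Score eleven agree/neutral/disagree answers as +1 / 0 / -1 and sum them,
# giving a value on the scale from -11 to +11.
points = {"agree": 1, "neutral": 0, "disagree": -1}
responses = ["agree", "agree", "neutral", "disagree", "agree", "agree",
             "neutral", "agree", "disagree", "agree", "agree"]

print(sum(points[r] for r in responses))  # 5 on the -11 to +11 scale
```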
Here’s an example. We studied five different companies, watching people shop for major electronics on each of these sites, often laptops, things like that. People actually buying what they wanted.
We would instrument before and after with the CE11, and what we saw was that for most of the brands, it dropped, except for Walmart, where it went up, because people’s expectations were so low to begin with. Anything was an improvement.
Jared: Here, we can see the change but here’s the really cool thing. We can see why the change happened because the way we collected this data was by sitting next to people in usability studies.
We could actually see the change, and we could see why they were rating them the way they were. We could see what happened. Our metrics have to drive us to eliminate frustration and deliver delight. Delight and engagement, which is what the CE11 measures, go hand in hand.
You can learn more about the CE11. I’ve created a short link for it on the Gallup site, bit.ly/gallup-ce11. I highly recommend you look at it. It has all 11 questions and it’s fantastic. That’s what I came to talk to you about.
We need to make sure that our metrics are actually helping us improve the experiences we’re trying to work on. We can’t use the out-of-the-box metrics. We have to focus on our own.
We have to start by being very conscious of when we jump to inferences, making sure we’re exploring all the inferences and using our observational skills to bring those home. We have to customize the metrics that we use to actually be around the business problems and the experience challenges we’re trying to solve.
Finally, data science is now an essential UX skill. It has to be something that we’re hiring for. It has to be something that we’re teaching for. If for some reason you found this to be the least bit interesting, I have a whole bunch of stuff written about this on our website at uie.com.
If you are not connected to me on LinkedIn, I don’t know why not but please do use this email address. It’s a great way for me to find out more about what you’re doing and feel free to pop me questions there.
Finally, you can follow me on Twitter where I talk about design, design strategy, design education and the amazing customer service techniques of the airline industry.
Jared: Ladies and gentlemen, thank you very much for encouraging my behavior.
Jared: Thank you.