How to Fix Uber and Lyft’s Rating System

Harry here. The idea of a rating system sounds great on the surface, but the way it’s executed today leaves drivers wanting more. Today, senior RSG contributor John Ince shares what it’s like to be a driver in this system and how it can all be fixed.

I vividly remember my first introduction to ridesharing almost three years ago. I was meeting with my Lyft “mentor” who was supposed to give me a test drive and answer any questions I had about becoming a driver. But he seemed rushed and distracted, which I later discovered was because he was online – waiting for a ride request. He hurriedly showed me the basics of using the driver app, including the rating system: “The passenger rates you from 1 to 5. These ratings are important because, if the average of your last 100 ratings drops below 4.6, you’re in jeopardy of being de-activated.”

A Harsh System

“Wait,” I said, “With the bar set that high, it sounds like one bad rating from a frustrated passenger could kill your average.”

“That’s right,” he said. “It’s a pretty harsh system.”

His comment lingered in my brain as his phone pinged with a ride request. He tapped to accept and hurriedly said, “I gotta go.”

Standing there feeling confused and alone, I did some mental gymnastics. Let’s say you’re a 4.8 rated driver and one passenger is in a really bad mood. Let’s say you did something minor, like maybe took a wrong turn, or made some offhand comment that was taken the wrong way, or you didn’t notice a mess a prior passenger left in the back seat. Even though you didn’t do anything really bad, they still rate you one star.

Figuring this out, a 5 star rating sits .2 above a 4.8 average, but a one star rating sits 3.8 below it. So let’s see… 3.8 divided by .2 is 19, so to recover from one 1 star rating, you need 19 five star ratings. Yes, that is a very harsh system.
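For anyone who wants to check that math, here is a minimal Python sketch of the same arithmetic. The function name `five_stars_needed` is made up for illustration, and it assumes a plain average of ratings, which may not match exactly what Uber or Lyft compute.

```python
import math
from fractions import Fraction


def five_stars_needed(target_avg, bad_rating, best_rating=5):
    """Count the top ratings needed to offset one bad rating.

    Hypothetical helper: assumes a plain average of ratings.
    Fractions avoid floating-point rounding on values like 4.8.
    """
    target = Fraction(str(target_avg))
    deficit = target - bad_rating        # how far the bad rating sits below the target
    gain_per_ride = best_rating - target  # how far each top rating sits above it
    if gain_per_ride <= 0:
        raise ValueError("target average must be below the best possible rating")
    return math.ceil(deficit / gain_per_ride)


print(five_stars_needed(4.8, 1))  # 19
```

The same function shows that simply staying at the 4.6 deactivation line after a one star rating takes 9 five star ratings.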

The more I drove, the more I realized that it’s not only harsh – it’s unfair too. A lot of passengers have no idea the system is that harsh, so they routinely rate drivers with a failing grade – even for a good ride. A while back I started taking a poll of my passengers. I would say, “Hey I’ve got a question. You know the ratings system, do you think 4 stars is a good rating?”

About two thirds said, “Yes.”

Different People and Different Areas Use Different Standards

One can easily understand why many passengers think 4 stars is a good rating. On Amazon 4 stars is pretty decent. The same goes for Yelp. I once had a passenger who owned several restaurants in the San Francisco Bay Area, and he told me that restaurants live and die by ratings. He said, “On Yelp, 4 stars is a really good rating.” However, he pointed out that in other cities, 4 stars might not be so good, “like Philadelphia,” where he also owned a restaurant.

In my poll, when I told passengers that a driver with a 4.6 is likely to be de-activated, they were amazed. One passenger said, “Oh I feel so bad now. I almost always give drivers 4 stars – unless they do something really amazing.”

“Nope,” I said. “For a driver, a 4 star rating is the kiss of death. It’s like you’re saying to the company, this driver should not be using our platform.”

Power Games

For drivers and passengers, the ability to rate is a form of power. Online message forums are filled with stories of drivers retaliating by giving someone a low rating, and last week I came upon a post on a message board that goes the other way.

The post actually instructs students on campuses to “blast” drivers by giving them a one star rating for no reason at all – among other things. Ratings are like a power trip to some people, but the ratings directly affect the livelihood of many people who depend upon this gig to make ends meet.

Bad Ratings for Things You Can’t Control

Often if a passenger is new to the platform, they will assume that the rating is about the company, not the driver. If they get hit with a heavy surge, they’ll sometimes take it out on the driver in the form of a low rating. If a driver gets stuck in a traffic jam, the passenger may feel it’s the fault of the driver.

They’ll then ding the driver because they think the driver should have known better. In UberPOOL and LyftLine, delays in picking up secondary passengers can get blamed on the driver. Even altercations between passengers in an UberPOOL sometimes cause low ratings. None of these things is under the driver’s control.

No Opportunity to Challenge a Low Rating

One time I gave a Lyft ride to a woman in San Francisco who left her cellphone in the car. It was my last ride of the night and immediately after dropping her off, I turned off my cellphone and went home. I didn’t get the message about her lost cellphone until the next morning.

When I awoke, I also noticed that my ratings had plunged overnight. I knew it had to be her, because it was the only ride I’d given on the Lyft platform that night. She apparently had gone into such a panic about her cellphone, and the fact that I wasn’t responding, that she dinged me. I contacted Lyft support, explained the situation and asked to have the poor rating removed. “Sorry, we can’t do that,” I was told.

No Feedback

The rating systems are designed in such a way as to mask the source of a low rating. One can certainly understand why Uber / Lyft might want to protect passengers from possible retaliation, but by doing so they preclude the possibility of providing the driver with feedback that might help them improve as a driver. Compliments really don’t help there.

What would be useful would be specific comments on what you might have done wrong – and right. Lyft used to include a copy of the comment by the passenger and give you an opportunity to respond. They no longer include the comment. They give you something vague and ask you to respond – often without specifying what you’ve been accused of. Uber’s approach is even more opaque.

A Fiercely Competitive Corporate Culture

The harshness of the rating system is no accident. For the longest time I figured this kind of fear based motivational system was directed primarily at drivers, but with recent revelations from Uber employees, we now know this fiercely competitive attitude is baked into their corporate culture. Here’s a quote from a recent article in Quartz that discusses this issue.

In interviews with Quartz, half a dozen current and former employees described the company’s internal review process as competitive, very unfair, and a black box. Uber managers rank their employees twice a year on a scale from one (low) to five (high). Three is average, five so rare that it’s reserved for Jesus or Travis, one former employee joked.

Employees who receive twos or ones are considered underperformers and placed on performance improvement plans, which also serve to warn that they could be fired or let go down the line. There are other consequences to a low rating. Being an underperformer means not being able to transfer teams, multiple people told Quartz, a plight also described by Fowler in her blog post. Ratings also affect employee bonuses, which are granted largely in equity. If you get a very high performance rating, that is a way to do very well, a former employee said. Uber declined to comment for this story.

It’s understandable why Uber would want to foster a culture of fierce competitiveness. Ride-hailing is a cutthroat and unforgiving industry that many believe has room for only one winner. As the most highly valued startup in the world, at $68 billion, Uber is under a tremendous amount of pressure from investors to produce results and has little if any margin for error. That its employees are held to similarly exacting standards should hardly be surprising.

It’s All About Perception

After driving for three years and writing this weekly roundup for almost two years, I’ve learned one thing about Uber (and Lyft) – it’s all about perception. They’ve dubbed 2017 the year of the driver to create the perception that the company cares about drivers. So too it is with ratings. If a passenger gets a 4.8 rated driver, that sure seems a lot better than if they get a 3.8 driver. The rating system is all about trust.

The guys who started this whole rideshare thing recognized this at the beginning. They had to address one fundamental question – how are we going to create sufficient trust to convince people to get into cars with complete strangers? When you think about that, it’s pretty darn amazing that people do that now without a second thought. The reason is simple: the rating system creates the perception that the platform is safe, for drivers and passengers. And if both are in the habit of giving the other 5 star ratings, so much the better. But that still doesn’t make the system fair.

Uber vs Lyft Ratings System

There are minor differences between the Uber and Lyft rating systems but they’re essentially the same. Both are five star systems, and both put the passenger in the dominant position. The big difference is that Uber averages your last 500 rides, while Lyft only includes your 100 most recent rides.

In effect, the Uber approach gives drivers marginally more peace of mind. It’s not quite tenure as a driver, but at least you know that one or two one star ratings won’t break you. But apparently that’s about to change (in some cities at least).

According to Christian Perea, my colleague at RSG, Uber is testing a new approach, using (like Lyft) the last 100 rides in certain markets. Christian writes, “they will probably test this at first in a few markets before deploying it elsewhere. Using the rolling 100 instead of 500 tends to be more favorable on ratings in my experience, and Uber has an interest in having drivers with ‘higher’ ratings, so my bet is that this gets rolled out pretty quickly.”
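To put the 100-versus-500 difference in numbers, here is a rough Python sketch. The function names are invented for illustration, and it assumes each company takes a plain average over the most recent rated rides, which is a simplification. It starts a driver at a perfect 5.0 and measures how far a single one star rating pulls the average down.

```python
from collections import deque


def rolling_average(ratings, window):
    """Average of the most recent `window` ratings (hypothetical helper)."""
    recent = deque(ratings, maxlen=window)  # deque keeps only the newest entries
    return sum(recent) / len(recent)


def drop_from_one_star(window, baseline=5):
    """How far one 1 star rating lowers an otherwise perfect rolling average."""
    before = [baseline] * window
    after = before + [1]  # the oldest rating falls out of the window
    return rolling_average(before, window) - rolling_average(after, window)


for window in (100, 500):
    print(window, round(drop_from_one_star(window), 3))
```

Under these assumptions a one star rating costs about 0.04 with a 100 ride window and about 0.008 with a 500 ride window. The flip side is that the bad rating also ages out of a 100 ride window five times sooner.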

Why Ratings in the First Place?

There are several reasons for ratings, but I’m not sure any of them are truly persuasive. Yes, ratings do put a kind of fear in the drivers and keep them in line. That’s the intention, but is that fear factor really necessary as a motivational tool? Do we have any examples of systems that don’t use ratings? What would happen without ratings? Would Uber’s level of service suddenly devolve, a downward spiral ultimately bottoming out somewhere near taxis?

There is at least one example of a system that doesn’t use ratings and has done reasonably well. When Uber/Lyft pulled out of Austin, several startups filled the void, including Ride Austin. It doesn’t use ratings, and it works just fine. One driver in Austin says he no longer stresses about whether his Uber rating is high enough, or if he’ll have enough money for his three kids. He likes the nonprofit aspect of Ride Austin, the service he spends most of his time driving for.

What Can Uber / Lyft do to Improve the Current Rating System?

With all the negative press of late, Uber appears to be in soul searching mode, and this just might be the right time to re-consider ratings. With minimal investment, they could curry favor with drivers by fixing a seriously flawed system:

1. Provide more feedback as part of the rating system.

2. Educate passengers – it’s a simple matter. On the screen that appears before the five star rating option comes up, they could have a message to the effect of: “If a driver’s average rating is 4.6 or lower, they are in danger of being deactivated,” or “If you rate a driver 4 stars or lower, you are making a statement that this driver should not be driving on our platform.”

3. Adjust the algorithms to factor in a passenger’s tendencies on ratings. This is potentially the most impactful thing Uber can do to make the system more fair. I got the idea for this on a recent ride I gave to a data engineer in San Francisco.

We were talking about the rating system and then he asked how the system might be improved. I said, “Uber has the data on passenger ratings. Let’s say they keep a running average of all the ratings a passenger has given drivers and one passenger comes out at 4.0. Would it be possible to factor that passenger’s rating bias into the driver’s rating?”

“Easy,” he said. “Just include the differential from the mean …”

“Explain,” I said.

“Okay: if a particular passenger is in the habit of rating drivers 4.0 and they rate a particular driver a 4.0, then that has no effect on the driver’s overall rating average. If that same passenger rates a driver a 5.0, then the driver gets a +1 boost. If they give the driver a 3.0, they get a -1 detraction and so on.”

“Genius,” I said. “That way the system knows which passengers typically rate harshly and accounts for it automatically.”

“Yes,” he said. “It would also give greater weight to a passenger who almost always rates drivers 5.0 and suddenly gives a 4 or a 3.”

“Would that be easy to do?” I asked.

“Really easy,” he replied. “Piece of cake.”

Yep, I thought, that way Uber could eat their cake and have it too.
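Here is a rough Python sketch of what the engineer described, just to show how little code it takes. Everything in it is an assumption for illustration: the function names, the passenger IDs, and the choice to average the differentials into a driver score. He only described the differential itself, and nothing here reflects how Uber or Lyft actually store or compute ratings.

```python
from statistics import mean


def passenger_means(history):
    """Average rating each passenger has given in the past.

    `history` maps a hypothetical passenger id to the list of
    ratings that passenger gave on earlier rides.
    """
    return {pid: mean(given) for pid, given in history.items()}


def differential(rating, passenger_mean):
    """Rating relative to what this passenger usually gives."""
    return rating - passenger_mean


def driver_score(ratings_received, means):
    """Average differential over a driver's rides (an assumed way to combine them).

    `ratings_received` is a list of (passenger_id, rating) pairs.
    Zero means the driver was rated exactly as each passenger usually rates.
    """
    return mean([differential(r, means[pid]) for pid, r in ratings_received])


means = passenger_means({"harsh": [4, 4, 4, 4], "generous": [5, 5, 5, 5]})
print(differential(4, means["harsh"]))     # habitual 4 star rater gives a 4: no effect
print(differential(5, means["harsh"]))     # same rater gives a 5: +1 boost
print(differential(3, means["generous"]))  # habitual 5 star rater gives a 3: -2
```

Notice that the last case falls out on its own: a 3 from a passenger who normally gives 5 counts for twice as much as a 3 from a passenger who normally gives 4, which is the extra weight he mentioned.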

Readers, what do you think about these ideas for improving the rating system? What suggestions do you have for Uber and Lyft to improve their ratings systems, or do you think the system is fair as is?

-John @ RSG