How to Build a 5 Star Rating System with Wilson Interval in MySQL

How to Build a 5 star Rating System with Wilson Interval in MySQL

In this article I’m going to explain how to build a 5 star rating system with MySQL. We will be using triggers to make the queries lightweight and fast. Triggers are supported in MySQL since version 5.0.2 so make sure you’re up to date. I will not explain how to implement the results in PHP, Ruby or whatever floats your boat. You can find plenty of tutorials on the web about that so no need to write another article. This article is based on Evan Miller’s tutorial ‘How Not To Sort By Average Rating‘.

The Problem

There are a lot of rating systems available and many of them are wrong. The majority of rating tutorials you’ll find on the web take the average rating, without taking the total number of votes into account. But it won’t stop there. Average rating systems can be found anywhere on the web. Even on big websites like Amazon. So why is it wrong?

Suppose you have 10 products on your website. Product 1 has a total of 10 votes, where 9 are positive and 1 is negative. With an average five star rating system that rating would be (9 / 10) * 5 = 4.5 stars. Now you have another product with 1 vote, which is positive. That would mean the product would have (1 / 1) * 5 = stars. So product 2 would rank higher than product 1, even though product 1 has 9 positive votes and only 1 negative vote.

The Solution

The solution to this problem is normal approximation interval. Wikipedia tells us:

“The most commonly used formula for a binomial confidence interval relies on approximating the distribution of error about a binomially-distributed observation, with a normal distribution… The Wilson interval is an improvement (the actual coverage probability is closer to the nominal value) over the normal approximation interval and was first developed by Edwin Bidwell Wilson (1927)”

Say what? In case you’re interested, Edwin Bidwell Wilson came up with a formula that would solve all our problems and it looks like this:

Wilson Score Interval Formula

Evan Miller explains in his article: “Score = Lower bound of Wilson score confidence interval for a Bernoulli parameter. We need to balance the proportion of positive ratings with the uncertainty of a small number of observations.”

Now, we don’t need to dig deeper in this matter because Evan Miller came up with this SQL query:

      ((positive + 1.9208) / (positive + negative) -
      1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) /
      (positive + negative)) / (1 + 3.8416 / (positive + negative))
AS ci_lower_bound FROM widgets WHERE positive + negative > 0
ORDER BY ci_lower_bound DESC;

What this query basically does, is give a rating a lower bound score. But let’s get back to our products. For example, with 9 positive votes and 1 negative vote this query would result in a lower bound of 0.595844. 1 positive and 0 negative votes would result in lower bound of 0.206543, which is lower. That means our first product would rank higher than our second and that’s exactly what we want. We’re going to use this query later in our triggers.


With a 5 star system there are more options than just positive or negative. Every time someone votes, 1 vote is added to the total votes. So we have to spread that vote over the positive and negative votes. Assume that 3 stars is not good and not bad either. So we take that as a middle. This means if someone votes 3 stars for our product, 0.5 goes to positive and 0.5 goes to negative.

Stars Negative Positive Total
0 0 0 0
1 1 0 1
2 0.75 0.25 1
3 0.5 0.5 1
4 0.25 0.75 1
5 0 1 1

Enough with the theory. We’re going to write the queries in SQL.

The Table

Now let’s create a table with the following columns: id, product_id, positive, negative, stars, total and lower_bound. In this example stars can have a half value. So 3,5 stars is possible. Total , product_id and id will be an integer, the others will be a float. We can achieve this with the following query:

CREATE TABLE `ratings` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `product_id` int(11) DEFAULT NULL,
  `positive` float NOT NULL DEFAULT 0,
  `negative` float NOT NULL DEFAULT 0,
  `stars` float DEFAULT 0,
  `total` int(11) NOT NULL,
  `lower_bound` float DEFAULT NULL,
  PRIMARY KEY (`id`)

Alright, done.


Triggers in MySQL are very useful when a defined action is executed for the table. The trigger can be executed when you run one of the following MySQL statements on the table: INSERT, UPDATE and DELETE. We want to calculate stars, total and lower_bound as soon as a rating is inserted or updated. Yeah, triggers are your friend! This is what the insert trigger looks like:

    TRIGGER `insert_rating` BEFORE INSERT ON `ratings`
           SET = new.positive + new.negative,
     new.stars = ROUND( (((new.positive / * 4) + 1) * 2, 0) / 2,
     new.lower_bound = ((new.positive + 1.9208) / (new.positive + new.negative) - 1.96 * SQRT((new.positive * new.negative) / (new.positive + new.negative) + 0.9604) / (new.positive + new.negative)) / (1 + 3.8416 / (new.positive + new.negative))

Every time a rating is inserted total, stars and lower_bound will be calculated. The stars are rounded to half decimals. Next is the update trigger which is almost the same as the insert trigger:

    TRIGGER `update_rating` BEFORE UPDATE ON `ratings`
      SET = new.positive + new.negative,
      new.stars = ROUND( (((new.positive / * 4) + 1) * 2, 0) / 2,
      new.lower_bound = ((new.positive + 1.9208) / (new.positive + new.negative) - 1.96 * SQRT((new.positive * new.negative) / (new.positive + new.negative) + 0.9604) / (new.positive + new.negative)) / (1 + 3.8416 / (new.positive + new.negative))

We have created the triggers and now we are ready to test them.

INSERT into ratings (product_id, positive, negative) VALUES (1, 0, 0)

You should see the calculated total, stars and lower_bound rows which are 0 or NULL. Now we’re going to add the first rating. Someone rated 4 stars for our product number 1. So we’re going to run the following query:

UPDATE ratings SET negative = negative + 0.25, positive = positive + 0.75 WHERE id

And last but not least, to get the list of most popular products:

SELECT * FROM ratings ORDER BY lower_bound DESC


That’s it for this tutorial. Now you have a rating table in SQL with the Wilson score confidence interval. But there are a lot of ways to extend this tutorial. You might want to store the IP addresses in another table to make sure people only vote once.

Guido loves music, traveling, software development and tomato soup. He is currently working on a cool project called Spottocamp which is expected to go live in February. Stay tuned!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.


Iconfinder Coupon Code and Review

Iconfinder offers over 1.5 million beautiful icons for creative professionals to use in websites, apps, and printed publications. Whatever your project, you’re sure to find an icon or icon…

WP Engine Coupon

Considered by many to be the best managed hosting for WordPress out there, WP Engine offers superior technology and customer support in order to keep your WordPress sites secure…

InMotion Hosting Coupon Code

InMotion Hosting has been a top rated CNET hosting company for over 14 years so you know you’ll be getting good service and won’t be risking your hosting company…

SiteGround Coupon: 60% OFF

SiteGround offers a number of hosting solutions and services for including shared hosting, cloud hosting, dedicated servers, reseller hosting, enterprise hosting, and WordPress and Joomla specific hosting.