Yelp's Academic Dataset consists of 152,327 reviews of 6,900 businesses by 65,888 users. These businesses span 433 categories that are not mutually exclusive, ranging from "Accessories" (52 companies) to "Zoos" (1 company). A little background on the dataset: it primarily covers businesses near 30 schools across the United States.

Are you more likely to post a review on Yelp on certain days of the week?

The academic dataset pulled reviews from users who frequented restaurants near 30 US schools. Naively, you might attempt a frequency count of the days on which reviews are posted, as shown:

Figure (weekday of reviews): Frequency of weekdays for reviews in the Yelp Academic Dataset
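A naive count like this takes only a few lines. The following is a minimal sketch, not the script used for this post: it assumes each review date is available as a "YYYY-MM-DD" string, and the function name weekday_counts and the sample dates are purely illustrative.

```python
import calendar
from collections import Counter


def weekday_counts(date_strings):
    """Count reviews per weekday from 'YYYY-MM-DD' date strings (assumed format)."""
    counts = Counter()
    for text in date_strings:
        year, month, day = (int(part) for part in text.split("-"))
        # calendar.weekday returns 0 for Monday through 6 for Sunday.
        counts[calendar.day_name[calendar.weekday(year, month, day)]] += 1
    return counts


# Hypothetical dates for illustration only, not taken from the dataset.
sample_dates = ["2011-01-03", "2011-01-04", "2011-01-10", "2011-01-16"]
counts = weekday_counts(sample_dates)
for name in calendar.day_name:
    print(name, counts[name])
```

The resulting counts can be handed straight to a bar chart, with the days kept in calendar order rather than sorted by frequency.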

This would suggest that users tend to post towards the beginning of the week, drop off by Saturday, and pick up slightly on Sunday. Whether this result is truly representative of the total population of Yelp reviewers is debatable. It is an accurate picture of this dataset, that is, of the users who reviewed these restaurants near a set of universities and colleges. That leaves us susceptible to many confounding variables. What if some locations are represented more than others? What about power users (those who make dozens, if not hundreds, of reviews)? To minimize these effects, we must take a random subset of the data.

Something interesting is revealed when you look at the total number of reviews made per user:

Figure (total reviews for each user): Total number of reviews (across Yelp) for each user in the Academic Dataset. Most reviewers make a few reviews; a few reviewers make most reviews

Out of nearly 66,000 users, approximately 46,000 (or 70%) account for only about 16.6% of the reviews. This is an instance of the so-called "80-20" rule associated with Pareto distributions: the vast majority of the total reviews come from a small group of users. I can quickly check its adherence to a power law with a simple log-log plot:

Figure (total reviews per user): A log-log plot of total reviews per user is consistent with a power-law distribution. Its scale invariance undermines most standard statistical theory when applied to the entire dataset
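Both checks, the share of reviews written by the least active users and the straight line on log-log axes, can be sketched in plain Python. The example below runs on synthetic heavy-tailed counts generated with random.paretovariate, not on the Yelp data, and the function names are illustrative. A least-squares slope on raw log-log frequencies is only a rough visual-style check, not a rigorous power-law fit.

```python
import math
import random
from collections import Counter


def share_from_least_active(review_counts, user_fraction):
    """Share of all reviews written by the least active `user_fraction` of users."""
    ordered = sorted(review_counts)
    cutoff = int(len(ordered) * user_fraction)
    return sum(ordered[:cutoff]) / sum(ordered)


def log_log_slope(review_counts):
    """Least-squares slope of log(number of users) against log(review count)."""
    frequency = Counter(review_counts)
    xs = [math.log(count) for count in frequency]
    ys = [math.log(users) for users in frequency.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic heavy-tailed review counts (NOT the Yelp data), each at least 1.
rng = random.Random(0)
synthetic_counts = [int(rng.paretovariate(1.2)) for _ in range(10000)]

share = share_from_least_active(synthetic_counts, 0.70)
slope = log_log_slope(synthetic_counts)
print(f"least active 70% of users wrote {share:.1%} of reviews")
print(f"log-log slope: {slope:.2f}")
```

A roughly linear, downward-sloping relationship on these axes is what a power law would produce; a share well below 70% for the least active 70% of users is the same concentration seen from the other side.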

For heavy-tailed power-law distributions, sample estimates of expectation values such as the mean can keep shifting with sample size instead of settling down, which throws much of standard statistical theory out of the window. At this point, the best approach to answering the question is to break the data into segments:

  1. Occasional Reviewers: Reviews from users with fewer than 100 total reviews (~54,000 users)
  2. Power Reviewers: Reviews from users with more than 500 total reviews (~900 users)
  3. Moderate Reviewers: Reviews from users with a total review count in between (~11,000 users)
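The segmentation itself is a simple threshold rule, followed by a random draw of users within each segment. The sketch below is illustrative only: the names segment and sample_users, the user ids, and the choice to treat exactly 100 and exactly 500 reviews as "moderate" are assumptions, since the list above only says "in between".

```python
import random


def segment(total_reviews):
    """Label a user by total review count, using the thresholds listed above."""
    if total_reviews < 100:
        return "occasional"
    if total_reviews > 500:
        return "power"
    return "moderate"  # assumption: 100 and 500 themselves count as moderate


def sample_users(user_ids, size, seed=0):
    """Draw up to `size` users uniformly at random, without replacement."""
    user_ids = list(user_ids)
    return random.Random(seed).sample(user_ids, min(size, len(user_ids)))


# Hypothetical users mapping user id -> total review count.
users = {"u1": 3, "u2": 99, "u3": 100, "u4": 500, "u5": 501, "u6": 2400}

segments = {}
for user_id, total in users.items():
    segments.setdefault(segment(total), []).append(user_id)

# One fixed-size random sample per segment (size is a free parameter here).
picked = {name: sample_users(ids, size=2) for name, ids in segments.items()}
print(segments)
print(picked)
```

Fixing the seed keeps the sample reproducible between runs, which makes the downstream counts easier to check.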

We choose a sample size of 500 for each of our user segments. This was chosen to satisfy a 95% confidence level at a margin of error corresponding to 3, 10, and 30 reviews for occasional, moderate, and power reviewers, respectively. Plotting the results by day reveals the following distributions:

Figure (frequency of weekdays): Frequency of weekdays across each reviewer segment shows power users being particularly heavy Monday reviewers, while occasional reviewers do not adhere to any particular day of the week

This visual suggests the following: power users of Yelp review most on Mondays, remain steady Tuesday through Thursday, and drop off significantly when the weekend rolls around. Moderate users review most Sunday through Tuesday. Occasional users remain relatively steady throughout the week.

How much of this is due to chance?

Let's establish the following null hypothesis:

There is no relationship between the day of the week and how likely a user is to post a review.

In other words, I will test whether P(Monday) = P(Tuesday) = P(Wednesday) = P(Thursday) = P(Friday) = P(Saturday) = P(Sunday) against the alternative hypothesis that at least two of these proportions differ. I will either reject or fail to reject the null hypothesis at the 5% significance level. For this question, I will only look at occasional and power users, to keep things simple.

Here is a table of the observed and expected values for the power user sample set:

Day          Observed (Power)   Expected (Power)
Monday       756                583
Tuesday      597                583
Wednesday    607                583
Thursday     604                583
Friday       504                583
Saturday     482                583
Sunday       531                583

Observed and expected reviews of power users in sample set (size = 500, margin-of-error = 30 reviews)

The chi-square statistic for this test is 86.26. Looking at a standard table of critical values (with 6 degrees of freedom and an alpha of 0.05), we see that the chi-square critical value is 12.592. Because 86.26 > 12.592, we can safely reject the null hypothesis and conclude that the observed variation in reviews per day for power users is unlikely to have happened by chance.
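The statistic can be reproduced from the table in a few lines. This is a minimal sketch rather than the script behind the post: the function name chi_square_statistic is illustrative, and the critical value is simply the 12.592 quoted above, not something the code derives.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed reviews for power users, Monday through Sunday (from the table).
power_observed = [756, 597, 607, 604, 504, 482, 531]

# Under the null hypothesis every day gets the same share of the total.
expected_per_day = sum(power_observed) / len(power_observed)
power_expected = [expected_per_day] * len(power_observed)

statistic = chi_square_statistic(power_observed, power_expected)

# Critical value for 6 degrees of freedom at alpha = 0.05, as quoted in the text.
CRITICAL_VALUE = 12.592
reject_null = statistic > CRITICAL_VALUE

print(round(statistic, 2), reject_null)  # prints: 86.26 True
```

The degrees of freedom are 6 because there are 7 day categories and the expected counts are constrained to add up to the observed total.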

Here is a table of observed and expected values for occasional users:

Day          Observed (Occ.)    Expected (Occ.)
Monday       139                128
Tuesday      153                128
Wednesday    122                128
Thursday     145                128
Friday       96                 128
Saturday     121                128
Sunday       126                128

Observed and expected reviews of occasional users in sample set (size = 500, margin-of-error = 3 reviews)

The chi-square statistic for this test is 16.78. Because 16.78 > 12.592, we can also reject the null hypothesis and conclude, at the 5% significance level, that the observed variation in reviews per day for occasional users is unlikely to have happened by chance.
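One detail is worth checking here: the occasional-user counts add up to 902, so the unrounded expected count is 902 / 7, or about 128.9, rather than the 128 shown in the table. The sketch below (illustrative names, same hard-coded critical value of 12.592 as above) computes the statistic both ways to see whether the rounding affects the conclusion.

```python
def chi_square_uniform(observed, expected_per_day):
    """Chi-square statistic against the same expected count for every day."""
    return sum((o - expected_per_day) ** 2 / expected_per_day for o in observed)


# Observed reviews for occasional users, Monday through Sunday (from the table).
occasional_observed = [139, 153, 122, 145, 96, 121, 126]

# Expected count as printed in the table versus the unrounded total / 7.
tabulated = chi_square_uniform(occasional_observed, 128)
exact = chi_square_uniform(occasional_observed, sum(occasional_observed) / 7)

CRITICAL_VALUE = 12.592  # 6 degrees of freedom, alpha = 0.05, as quoted above

print(round(tabulated, 2), tabulated > CRITICAL_VALUE)
print(round(exact, 2), exact > CRITICAL_VALUE)
```

Using the tabulated 128 gives the 16.78 reported above, and the unrounded expected count gives roughly 16.63. Both exceed 12.592, so the decision to reject the null hypothesis does not hinge on the rounding.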


Power users of Yelp (those who have made more than 500 reviews) are more likely to post their reviews on Mondays. There is a significant drop-off in reviews when the weekend rolls around; perhaps they are "collecting their data".

Occasional users of Yelp (those who have made fewer than 100 reviews) do not have a specific day that they strongly favor. Although the test rules out a perfectly even spread across the week, the deviations are modest, the largest being a drop-off on Fridays. In other words, they post whenever is convenient for them, not being particularly driven to start their work-week with a round of reviews.


Data Munging

The data is delivered as a text file with one JSON object per line, each containing an identifier attribute that makes more complex joins possible (for example, "What days are 3+ star reviews posted for Persian restaurants?"). I used the following modules in my script: json (to help parse the dataset), calendar (to convert the dates to specific days of the week), and matplotlib (for plotting).
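A minimal sketch of that parsing and joining step might look like the following. It is not the original script: the field names ("type", "user_id", "review_count", "date", "stars", "business_id") and the sample records are assumptions made for illustration, so check them against the actual file before relying on them.

```python
import calendar
import io
import json
from collections import defaultdict


def load_records(lines):
    """Parse one JSON object per line and group the records by their 'type'."""
    by_type = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        by_type[record["type"]].append(record)
    return by_type


def weekday_name(date_string):
    """Convert a 'YYYY-MM-DD' string (assumed format) to a weekday name."""
    year, month, day = (int(part) for part in date_string.split("-"))
    return calendar.day_name[calendar.weekday(year, month, day)]


# Hypothetical stand-in for the dataset file; a real run would pass open(path).
sample = io.StringIO(
    '{"type": "user", "user_id": "u1", "review_count": 612}\n'
    '{"type": "review", "user_id": "u1", "business_id": "b1",'
    ' "date": "2011-01-03", "stars": 4}\n'
)

records = load_records(sample)

# Join each review to its author through the shared user_id attribute.
users = {user["user_id"]: user for user in records["user"]}
joined = [
    (weekday_name(review["date"]), users[review["user_id"]]["review_count"])
    for review in records["review"]
]
print(joined)
```

Indexing users by their identifier once, then looking each review up in that dictionary, keeps the join linear in the number of records; the same pattern works for joining reviews to businesses.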

