How It Works

Flat Average

As stated in the introduction, Movie Hurl is a system for predicting which movies will give you motion-sickness (or movie-sickness). The simplest way to do this would be a straightforward survey in which people assign a score to a given movie (Movie Hurl uses a number between one and four), and that movie's rating, or prediction, is simply the average of the scores assigned to it. However, this isn't necessarily a very useful prediction. For example, if Alice knows in advance that she is prone to movie-sickness and knows that Bob never gets sick in movies, then it doesn't mean much to her if Bob tells her he didn't get sick watching a movie. She still fears that she may get sick, regardless of Bob's safe prediction.

Weighted Average, Movie Hurl's Generic Rating

The solution is to use a weighted average instead of a flat average. Every person rates a movie in the range one to four, but also provides a personal susceptibility to getting sick in the first place, also in the range one to four. Suppose Alice asks Bob and Carl if they got sick watching a movie. Bob never gets sick, so he rates the movie a one (it didn't make him sick) along with a susceptibility of one (stating that he doesn't tend to get sick in movies in the first place). Carl is just like Alice, though: he gets sick in movies all the time. He rates the movie a four and also states his susceptibility as a four. Without any weights, the average rating for the movie is (1+4)/2 = 2.5. Using the personal susceptibilities as weights, and dividing by the sum of the weights, the rating becomes (1×1 + 4×4)/(1+4) = 17/5 = 3.4. This weighted average is a better indicator to Alice of her likely response to the movie.

Movie Hurl's generic rating for a movie works exactly like this. You will see a rating between one and four stars (rounded to the nearest integer), accompanied by a small number in parentheses. That number is the count of people who have provided a generic rating for the movie, and it gives some indication of how precise the prediction is likely to be. Bottom line: the higher the number in parentheses, the more certain you can be that the generic rating accurately describes the weighted tendency of the movie to make people sick.
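The arithmetic in the Bob and Carl example can be sketched in a few lines of Python. This is only an illustration of the weighted average described above, not Movie Hurl's actual code; the function names and the (score, susceptibility) tuple layout are assumptions made for the example.

```python
def flat_average(ratings):
    """Plain mean of the scores, ignoring susceptibility."""
    scores = [score for score, _ in ratings]
    return sum(scores) / len(scores)


def generic_rating(ratings):
    """Mean of the scores, weighted by each rater's stated susceptibility.

    `ratings` is a list of (score, susceptibility) pairs, both in 1..4.
    (Hypothetical data layout, chosen for this sketch.)
    """
    total_weight = sum(susceptibility for _, susceptibility in ratings)
    weighted_sum = sum(score * susceptibility for score, susceptibility in ratings)
    return weighted_sum / total_weight


# Bob: score 1, susceptibility 1. Carl: score 4, susceptibility 4.
ratings = [(1, 1), (4, 4)]
print(flat_average(ratings))    # 2.5
print(generic_rating(ratings))  # 3.4
print(len(ratings))             # 2, the number shown in parentheses
```

Note that if every rater states the same susceptibility, the weights cancel and the two functions return the same value.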

There are several problems with generic ratings, all of which stem from the fact that they are completely unregulated. For example, maybe Bob really wants his rating to count and mistakes the weight associated with his susceptibility for an indication of his personal ego or importance, rather than a simple statistical indicator. Bob might then state a higher susceptibility than he really should. In fact, one can imagine that all susceptibilities might converge heavily on four for this reason, at which point every rating carries the same weight, the weighting is completely negated, and we are back to a flat average again. Or consider a second possibility: Bob might rate the same movie twice, either out of intentional fraud or by honestly forgetting that he previously rated a given movie.

These are both serious problems, and Movie Hurl really can't prevent them from occurring in the generic ratings (personal ratings don't suffer from these problems, as described below). Generic ratings are therefore of limited usefulness: they might provide some vague indication of a movie's tendency to cause movie-sickness, but they might also be badly distorted by these problems.

Generic Rating Matrix

In addition to the weighted average generic rating, Movie Hurl also provides a generic rating matrix which shows a histogram of the ratings a movie has received relative to various people's susceptibilities to movie-sickness. This matrix is explained in detail at the bottom of the legend.
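One way to picture such a matrix is as a four-by-four grid of counts, one cell per combination of susceptibility and rating. The sketch below is a guess at the general idea only: the orientation (rows for susceptibility, columns for rating), the function name, and the sample data are all assumptions, and the legend remains the authoritative description of what Movie Hurl actually displays.

```python
from collections import Counter


def rating_matrix(ratings):
    """Count (score, susceptibility) pairs into a 4x4 grid.

    Rows are susceptibility 1..4 and columns are score 1..4.
    (Assumed orientation; Movie Hurl's legend defines the real layout.)
    """
    counts = Counter((susceptibility, score) for score, susceptibility in ratings)
    return [[counts[(s, r)] for r in range(1, 5)] for s in range(1, 5)]


# Hypothetical ratings as (score, susceptibility) pairs.
sample = [(1, 1), (4, 4), (3, 4)]
for row in rating_matrix(sample):
    print(row)
```

Read this way, a movie whose high scores all sit in the high-susceptibility rows tells a different story from one that makes even low-susceptibility viewers sick.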

Person-to-Person Difference, Movie Hurl's Personalized Rating

Personalized ratings solve most of the problems associated with generic ratings by keeping track of an individual's rating history and then finding differences between one user's ratings and another's. There is no need for a person to enter a susceptibility to use as a weight, and therefore such a parameter is unavailable for fraud. In addition, by tracking an individual's history, multiple ratings for a single movie can be avoided.

Movie Hurl produces personalized ratings by using a huge matrix, with users labeled along both the rows and the columns, and with each cell storing the average difference between the ratings that a pair of users gave to the movies they have both rated. For example, let's assume Alice, Bob, and Carl have all seen and rated Transsiberian. Alice gave the movie a two, meaning it made her a little sick; Bob gave it a one, meaning he had no problem; and Carl gave it a three, meaning he got pretty sick. Their matrix looks like this:

From \ To    Alice    Bob    Carl
Alice                 -1     +1
Bob          +1              +2
Carl         -1       -2

To find Bob's rating from Alice's rating, read across Alice's row to Bob's column. Add the value found in that cell to Alice's rating to get Bob's rating. Alice gave the movie a two. Find her row and read across to Bob. That cell has a negative one. Add negative one to Alice's rating of two and you get one, which is Bob's rating.

If a pair of users has rated more than one movie, then the value in the matrix is their average difference in rating across all the movies they have both rated. The matrix now permits us to predict a rating for a movie that Alice hasn't rated yet. If Bob has rated The Constant Gardener and Alice hasn't, we can now predict Alice's reaction to that movie by applying the matrix to Bob's rating for The Constant Gardener.
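The matrix and the single-user prediction described above can be sketched as follows. This is a minimal illustration, not Movie Hurl's implementation: the dictionary layout and function name are assumptions, and Bob's rating of one for The Constant Gardener is an invented value used only to show the lookup.

```python
from itertools import permutations


def difference_matrix(ratings):
    """Build the user-to-user difference matrix.

    `ratings` maps user -> {movie: score}. The result maps (row_user, col_user)
    to (average of col_user's score minus row_user's score, shared movie count).
    Pairs with no movie in common get no cell.
    """
    matrix = {}
    for row_user, col_user in permutations(ratings, 2):
        shared = ratings[row_user].keys() & ratings[col_user].keys()
        if shared:
            total = sum(ratings[col_user][m] - ratings[row_user][m] for m in shared)
            matrix[(row_user, col_user)] = (total / len(shared), len(shared))
    return matrix


ratings = {
    "Alice": {"Transsiberian": 2},
    # Bob's score for The Constant Gardener is hypothetical.
    "Bob": {"Transsiberian": 1, "The Constant Gardener": 1},
    "Carl": {"Transsiberian": 3},
}
matrix = difference_matrix(ratings)
print(matrix[("Alice", "Bob")])  # (-1.0, 1)

# Predict Alice from Bob: read Bob's row, Alice's column, and add it to Bob's rating.
offset, shared_count = matrix[("Bob", "Alice")]
predicted = ratings["Bob"]["The Constant Gardener"] + offset
print(predicted)  # 2.0
```

Storing the shared-movie count next to each average keeps the information needed to judge how much a given cell should be trusted.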

To get the best possible prediction, we shouldn't use just one user to predict the rating for another user. We should use as many users as possible. If both Bob and Carl have already rated The Constant Gardener, and they both have some previous ratings in common with Alice so a correlation has been established between both of them and Alice, then we should try to use both of their ratings, in conjunction with the matrix, to predict Alice's rating. The simplest approach is to find the individual predictions based on the matrix and average them, but we can do even better than that.

What if Alice and Bob have rated more movies in common than Alice and Carl? In that case, we should trust the prediction based on Bob's rating and his matrix cell more than the one based on Carl's. Just like the generic rating, the solution here is to use a weighted average. In this case, the weight is the number of movies a given pair of users has previously rated in common. If Bob and Alice have previously rated two movies in common, and Carl and Alice have previously rated one, then Alice's prediction will be weighted toward Bob by a factor of two and toward Carl by a factor of one. This is how Movie Hurl's personalized ratings work. [Note: for some slightly complicated reasons, it might be better to use the square root of the number of movies a pair of users has in common as the weight instead, but Movie Hurl does not currently do this, and I doubt it matters too much.]
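As a rough sketch of this weighting, each contributing user supplies an individual prediction (their rating plus their matrix offset toward Alice), and those predictions are averaged using the shared-movie counts as weights. The function name, the tuple layout, and the sample numbers below are all hypothetical; the optional `weight` parameter is included only to show where the square-root variant mentioned in the note would plug in.

```python
import math


def personalized_rating(contributions, weight=lambda shared: shared):
    """Weighted average of per-user predictions.

    `contributions` is a list of (other_rating, offset, shared_count), where
    offset is the average (target minus other) difference from the matrix and
    shared_count is how many movies the pair has rated in common.
    Returns None when nobody can contribute.
    """
    total_weight = sum(weight(shared) for _, _, shared in contributions)
    if total_weight == 0:
        return None
    weighted_sum = sum(
        (rating + offset) * weight(shared) for rating, offset, shared in contributions
    )
    return weighted_sum / total_weight


# Hypothetical inputs for predicting Alice's rating of a new movie.
contributions = [
    (1, 1.0, 2),   # Bob: rated it 1, Alice averages +1 above him, 2 shared movies
    (4, -1.0, 1),  # Carl: rated it 4, Alice averages 1 below him, 1 shared movie
]
prediction = personalized_rating(contributions)
print(round(prediction, 2), f"({len(contributions)})")  # 2.33 (2)

# The square-root weighting from the note, which Movie Hurl does not currently use.
alternative = personalized_rating(contributions, weight=math.sqrt)
```

Here Bob's individual prediction is 2 and Carl's is 3, so the count-weighted result (2×2 + 3×1)/3 lands closer to Bob's.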

As with generic ratings, you will see a small number in parentheses next to the personalized rating. This number indicates how many other users contributed to the prediction, i.e., of all the users who have ever rated a movie in common with Alice, how many of them also rated the movie for which Alice's prediction is being made. It serves the same purpose as in the generic rating: to provide some indication of the variance and overall trustworthiness of the personalized rating.