In about 5th grade, American children are introduced to intermediate mathematical concepts and learn more than just addition, subtraction, multiplication, and division. This is when we first learn about the basic concepts of the “measures of central tendency” known as the mean, median, and mode.
These summary statistics have the ability to take a large amount of data and boil it all down to one number, trying to get as close to the “center” of the data as possible. It’s much easier to represent data using one number rather than trying to report on every data point in a large dataset. For example, perhaps your marketing team wants to showcase how the age of your customers differs from the competition, or that the deposits of customers tend to be higher for this product over another. To accomplish this, a single number is preferred for comparison’s sake. The most common type of calculation used in these cases is the average, also known as the mean.
But why is the average/mean the preferred metric? Unbeknownst to us, curriculum standards created a subconscious, shared bias towards using the average, even though it’s rarely the best way to represent large datasets. This is where, without our consent, the bias against using the Median began, and we, the Kasasa Analytics Research Team, are here to explain why the Median deserves another chance.
We’ll use characters to help keep the discussion from getting too abstract... or sleep-inducing.
Yet, the love affair with the Mean has spread throughout business articles and industry findings, all to the chagrin of data analysts. Countless claims start with the words “on average,” and few people ever really second-guess their accuracy. While, on the surface, the Mean seems like a natural front-runner for the “Most Likely to Succeed” title, it can actually be quite flawed, misleading, and easily swayed by outliers. It’s not so perfect after all and can even be a bit of a bully, pushing other more useful metrics out of the spotlight.
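To see how easily the Mean gets swayed, here is a minimal Python sketch using the standard library's statistics module. The deposit amounts are made up purely for illustration; they are not real customer data.

```python
from statistics import mean, median

# Hypothetical deposit amounts in dollars (invented for this example).
typical_deposits = [100, 120, 130, 150, 160]

# The same customers plus one unusually large deposit (the outlier).
with_outlier = typical_deposits + [10_000]

for label, deposits in [("typical", typical_deposits),
                        ("with outlier", with_outlier)]:
    print(f"{label:>12}: mean = {mean(deposits):,.2f}, "
          f"median = {median(deposits):,.2f}")
```

With these made-up numbers, one large deposit pulls the Mean from 132 to roughly 1,777, while the Median only moves from 130 to 140.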
Maybe it’s time to consider our other options.
But, if you focus on its real value, you’ll see that the Median is in perfect balance: half of the data falls on one side of the Median and the other half falls on the other side. It’s never dragged down by outlying data points in the same way as the Mean; in fact, the Median best summarizes datasets that are negatively or positively skewed.
The more skewed the distribution, the greater the difference between the Median and the Mean, and the more valuable the Median becomes. And when the data is normally distributed, or symmetrical, the Mean begins to mimic the Median! (So does the Mode, but that guy’s weird.) The Median stays honest and true to the data, no matter how unbalanced the other two guys get.
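A small Python sketch can make this concrete. The three datasets below are invented for illustration, and mean_median_gap is a hypothetical helper name, not something from a library. Each dataset differs only in its largest value, so the skew grows from one to the next.

```python
from statistics import mean, median, mode

# Invented datasets: identical except for the largest value.
datasets = {
    "symmetric":   [1, 2, 2, 3, 3, 3, 4, 4, 5],
    "skewed":      [1, 2, 2, 3, 3, 3, 4, 4, 14],
    "more skewed": [1, 2, 2, 3, 3, 3, 4, 4, 50],
}

def mean_median_gap(values):
    """Hypothetical helper: how far the Mean sits from the Median."""
    return mean(values) - median(values)

for name, values in datasets.items():
    print(f"{name:>11}: mean = {mean(values)}, median = {median(values)}, "
          f"mode = {mode(values)}, gap = {mean_median_gap(values)}")
```

In the symmetric dataset all three measures land on 3. As the largest value grows, the Median and the Mode stay at 3 while the Mean climbs to 4 and then 8, so the gap widens from 0 to 1 to 5.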
This guy seems legit.
We challenge you to make the Median the norm in your workplace. It may feel cumbersome to start using the Median in your presentations, especially since it may require a lengthy explanation. But we at least wanted to explain to you why data analysts trust the Median, and how it differs from other numbers you are more familiar with. You can start your campaign to bring back the Median by sharing this article with your co-workers!
We do sincerely hope we’ve convinced you not to overlook the Median. It’s got a lot to offer and is eager to help out… if you’ll just give it a chance.
Note: Icons created by Ester Barbato from the Noun Project.