In his famous book ‘Outliers’, Malcolm Gladwell exposes a rather disturbing fact about the world of professional hockey: according to him, babies born in the first months of the year (i.e. January, February or March) are more likely to reach the top divisions than those born later in the year. However convincing the author’s subsequent explanation of the reasons behind this surprising finding, the cold, hard data offered to the reader is too limited for anyone to feel confident in accepting his claims. Indeed, Gladwell only presents birthday data for a single team, the Medicine Hat Tigers, during a single year: 2007.
Because I have always valued Gladwell's work, I wanted to verify whether his assertion would withstand the test of big data. I thus went ahead and collected from the web, specifically from the website hockey-reference.com, 7411 observations: the birth dates of as many hockey players from the Canadian league. I then formed a data frame linking each of the 366 days of the year to the number of hockey babies born on that day. Once I had gathered and formatted all that data, I was able to produce the following visualizations:
The results are quite stunning: the number of hockey players born each day decreases dramatically as we move along the x-axis. January and December are respectively the months with the highest and lowest number of births: 765 babies were scheduled along with the New Year, while only 488 were greedily delivered by Santa Claus.
It thus seems that Malcolm Gladwell had good instincts on the issue at hand: the data suggest that if you happen to be Canadian and want your baby to go on to become a great hockey player, you are probably better off skipping Valentine's Day celebrations and reserving your passions for Easter.
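For readers who want to try something similar, here is a minimal sketch of the tabulation described above: one row per calendar day (366 of them, so that 29 February is covered), each linked to the number of players born on that day, plus a roll-up by month. It is not the code used for this post. The function names `births_per_day` and `births_per_month` are made up for illustration, the sample dates are invented rather than taken from hockey-reference.com, and it assumes the birth dates have already been collected as Python `date` objects.

```python
from collections import Counter
from datetime import date, timedelta


def births_per_day(birth_dates):
    """Return [((month, day), count), ...] covering all 366 calendar days."""
    counts = Counter((d.month, d.day) for d in birth_dates)
    # Walk through a leap year (2000) so that 29 February gets its own row.
    start = date(2000, 1, 1)
    days = [start + timedelta(days=i) for i in range(366)]
    return [((d.month, d.day), counts.get((d.month, d.day), 0)) for d in days]


def births_per_month(table):
    """Sum the daily counts into 12 monthly totals (index 0 is January)."""
    totals = [0] * 12
    for (month, _day), n in table:
        totals[month - 1] += n
    return totals


# Invented sample birth dates, for illustration only (not the scraped data).
sample = [
    date(1985, 1, 3),
    date(1990, 1, 3),
    date(1988, 2, 29),
    date(1979, 12, 25),
]

table = births_per_day(sample)
monthly = births_per_month(table)
print(monthly)
```

Keying each row on the (month, day) pair rather than on the day-of-year number keeps leap-year and non-leap-year birthdays aligned: 1 March always lands on the same row, whatever the birth year.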























