With 102 glorious gold medals on offer across 15 sports, athletes gathered in the heart of PyeongChang to sweep the podium. Norway tasted snow-embellished triumph, so much so that the Norwegian Olympic Federation ran out of commemorative shoes and cakes for its champions. A small country with a small population, Norway is dynamite at the Winter Olympics, and this time it has surpassed the medal-sweeping USA.
It will be interesting to know whether the "Game of Snow" of the top 5 countries revolves around adrenaline and uncertainty, or whether these countries specialise in particular games. Now let's walk through the past data of the Winter Olympics, from Chamonix to PyeongChang.
A scatter plot drawn with the ggvis package provided a stepping stone towards deeper insights into performance.
Here, two algorithms are used to analyse the specialisation pattern among countries across games: K-means clustering, which groups the data, and the K-Nearest Neighbour algorithm, which classifies it.
K-MEANS CLUSTERING
K-means is an unsupervised learning algorithm, which means there is no outcome to be predicted; instead, it groups observations according to their similarity. Here it helps to cluster the Winter Olympics data according to medal counts.
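To make the clustering idea concrete, here is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not the authors' code, which appears to have been written in R. The medal counts are invented for illustration, and the function name `kmeans` and the choice of starting centroids are assumptions.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical (gold, silver, bronze) counts, invented for illustration only.
medal_counts = [(12, 9, 8), (10, 11, 7), (13, 8, 10), (1, 0, 2), (2, 1, 1), (0, 2, 1)]

# Assumption: start from the first and fourth rows as initial centroids.
cluster_labels, final_centroids = kmeans(
    medal_counts, centroids=[medal_counts[0], medal_counts[3]]
)
print(cluster_labels)
```

With these made-up numbers, the three high-count rows end up in one cluster and the three low-count rows in the other. On real data the result depends on the number of clusters and the starting centroids, so several starts are usually tried.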
K-NEAREST NEIGHBOUR ALGORITHM
It is a simple and lucid machine learning algorithm which labels each observation according to its closest neighbours, where closeness is measured by a distance such as the Euclidean distance or the Manhattan distance. The confusion matrix obtained from this algorithm is shown below:
Confusion Matrix and Statistics

test_prediction  AUSTRIA CANADA GERMANY NORWAY UNITED STATES
  AUSTRIA              7      1       0      0             0
  CANADA               0      6       0      0             2
  GERMANY              1      1       6      0             0
  NORWAY               1      1       0      6             0
  UNITED STATES        1      2       0      0             5

Overall Statistics

               Accuracy : 0.75
                 95% CI : (0.588, 0.8731)
    No Information Rate : 0.275
    P-Value [Acc > NIR] : 5.851e-10

                  Kappa : 0.6875
 Mcnemar's Test P-Value : NA

Statistics by Class:

Class:               AUSTRIA CANADA GERMANY NORWAY UNITED STATES
Sensitivity           0.7000 0.5455  1.0000 1.0000        0.7143
Specificity           0.9667 0.9310  0.9412 0.9412        0.9091
PosPred Value         0.8750 0.7500  0.7500 0.7500        0.6250
NegPred Value         0.9062 0.8438  1.0000 1.0000        0.9375
Prevalence            0.2500 0.2750  0.1500 0.1500        0.1750
Detection Rate        0.1750 0.1500  0.1500 0.1500        0.1250
Detection Prevalence  0.2000 0.2000  0.2000 0.2000        0.2000
Balanced Accuracy     0.8333 0.7382  0.9706 0.9706        0.8117
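For readers curious about how nearest-neighbour labelling works, the following is a minimal sketch in plain Python. It is not the code behind the matrix above. The feature values and the labels `COUNTRY_A` and `COUNTRY_B` are hypothetical, and the helper names `euclidean`, `manhattan` and `knn_predict` are chosen here for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Label `query` by majority vote among its k nearest training rows."""
    # Sort training rows by distance to the query and keep the k closest.
    nearest = sorted(
        zip(train_x, train_y), key=lambda row: distance(row[0], query)
    )[:k]
    # Count the labels of those neighbours and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature rows, invented for illustration only.
train_x = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 6.0), (6.5, 7.0)]
train_y = ["COUNTRY_A", "COUNTRY_A", "COUNTRY_A", "COUNTRY_B", "COUNTRY_B", "COUNTRY_B"]

print(knn_predict(train_x, train_y, (2.0, 2.0)))
print(knn_predict(train_x, train_y, (6.0, 6.0), distance=manhattan))
```

The choice of k and of the distance function both affect the labels. An odd k is a common way to reduce ties when there are two classes, though ties remain possible with more.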
The fitted model has 75% accuracy. Among the other classification statistics, the higher balanced accuracy of Norway and Germany (0.9706 each) is noticeable. It emphasises their enhanced chance of winning a particular game.
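The headline figures can be recomputed directly from the confusion matrix. The sketch below is in Python for illustration. It assumes that rows are predictions and columns are the actual countries, an assumption that matches the reported sensitivities. The helper name `class_stats` is introduced here.

```python
countries = ["AUSTRIA", "CANADA", "GERMANY", "NORWAY", "UNITED STATES"]

# Counts copied from the confusion matrix in the article.
# Assumption: rows are predictions, columns are the actual classes.
matrix = [
    [7, 1, 0, 0, 0],
    [0, 6, 0, 0, 2],
    [1, 1, 6, 0, 0],
    [1, 1, 0, 6, 0],
    [1, 2, 0, 0, 5],
]

total = sum(sum(row) for row in matrix)
# Accuracy: correctly classified cases (the diagonal) over all cases.
accuracy = sum(matrix[i][i] for i in range(len(countries))) / total

def class_stats(i):
    """Return (sensitivity, specificity, balanced accuracy) for class i."""
    tp = matrix[i][i]
    fp = sum(matrix[i]) - tp                 # predicted i, actually another class
    fn = sum(row[i] for row in matrix) - tp  # actually i, predicted another class
    tn = total - tp - fp - fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2

print(f"accuracy = {accuracy:.2f}")
for i, name in enumerate(countries):
    sens, spec, bal = class_stats(i)
    print(f"{name:14s} sensitivity={sens:.4f} specificity={spec:.4f} balanced={bal:.4f}")
```

Balanced accuracy is the mean of sensitivity and specificity. Norway and Germany score highest because every one of their actual cases was recovered, giving a sensitivity of 1.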
From this analysis, it is clearly identifiable that the K-Nearest Neighbour algorithm gives better results than the K-means algorithm.
The following table shows the sensitivity of game specialisation across the five countries.
This article is written by Anjali UJ and Shabin Nahab