Evaluation of team rank prediction in a sports league by sum of residuals

library(tidyverse) # provides read_csv(), %>%, select(), gather(), and friends
J1_2019_pred_results<-read_csv(file.path("2019-12-29-evaluation-of-team-rank-prediction-in-sports-league-by-sum-of-residuals_files","日刊スポーツ・サッカー担当の19年J1順位予想と結果.csv"))
## Parsed with column specification:
## cols(
##   rank = col_double(),
##   Hamamoto = col_character(),
##   No = col_character(),
##   Hosaka = col_character(),
##   Shimoda = col_character(),
##   Sugiyama = col_character(),
##   Kinoshita = col_character(),
##   Okazaki = col_character(),
##   Matsuo = col_character(),
##   Iwata = col_character(),
##   Kamiya = col_character(),
##   Maeda = col_character(),
##   Sanefuji = col_character(),
##   Kikukawa = col_character(),
##   Ishikawa = col_character(),
##   Uehara = col_character(),
##   Ogishima = col_character(),
##   Results = col_character()
## )
#J1_2019_pred_results %>% DT::datatable()
J1_2019_pred_results %>% knitr::kable()
rank Hamamoto No Hosaka Shimoda Sugiyama Kinoshita Okazaki Matsuo Iwata Kamiya Maeda Sanefuji Kikukawa Ishikawa Uehara Ogishima Results
1 Kashima Kashima Kawasaki_F Kawasaki_F Kashima Urawa Urawa Kawasaki_F Urawa Kashima Kawasaki_F Urawa Kawasaki_F Kawasaki_F Kawasaki_F Kawasaki_F Yokohama
2 Kawasaki_F Tokyo Kashima Kashima Kawasaki_F Kawasaki_F Kawasaki_F Kashima Kawasaki_F Urawa Kashima Kobe Kobe Kobe Kobe Kashima Tokyo
3 Kobe Kawasaki_F Sapporo C_Osaka Urawa Tokyo Tokyo Nagoya Kashima G_Osaka G_Osaka Kashima Tokyo G_Osaka Nagoya Kobe Kashima
4 Urawa Kobe Urawa Sapporo G_Osaka C_Osaka G_Osaka Urawa Kobe Kawasaki_F Kobe Kawasaki_F Kashima Tokyo Sapporo Urawa Kawasaki_F
5 G_Osaka Urawa Kobe Sendai Tokyo Kobe Sapporo Sapporo G_Osaka Shimizu Urawa C_Osaka G_Osaka Urawa Kashima G_Osaka C_Osaka
6 Tokyo Shonan G_Osaka Urawa Tosu Kashima Yokohama G_Osaka Nagoya C_Osaka Tokyo G_Osaka Sapporo Kashima G_Osaka Nagoya Hiroshima
7 Sapporo Nagoya Yokohama Kobe Sapporo Nagoya Kashima Kobe Sapporo Tokyo Nagoya Tokyo Urawa Nagoya Urawa Hiroshima G_Osaka
8 C_Osaka Hiroshima Tokyo Iwata Kobe G_Osaka Kobe Shonan C_Osaka Nagoya Iwata Yokohama Nagoya Hiroshima Tokyo Tokyo Kobe
9 Shonan G_Osaka C_Osaka Tokyo Sendai Sapporo Hiroshima Yokohama Tokyo Hiroshima Sapporo Sapporo Shimizu Sapporo Tosu Iwata Ooita
10 Shimizu Yokohama Shonan Ooita Hiroshima Hiroshima Nagoya Tosu Iwata Iwata Yokohama Nagoya Hiroshima Sendai Hiroshima Sapporo Sapporo
11 Sendai C_Osaka Iwata Shonan Shimizu Tosu Iwata Sendai Tosu Kobe Shimizu Hiroshima Sendai C_Osaka C_Osaka Shimizu Sendai
12 Hiroshima Sapporo Shimizu Yokohama Yokohama Sendai Shimizu Hiroshima Shimizu Sendai C_Osaka Shimizu Shonan Yokohama Yokohama Yokohama Shimizu
13 Yokohama Iwata Hiroshima Shimizu Shonan Shonan Shonan Tokyo Shonan Yokohama Hiroshima Sendai Yokohama Shimizu Shonan C_Osaka Nagoya
14 Iwata Tosu Nagoya Hiroshima C_Osaka Shimizu C_Osaka C_Osaka Yokohama Tosu Tosu Tosu C_Osaka Shonan Sendai Shonan Urawa
15 Nagoya Matsumoto Tosu G_Osaka Matsumoto Iwata Sendai Ooita Hiroshima Matsumoto Sendai Iwata Iwata Tosu Iwata Sendai Tosu
16 Tosu Sendai Sendai Nagoya Iwata Yokohama Matsumoto Shimizu Sendai Sapporo Shonan Shonan Tosu Matsumoto Shimizu Tosu Shonan
17 Matsumoto Shimizu Ooita Tosu Nagoya Matsumoto Tosu Iwata Matsumoto Shonan Matsumoto Matsumoto Ooita Iwata Matsumoto Matsumoto Matsumoto
18 Ooita Ooita Matsumoto Matsumoto Ooita Ooita Ooita Matsumoto Ooita Ooita Ooita Ooita Matsumoto Ooita Ooita Ooita Iwata

The Japanese professional soccer league is called the “J League”, and its top division is J1. Just before the season starts, sports media predict the final team rankings. It is unusual for the media to review their predictions after the season ends, but Nikkansports.com did so this year (table above). The article’s evaluation of the predictions made by its writers (16 in the table above) is rough and not very analytical, so I analyzed the data here for fun. None of the writers predicted that Yokohama would win this season. I am simply curious whose prediction was closest to the result and which teams performed surprisingly.

Convert the data format

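# Final standings as a lookup table: one row per team with its actual rank
Results <- J1_2019_pred_results %>% select(rank, Results) %>% rename(team = Results, results = rank)
# Long format: one row per (writer, team) pair with the predicted and the actual rank
J1_2019_pred_results.mod <- J1_2019_pred_results %>% select(-Results) %>% gather("person", "team", -1) %>% rename(prediction = rank) %>% left_join(Results, by = "team")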
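
To make the reshaping step concrete, here is a minimal sketch on a made-up table with three teams and two writers. The teams A, B, C and the writers w1, w2 are hypothetical and not taken from the data above. It uses tidyr's pivot_longer(), the newer replacement for gather(); that substitution is my own choice, not what the code above does.

```r
library(dplyr)
library(tidyr)

# Made-up wide table: one row per rank, one column per writer,
# plus a Results column holding the team that actually finished at that rank
toy <- tibble(
  rank    = 1:3,
  w1      = c("A", "B", "C"),
  w2      = c("B", "C", "A"),
  Results = c("B", "A", "C")
)

# Lookup table: final rank of each team
toy_results <- toy %>%
  select(rank, Results) %>%
  rename(team = Results, results = rank)

# Long table: one row per (writer, team) with predicted and final rank
toy_long <- toy %>%
  select(-Results) %>%
  pivot_longer(-rank, names_to = "person", values_to = "team") %>%
  rename(prediction = rank) %>%
  left_join(toy_results, by = "team")

print(toy_long)
```

Each row of the long table now carries both numbers needed for the comparison, so the later steps only need a group_by() and a summarise().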

Who is the best predictor?

J1_2019_pred_results.mod %>% mutate(var=(prediction-results)^2) %>% group_by(person) %>% summarise(`sum of residuals`=sum(var)) %>% arrange(`sum of residuals`)
## # A tibble: 16 x 2
##    person    `sum of residuals`
##    <chr>                  <dbl>
##  1 Sanefuji                 410
##  2 Ishikawa                 442
##  3 Hosaka                   482
##  4 Kikukawa                 482
##  5 Hamamoto                 506
##  6 Okazaki                  508
##  7 No                       528
##  8 Maeda                    554
##  9 Sugiyama                 574
## 10 Shimoda                  590
## 11 Kinoshita                592
## 12 Ogishima                 594
## 13 Uehara                   604
## 14 Kamiya                   614
## 15 Matsuo                   682
## 16 Iwata                    754

A smaller sum of residuals (here, the sum of squared differences between the predicted and the final rank) means a writer’s prediction was closer to the final standings. Congratulations, Mr. Sanefuji (is that the correct pronunciation?).
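
If each writer ranked all 18 teams exactly once (an assumption, though the table above suggests it), the sum of squared rank differences maps directly onto Spearman's rank correlation through the standard identity rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). This is not part of the original article; the sketch below applies the identity to three of the sums from the table above. The helper name spearman_from_ssd is made up for illustration.

```r
# Convert a sum of squared rank differences into Spearman's rho.
# Valid when both rankings are permutations of 1..n with no ties.
spearman_from_ssd <- function(ssd, n) {
  1 - 6 * ssd / (n * (n^2 - 1))
}

n_teams <- 18
# Sums of squared differences copied from the table above
ssd <- c(Sanefuji = 410, Hamamoto = 506, Iwata = 754)
rho <- spearman_from_ssd(ssd, n_teams)
print(round(rho, 3))

# Sanity check on a random made-up ranking:
# the identity agrees with cor(method = "spearman")
set.seed(1)
pred <- sample(n_teams)
actual <- seq_len(n_teams)
stopifnot(isTRUE(all.equal(
  spearman_from_ssd(sum((pred - actual)^2), n_teams),
  cor(pred, actual, method = "spearman")
)))
```

On this scale the best writer reaches a rho of about 0.58 and the last one about 0.22, which is easier to interpret than the raw sums.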

Which teams were a surprise?

J1_2019_pred_results.mod %>% mutate(var=(prediction-results)^2) %>% group_by(team) %>% summarise(`sum of residuals`=sum(var)) %>% arrange(desc(`sum of residuals`))
## # A tibble: 18 x 2
##    team       `sum of residuals`
##    <chr>                   <dbl>
##  1 Yokohama                 1767
##  2 Urawa                    1746
##  3 Ooita                    1137
##  4 Iwata                     590
##  5 C_Osaka                   585
##  6 Nagoya                    540
##  7 Hiroshima                 431
##  8 Tokyo                     425
##  9 Shonan                    344
## 10 Kobe                      271
## 11 Sapporo                   258
## 12 Tosu                      189
## 13 Sendai                    179
## 14 G_Osaka                   155
## 15 Shimizu                   128
## 16 Kawasaki_F                 93
## 17 Kashima                    60
## 18 Matsumoto                  18

For each team, I calculated the same sum of residuals across all writers. In this analysis, a larger value means the team surprised the writers more. Yokohama F. Marinos has the largest value, which means the writers underestimated Yokohama’s performance. The second-largest value, for Urawa, has the opposite meaning. Although Urawa Red Diamonds reached the final of the 2019 AFC (Asian Football Confederation) Champions League, their league performance was so poor that the head coach had to leave in the middle of the season. Kashima Antlers won the 2018 AFC Champions League, so many writers predicted a very good season for them. As expected, Kashima did well (3rd place), and its sum of residuals is very small. Unfortunately, Matsumoto Yamaga also performed as expected, in a bad sense. The team is too small to be competitive in the top-level league. I hope this team from a small town has learned a lot from this year’s experience.
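
Squaring hides the direction of the miss, which is why an underrated team and an overrated team can look alike in the table. A minimal sketch of a signed version follows, on a made-up long table with the same columns as J1_2019_pred_results.mod. The teams A and B, the writers w1 and w2, and all numbers are hypothetical.

```r
library(dplyr)

# Made-up long table: team A finished far above its predictions,
# team B far below
toy_long <- tibble(
  person     = c("w1", "w2", "w1", "w2"),
  team       = c("A", "A", "B", "B"),
  prediction = c(12, 13, 3, 5),
  results    = c(1, 1, 14, 14)
)

toy_bias <- toy_long %>%
  group_by(team) %>%
  summarise(
    # Positive: finished better than predicted (underrated);
    # negative: finished worse than predicted (overrated)
    `mean signed difference` = mean(prediction - results),
    `sum of residuals` = sum((prediction - results)^2)
  ) %>%
  arrange(team)

print(toy_bias)
```

Both toy teams get a large sum of residuals, but the sign of the mean difference separates the pleasant surprise from the disappointment.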

Conclusion

There may be a well-established method for this kind of analysis, but I am satisfied with this rough analysis based on the sum of residuals.