Forecasting Success in the National Hockey League Using In-Game Statistics and Textual Data

In this thesis, we look at a number of methods to forecast success (winners and losers),
both of single games and playoff series (best-of-seven games) in the sport of ice hockey,
more specifically within the National Hockey League (NHL). Our findings indicate that
there exists a theoretical upper bound on prediction accuracy, which seems to hold true
for all sports, that makes prediction difficult.
In the first part of this thesis, we look at predicting success of individual games to
learn which of the two teams will win or lose. We use a number of traditional statistics
(published on the league’s website and used by the media) and performance metrics
(used by Internet hockey analysts; they are shown to have a much higher correlation with
success over the long term). Despite the demonstrated long-term success of performance
metrics, it was the traditional statistics that had the most value for automatic game
prediction, allowing our model to achieve 59.8% accuracy.
We found it interesting that, regardless of which features we used in our model, we
were not able to increase the accuracy much beyond 60%. We compared the observed
winning percentages of teams in the NHL to those of many simulated leagues and found
that there appears to be a theoretical upper bound of approximately 62% for single-game
prediction in the NHL.
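
One way to see how such a ceiling can arise is a toy model in which some fraction of
games is decided by pure chance. The sketch below is not the simulation used in the
thesis; the function names, the `luck_fraction` parameter, and the value 0.76 are
illustrative assumptions, chosen only because they give a ceiling near the 62% figure
stated above.

```python
import random


def analytic_bound(luck_fraction):
    """Best achievable accuracy when a share of games is a coin flip.

    Assumption (hypothetical toy model): in the remaining games the better
    team always wins, so a perfect predictor is right on all of them and on
    half of the coin-flip games.
    """
    return (1.0 - luck_fraction) + luck_fraction / 2.0


def simulate_oracle_accuracy(luck_fraction, n_games=100_000, seed=0):
    """Monte Carlo check: always pick the better team and count the hits."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_games):
        if rng.random() < luck_fraction:
            # Game decided by chance: the better team wins half the time.
            better_team_wins = rng.random() < 0.5
        else:
            # Game decided by skill: the better team wins.
            better_team_wins = True
        correct += better_team_wins
    return correct / n_games


if __name__ == "__main__":
    luck = 0.76  # illustrative value, not taken from the abstract
    print(analytic_bound(luck))
    print(simulate_oracle_accuracy(luck))
```

Under this model no feature set can push accuracy past the bound, because the
coin-flip games carry no information to learn from.
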
Because a single game is difficult to predict, with a maximum accuracy of about 62%,
we reasoned that predicting a longer series of games should be easier. We looked at
predicting the winner of the best-of-seven series between two teams using over 30
features, both traditional and advanced statistics, and found that we were able to
increase our prediction accuracy to almost 75%.
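
The intuition that a series is easier to call than a single game can be illustrated with
a short calculation. The sketch below assumes, purely for illustration, that the
stronger team wins each game independently with a fixed probability `p` (ignoring home
ice, injuries, and momentum); the function name is hypothetical and this is not the
model used in the thesis.

```python
from math import comb


def series_win_probability(p, wins_needed=4):
    """Probability that a team wins a first-to-`wins_needed` series.

    Assumes each game is won independently with probability p.
    The team wins the series by taking its final win after losing
    k games, for k = 0 .. wins_needed - 1.
    """
    q = 1.0 - p
    total = 0.0
    for losses in range(wins_needed):
        # Arrange the losses among the games played before the clinching win.
        arrangements = comb(wins_needed - 1 + losses, losses)
        total += arrangements * (p ** wins_needed) * (q ** losses)
    return total


if __name__ == "__main__":
    for p in (0.50, 0.55, 0.62):
        print(p, round(series_win_probability(p), 3))
```

For any `p` above 0.5, the series probability exceeds the single-game probability,
which is why a longer series gives a predictor more to work with.
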
We then re-explored predicting single games with the use of pre-game textual reports
written by hockey experts from http://www.NHL.com using bag-of-words features and
sentiment analysis. We combined these features with the numerical data in multi-layer
meta-classifiers and were able to increase the accuracy close to the upper bound.
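
As a minimal illustration of what bag-of-words features look like, the sketch below
turns short texts into word-count vectors using only the Python standard library. The
sample sentences, the tokenizer, and the function names are hypothetical; the abstract
does not describe the thesis's actual preprocessing or classifiers.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors


if __name__ == "__main__":
    # Invented example sentences, not real pre-game reports.
    reports = [
        "The home team is rested and healthy",
        "The road team is tired",
    ]
    vocab, vecs = bag_of_words(reports)
    print(vocab)
    for vec in vecs:
        print(vec)
```

Vectors like these can be concatenated with numerical statistics or fed to a separate
text classifier whose output is then combined with a statistics-based model.
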

Machine learning

Hockey

Identifier	oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/31553
Date	January 2014
Creators	Weissbock, Joshua
Contributors	Inkpen, Diana
Publisher	Université d'Ottawa / University of Ottawa
Source Sets	Université d’Ottawa
Language	English
Detected Language	English
Type	Thesis
