Promoting the use of R in the NHS

Blog Article

This post was originally published on this site

(This article was first published on R – stats on the cloud, and kindly contributed to R-bloggers)

In an earlier post I blogged how I had made a Monte Carlo simulation model of the Six Nations Rugby Tournament.  With the final round of the tournament approaching this Saturday, I decided to do a quick update.

Who can win at this stage?
Wales, England, or Ireland can still win.  Scotland, France and Italy do not have enough points at this stage to win.  Quite a good article from the London Evening Standard explains the detail.  The current league table is below.

Actual standings after round 4 out of 5

Who is playing who in the final round?

What is the simulation model based upon?
A random sample from a probability mass function for tries, conversions and penalties, which is combined with a pwin for each team, calculated based on the RugbyPass Index for both home and away teams.  If you want to know more, feel free to look at my previous post (linked above) or the R script (linked at the bottom).

What does the simulated league table look after the final round?
Running a simulation for the final three games, and adding these results on to the actual points each team has achieved after round 4, we get the distribution of league points shown below.

Apologies: a box plot can be a bit odd for discrete data such as this.  Please forgive me!  If I had the time I would reform this into something like a stacked histogram which would be more accurate 🙂

It should be noted that, whilst the ‘standard’ scoring scheme applies for these final matches, i.e.

  • 4pt for a win, 2pt for a draw, 0pt for a loss.
  • plus 1 bonus pt for scoring 4 tries or more, regardless of win/lose/draw.
  • plus 1 bonus pt if a team has lost but by 7 game points or less.

…there are also 3 additional points awarded if a team wins the ‘Grand Slam’ (wins all of their matches).  The candidate for this is Wales only.  They have so far won every match, and if they win their final match they get these extra points to ensure they win the tournament.

This rule avoids the situations where a team could lose one match but obtain maximum bonus points in the other, finishing up with more points overall than a team that has won every match but never obtained any bonus points.

So then, what are the final standings likely to look like?
After having run a simulation of the final round, the results are below.

 

Wales are “firm favourites” to win the tournament.  England have a “reasonable chance”.  Ireland retain an “outside chance”.

How does all of this compare to expectations before the start of the tournament?


Ireland, England, and Wales were predicted to be in close contention.  Wales have outperformed the prediction (mainly due to beating England).  England have outperformed the prediction (mainly due to beating Ireland, and due to amassing a lot of bonus points).  Ireland have under performed against the prediction (mainly due to losing to England, and then narrowly missing out on bonus points: scored only 3 tries against Scotland; lost to England by only 12 game pt).

Scotland beat Italy with a bonus point victory, but they have only managed to pick up one bonus point in their other games.  Picking up points against England in their final match will be tough.  So they will be likely to under perform.  France will likely beat Italy and perform roughly as expected.  Italy are looking firm against the prediction of finishing bottom again this year (however imho they could be a team to watch in their final match, as they’ll be playing a presently disorientated France, at home in Rome).

It has been an interesting journey for me simulating sports tournaments over the past few months.  Monte Carlo approaches can help you see the wood from the trees in complex situations, which has applications not just in sport but in industry as well.

Maybe this has inspired you to have a go yourself?  If so, the code for this blog post is available via Git here.  Although if you wish to have a play or to adopt the code, the original version is much cleaner, available here.  Good luck!

To leave a comment for the author, please follow the link and comment on their blog: R – stats on the cloud.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…

Comments are closed.