Three Points for a Win – Additional Data and Charts

In this section, you can find an overview of my calculations and download all of my data for free. For those curious about doing their own soccer analysis, don’t hesitate to use the data (think about giving me a shout-out on your blog, though!). If you have any questions or comments, feel free to email me at ncholst@uchicago.edu. I use Stata 12 and Excel for all of my calculations. When I use Stata 12, I am conducting simple linear regression analysis. This is useful for estimating the relationship between some independent variable (or variables) X and an outcome Y; on its own it shows association rather than causation. You can find a good introduction to linear regression here.

1. Draws per Game vs. League Competitiveness

Table1

Stata 12 lets us run a linear regression of draws per game (DPG) on league competitiveness (the results of which are displayed here). The coefficient column tells us the estimated relationship between the two variables. In this case, the coefficient is strongly negative and is significant at the 94.2% level, just short of the conventional 95% standard. Keep in mind that this is ultimately not a scientific study, and plenty of additional analysis (for example, controlling for unaffected cup matches) is required to reach particularly meaningful conclusions. Still, my results are encouraging.
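For readers without Stata, here is a minimal sketch of what a regression like this computes, written in plain Python. The numbers in `competitiveness` and `dpg` are made up purely for illustration and are not my data, and `simple_ols` is a hypothetical helper name. It returns the slope (the "coefficient"), the intercept, the slope's standard error, and the t-statistic that the significance level is based on.

```python
import math

def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, se_slope, t_stat).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and cross-products of x and y around their means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)
    se_slope = math.sqrt(sigma2 / sxx)
    return slope, intercept, se_slope, slope / se_slope

# Illustrative, made-up values (NOT the data behind Table1).
competitiveness = [60, 65, 70, 75, 80]   # league competitiveness index, in %
dpg = [0.30, 0.29, 0.27, 0.27, 0.25]     # draws per game

slope, intercept, se_slope, t_stat = simple_ols(competitiveness, dpg)
print(slope, intercept, se_slope, t_stat)
```

A negative slope means draws per game fall as competitiveness rises. Turning the t-statistic into a p-value requires the t distribution with n - 2 degrees of freedom, which Stata reports directly.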

2. Calculating League Competitiveness

In his paper Point Score Systems and Competitive Imbalance in Professional Soccer (2008), Haugen evaluated league competitiveness using the formula:

equation1

  • V = League variation defined on [0, 100]%
  • N = Number of teams in the league
  • d = points for drawing (in this case = 1)
  • MCPi = Maximal competitive point score =
  • APi = Actual point score for team i in the final league table
  • = reward for winning a game (2 or 3 depending on the system)
  • LCPi = Least (or minimal) competitive point score for team i =
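As a rough sketch of the two benchmark tables in this list: MCP is the table where every match is drawn, and LCP is the table where the best team wins every match, the second best wins all but two, and so on. The code below assumes a double round-robin (every team plays every other team home and away). The names `competitive_bounds` and `variation_index` are hypothetical, and the index shown is only one normalisation that equals 100 when AP = MCP and 0 when AP = LCP. It is an assumption for illustration, not necessarily Haugen's exact formula; for that, see his paper.

```python
def competitive_bounds(n_teams, win_points=3, draw_points=1):
    """Return (mcp, lcp) point lists; index 0 is the best-placed team.

    Assumes a double round-robin, so each team plays 2 * (n_teams - 1) games.
    """
    games = 2 * (n_teams - 1)
    # Every match drawn: each team collects draw_points per game.
    mcp = [draw_points * games for _ in range(n_teams)]
    # Strict hierarchy: the team in place `rank` beats every lower team twice
    # and loses to every higher team twice.
    lcp = [win_points * 2 * (n_teams - rank) for rank in range(1, n_teams + 1)]
    return mcp, lcp

def variation_index(actual, mcp, lcp):
    """Illustrative index (an assumption): 100 if actual == mcp, 0 if actual == lcp."""
    distance = sum(abs(a - m) for a, m in zip(actual, mcp))
    max_distance = sum(abs(l - m) for l, m in zip(lcp, mcp))
    return 100.0 * (1 - distance / max_distance)

mcp, lcp = competitive_bounds(4, win_points=3)
print(mcp)  # [6, 6, 6, 6]
print(lcp)  # [18, 12, 6, 0]
```

With four teams under 3PW, the all-draws table gives everyone 6 points, while the strict hierarchy gives 18, 12, 6, and 0.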

The league variation V is defined as a ratio between a minimally competitive scenario and a perfectly competitive one. MCP is thus defined by a league table where all matches end in a draw – all teams are evenly matched. LCPi, the minimal competitive situation, implies a league table where the best team wins all matches, the second best wins all but two, and so on. Haugen determines team order by their final league placement (perhaps not the best gauge of team quality, but the most reliable one). Thus, a V value of 100% implies that AP = MCP (a maximally competitive league), and 0% suggests AP = LCP (a minimally competitive league). The index is not particularly suitable for telling us exactly how competitive a league is, but it is useful for comparisons and identifying trends over time. I then borrow his formula and make the following adjustments to gauge the competitiveness of the bottom-6 teams during the second half of the season.

  • i = 1, where 1 is the last team in the table rankings, 2 is the second last, etc.
  • V = League variation defined on [0, 100]%
  • N = 6
  • d = points for drawing (in this case = 1)
  • GL = games left to play
  • M = Mid-season point score
  • APi = Final point score – Mid-season point score for team i
  • MCPi = Maximal competitive point score =
  • LCPi = Least (or minimal) competitive point score for team i =

This way, the maximally competitive point score represents a table where the second half of the season is entirely composed of draws. The minimally competitive scenario occurs when the team ranked last at mid-season is unable to win any more games, the team ranked second last is able to beat the last team, and so on.

To ensure reliability, I consider data from the top three divisions of English football. I then recalculate league point scores for teams during the 1975-1980 seasons under the 3PW system. This helps eliminate any bias caused by a naturally lower distribution of point scores before the change. This graph helps show why this is important:

graph1

Under 2PW, the point distribution is going to be inherently lower than under 3PW by a factor of ~0.81. For example, take three teams that have won one, two, and three games respectively. Under the old system, they would have a point distribution of 2, 4, and 6. Under the new one, they would have 3, 6, and 9 points each. Because points under the old system are spread over a narrower range, the index rates it as more competitive. The change does incentivize winning games, but does so at the cost of league competition.
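The three-team example can be checked with a few lines of Python. This is only a sketch of the recalculation idea: `points` and `spread` are hypothetical helper names, and the records are the toy example from the text (wins only, no draws), not real league data.

```python
def points(wins, draws, win_points, draw_points=1):
    """Total points for a team under a given scoring system."""
    return wins * win_points + draws * draw_points

def spread(table):
    """Gap between the highest and lowest point totals in a table."""
    return max(table) - min(table)

# (wins, draws) for the three teams in the example above.
records = [(1, 0), (2, 0), (3, 0)]

two_pw = [points(w, d, win_points=2) for w, d in records]
three_pw = [points(w, d, win_points=3) for w, d in records]

print(two_pw, spread(two_pw))      # [2, 4, 6] 4
print(three_pw, spread(three_pw))  # [3, 6, 9] 6
```

The same results produce a wider gap between top and bottom under 3PW, which is why the older seasons need to be recalculated before comparing them with later ones.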

3. Calculating the impact of 3PW on a team’s comeback ability

Remember the question: do teams facing relegation by mid-season get a second chance thanks to 3PW? To test this, we need to run a linear regression of bottom-six league competitiveness during the second half of the season on competitiveness during the first half, using the introduction of 3PW as a control variable. This is a statistical tool that will let us isolate the effect of the rule change on bottom-six competitiveness during the second half of the season. For those acquainted with regression analysis, here are my results:

table2

The P>|t| column reports the p-value from a t-test on each coefficient. For a result to count as significant at the 95% confidence level, you want this value to be at or below 0.05. The fact that the values are over ten times that value tells us that neither first-half league competitiveness nor 3PW is correlated in a significant way with the performance of bottom-six teams in the second half of the season. This test would be better run if there were an instrumental variable for predicting second-half performance, but none occurs to me right now. If you have any ideas or are able to reach a different conclusion than me, please let me know at ncholst@uchicago.edu.
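For anyone who wants to see the mechanics of a regression with a 0/1 rule-change variable, here is a minimal plain-Python sketch. The helper names `solve` and `ols` are hypothetical, and the data are made up and noise-free (generated from second_half = 10 + 0.5 * first_half - 2 * three_pw), so that the code can be checked by confirming it recovers those coefficients. They are not my results. Stata additionally reports standard errors and the P>|t| column, which this sketch leaves out.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """OLS coefficients [constant, b1, b2, ...] via the normal equations."""
    design = [[1.0, *row] for row in rows]  # prepend the constant term
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up seasons: first-half competitiveness and a dummy for 3PW (1 = in force).
first_half = [40, 45, 50, 55, 60, 65]
three_pw = [0, 0, 0, 1, 1, 1]
second_half = [10 + 0.5 * f - 2 * d for f, d in zip(first_half, three_pw)]

const, b_first_half, b_three_pw = ols(list(zip(first_half, three_pw)), second_half)
print(const, b_first_half, b_three_pw)
```

The coefficient on the dummy is the estimated shift in second-half competitiveness once 3PW is in force, holding first-half competitiveness fixed.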

4. Other Cool Charts

Throughout this project, I made a bunch of charts that didn’t make the final cut in my analysis, but are still cool to check out. I won’t go into detail on them, but if you have any questions or comments, shoot me an email!

graph2

These show my results for league competitiveness before and after I recalculate the 1976-1980 seasons under 3PW. Note that the unadjusted values are consistently higher.

graph3

Note that league competitiveness has been decreasing since 1950.

graph4

Rank difference is calculated by taking the average difference between teams’ mid-season league position and final league position. I take the square root of that to avoid giving too much weight to outliers. Note the decreasing trend, indicating that final league positions are increasingly decided by December.

graph5

Draws per game are pretty consistent over time, although there has been a downwards trend beginning in 1995. Note that Spain is not that different from England.

graph8 graph7 graph6

Here is more evidence that 3PW didn’t have much of an effect on bottom-six performance in the second half of the season. It is true that there is a temporary conversion of losses to wins among teams in England’s top flight of football, but this quickly rectifies itself. The other two leagues show no indication of changing in response to 3PW.
