Individual Goals Prevented

Lately, I've been trying to come up with a defensive equivalent to the goals created statistic. Ideally, I would like to be able to add the two, and come up with an overall value of each player. I'm not quite there yet, but I've found something that makes sense for defensive contribution. When looking at Marginal Goals, we can determine how many goals (compared to a fictional "marginal" team) we prevented. With the percentage of shots saved by the goalkeepers, we can break it down further to see how many goals were prevented by the keeper, and how many were prevented by outfield players.


In the 06/07 season, I rated our outfield defence 15.768 goals better than the marginal team (think Sunderland of 05/06). How do we divide this up among individuals? The fairest method I've come up with is this equation:
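The equation itself is not reproduced in this text. A reconstruction consistent with the variable definitions that follow, with the discussion in the comments about dividing by the team total, and with the table (whose values sum to 15.768) would be the form below. Treat it as an assumption rather than the exact original notation:

```latex
\[
\mathrm{MGP}
  = \frac{\mathrm{DA} - \mathrm{M}}{(\mathrm{tDA} - \mathrm{tM}) / \mathrm{tMGP}}
  = \mathrm{tMGP} \cdot \frac{\mathrm{DA} - \mathrm{M}}{\mathrm{tDA} - \mathrm{tM}}
\]
```

Written the second way, each player gets a share of the team total in proportion to his own DA minus M, which is why the individual values add up to tMGP.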


Here, MGP is marginal goals prevented; DA is defensive actions, including all clearances, blocks, interceptions, tackles, and offside provocations; and M is minuses, the number of goals conceded with a player on the pitch. The same variables prefixed with a lower-case t are the team totals for each. This gives us the following table for 06/07:

Name Marginal Goals Prevented
Hughes 2.759
Bocanegra 1.820
Stefanovic 1.761
Hangeland 1.744
Baird 1.344
Konchesky 1.327
Stalteri 0.704
Davies 0.640
Murphy 0.628
Dempsey 0.622
Davis 0.622
Andreasen 0.376
Knight 0.358
Volz 0.211
Bullard 0.182
Smertin 0.176
Bouazza 0.170
Seol 0.141
Nevland 0.059
McBride 0.047
Kuqi 0.041
Ashton 0.041
Diop 0.012
Kamara 0.006
Johnson 0.006
Pearce 0.006
Christanval 0.006
John 0.000
Healy -0.041
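
The way the team total is shared out can be sketched in a few lines of Python. This is only an illustration: the function name, the player labels, and the input numbers are all made up, and it assumes the team totals are simply the sums of the individual DA and M values, so that the shares add back up to the team figure.

```python
from typing import Dict, Tuple


def allocate_mgp(
    players: Dict[str, Tuple[int, int]], team_mgp: float
) -> Dict[str, float]:
    """Split team_mgp across players in proportion to (DA - M).

    players maps a name to (defensive actions, goals conceded while on
    the pitch). Team totals are ASSUMED to be the sums over these players.
    """
    net = {name: da - m for name, (da, m) in players.items()}
    total_net = sum(net.values())
    if total_net == 0:
        raise ValueError("team total of DA - M must be non-zero")
    return {name: team_mgp * n / total_net for name, n in net.items()}


# Hypothetical inputs, not real player data.
sample = {
    "Player A": (500, 40),
    "Player B": (300, 30),
    "Player C": (20, 25),  # more goals conceded than defensive actions
}
shares = allocate_mgp(sample, team_mgp=15.768)
```

A player with more minuses than defensive actions, like the made-up Player C here, ends up with a negative share, which is how a value like Healy's can dip below zero.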

So what this means is that, by having Aaron Hughes start at CB instead of a generic player from the 05/06 Sunderland team, we prevented almost 3 goals. Brede Hangeland and Dejan Stefanovic still managed to prevent almost 2 goals apiece in their limited time. I think that all of this tells us that there isn't a huge difference between our top 4 CBs, but the difference does amount to about half a win, or 1-2 draws. As we have found out, that can be the difference between staying up and being relegated.


Another thing I found interesting is how this relates to our current central midfield dilemma. I predicted that Danny Murphy and Jimmy Bullard will finish this season with almost exactly the same total of goals created, with roughly the same number of minutes. In defence, however, they differ quite a bit. Murphy, while not an ideal defensive midfielder, was still ahead of Bullard. Leon Andreasen, projected over a full 3000-minute season, would have been more than a full goal better than Bullard. Would that goal make up for the loss in attacking ability? Or would Andreasen's presence allow Murphy and the rest of the team to attack more effectively? It's all difficult to say, but we're getting a little bit closer to understanding just what it means to replace a player in the lineup. The fun begins when we can look at marginal goals created. That's (hopefully) coming soon, and it should give attack and defence statistics a common ground, a way to say "Player A is better than Player B".
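
For anyone wanting to try the full-season projection themselves, here is a minimal sketch. It assumes a simple linear scaling by minutes played, which may not be exactly how the projection above was done; the function name and the example numbers are made up, since minutes played are not listed in this post.

```python
def project_mgp(
    mgp: float, minutes_played: float, season_minutes: float = 3000.0
) -> float:
    """Scale a player's marginal goals prevented to a full season.

    ASSUMES contribution scales linearly with minutes on the pitch.
    """
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return mgp * season_minutes / minutes_played


# Hypothetical example: 0.5 goals prevented in 1000 minutes.
projected = project_mgp(0.5, 1000.0)
```

With these made-up inputs, 0.5 goals prevented in 1000 minutes scales to 1.5 over 3000 minutes. Small samples make projections like this noisy, so they are better read as rough guides than as precise values.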

I'm struggling to understand your formula. Why do you divide tDA-tM by tMGP when you are trying to figure out what tMGP is in the first place? Is there a simpler way of stating that? Sorry, I'm no mathematician.

Also, I reckon you should run Chelsea and Liverpool's numbers to get a look at Makelele and Carragher's stats respectively. Those two guys are such important players for their teams that you would imagine that they'd give you a good sense of the top end of the numbers players should shoot for.

The tMGP is something I determined a while ago from the team total of marginal goals, which was broken down into marginal goals created and prevented, with the prevented goals divided further into marginal goals prevented by outfield players and by goalkeepers. The formula in this post estimates individual marginal goals prevented, and we divide by the team total so that the individual values always add up to the team total. Hope that makes sense (and that I'm doing everything correctly!). I'm no mathematician either, so I'm always happy to hear any comments, questions, suggestions, etc.

Ah, I've gotcha. I thought that tMGP would be a factor of MGP. As MGP was what the equation is seeking, I figured there must be a simpler way of calculating it. What happens when you substitute the formula for tMGP in place of tMGP in the equation? I'm wondering whether there's a simpler way of stating the MGP proposition.

Nonetheless, these are pretty cool stats to have. As with your other ones, at some point the goal surely will be to have a complete set of hypotheses at the beginning of a season and see how well you're able to predict across a large sample size. Right?

Great stuff Colin - like the facelift too!

Really excellent blog, Colin. I had often wondered whether anyone had tried to apply Jamesian principles to football statistics. Before reading your blog I'd never found anyone attempting it. There's undoubtedly some pioneering stuff here, rather puts my simplistic football stats blog to shame.

As a matter of interest where do you find stats such as clearances and offside provocations?

Guy, thanks for visiting. Chopper linked me to your site yesterday, and I'll definitely be watching for updates. I think a lot of the work that has been done for baseball can be applied to football in some shape or form, though perhaps not always as meaningfully. If we do find something useful, I'm happy.

Most of the stats come from the Telegraph's site. Their flash thing at the bottom of their reports has a ton of good info - I'd like to have data on the entire Premier League one day, once I figure out a better way of importing it.

Thanks Colin.

I'm a big fan of Baseball Prospectus and Football Outsiders; those were the sites which inspired me to try and examine soccer statistics more rigorously. There are clearly problems with applying their methods to soccer (wild inaccuracies in soccer data collection are a particular pain in the arse) but it's good to see people out there giving it a try, nonetheless. Despite the difficulties, I do think soccer is a sport ripe for a sabermetric revolution. It's slightly strange that where in baseball, outsiders (academics, fringe media) were the first to embrace sabermetrics, in soccer it's happening inside the clubs first. Some clubs anyway -- I think it's possible to spot those teams using more advanced metrics by the nature of their signings.

I use the Telegraph site but, as you say, it's very difficult to import the data, and it doesn't seem to be archived for more than one season. It's rather odd that companies like OPTA seem so reluctant to widely publish their stats; they're never going to popularise their approach by being so restrictive.
