This post is directed at a certain audience, but perhaps it'll be useful for others.
Corsi is becoming a bit more understood and mainstream these days, but there remain a lot of folks who are confused by it. This is my attempt to clarify the stat: what it means and how it's useful in analysis.
What it is
Let's start with a simple definition. "Corsi" is the difference between all shots directed at the net for and against at even strength. That is, (shots + blocked shots + goals + missed shots FOR) - (shots + blocked shots + goals + missed shots AGAINST). The purpose of the stat is to measure possession. It is, in fact, a proxy for "zone time". A positive corsi rate = more offensive zone time. Negative = more defensive zone time.
Here's an analogy that might help. Let's say a hockey game is a tug of war. Corsi is how far right or left of center the rope is. On an individual level, it's an expression of which players are really pulling the rope. Therefore, if your team has a positive corsi rate, it means they are spending more time in the offensive zone at even strength. It means they are pulling the rope harder than the opposition.
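For anyone who finds code easier to read than a sentence, here is a minimal sketch of that definition. The `ShotAttempts` container, the `corsi` function and every number in it are made up for illustration, and it assumes "shots" means saved shots on goal, with goals counted separately as in the formula above.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """Even-strength shot attempts for one side (hypothetical container)."""
    shots: int    # shots on goal that were saved (assumption: goals counted separately)
    goals: int
    missed: int
    blocked: int

    def total(self) -> int:
        # Everything directed at the net counts, whether or not it got there.
        return self.shots + self.goals + self.missed + self.blocked


def corsi(attempts_for: ShotAttempts, attempts_against: ShotAttempts) -> int:
    """Corsi = all shot attempts for minus all shot attempts against."""
    return attempts_for.total() - attempts_against.total()


# Made-up numbers for a single game, purely for illustration.
home = ShotAttempts(shots=25, goals=3, missed=10, blocked=12)  # 50 attempts
away = ShotAttempts(shots=20, goals=2, missed=8, blocked=10)   # 40 attempts

game_corsi = corsi(home, away)
print(game_corsi)  # 10
```

A positive result means the first side directed more pucks at the net; swapping the arguments simply flips the sign.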
Why it's useful
In general, two things determine goals for (GF) and goals against (GA) in hockey: volume and frequency. Volume is the number of shots a team generates and allows. Frequency is how often a team scores or allows goals on those shots. What we're learning in the NHL is that the former is far more repeatable and indicative of skill than the latter. Let's put it another way...
Goals are relatively random events in the game. On any given night, 60-80 pucks may be directed at the net at both ends. Maybe 5-10 goals will be scored. As a result, goals are statistically less powerful because the sample size is small. This means that randomness has a far greater influence. And what we've discovered at the NHL level is that percentages (SH% and SV%) tend to regress to the mean over the long term. As a result, a team that is winning via high frequencies is said to be "riding the percentages" and their success is probably based on randomness or "luck".
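To put a rough number on the sample size point, here is a sketch that treats every shot as an independent trial with the same chance of going in. That is a simplifying assumption, not something the NHL data guarantees, and the 9% shooting percentage and the shot counts are hypothetical.

```python
import math


def pct_std_dev(true_pct: float, shots: int) -> float:
    """Standard deviation of an observed percentage over `shots` independent shots."""
    return math.sqrt(true_pct * (1.0 - true_pct) / shots)


TRUE_SH_PCT = 0.09     # hypothetical "true" shooting percentage
ONE_GAME_SHOTS = 30    # hypothetical shots in a single game
SEASON_SHOTS = 2400    # hypothetical shots over a full season

one_game_noise = pct_std_dev(TRUE_SH_PCT, ONE_GAME_SHOTS)
season_noise = pct_std_dev(TRUE_SH_PCT, SEASON_SHOTS)

print(f"one game: +/- {one_game_noise:.3f}")
print(f"season:   +/- {season_noise:.3f}")
```

Under those assumptions the noise shrinks with the square root of the number of shots, which is why a percentage measured over a handful of games says so little.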
Another example. We all know that the chance of a flipped coin landing on heads is 50%. However, it's entirely possible that a coin will land on heads 7 or 8 times in a ten flip sample. This is not indicative of a special coin or special "coin flipping skill". It's variance. As such, we can say with confidence that over, say, 1000 flips, the split will land very close to 50-50.
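The coin example can be checked exactly. This sketch counts the favorable outcomes for a fair coin; the function name is my own.

```python
from math import comb


def prob_at_least_heads(k: int, n: int) -> float:
    """Probability of at least k heads in n flips of a fair coin."""
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return favorable / 2 ** n


# 7 or more heads in 10 flips: 176 of the 1024 equally likely outcomes.
print(prob_at_least_heads(7, 10))  # 0.171875

# The same 70% rate over 1000 flips is vanishingly unlikely.
print(prob_at_least_heads(700, 1000))
```

So a run of 7 or more heads in ten flips shows up about 17% of the time with a perfectly ordinary coin, while sustaining that rate over 1000 flips essentially never happens.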
Volume, or outshooting (corsi), is far more powerful statistically, however, and therefore less skewed by randomness. So, whereas percentages tend to regress to the mean, outshooting is far more stable and therefore indicative of a team's (or player's) abilities. Corsi's value is being investigated by smarter men than me these days, and the evidence continues to pile up. Corsi correlates strongly with scoring chances. It also correlates highly with outscoring (0.65) over the course of the season. From the latter link, JLikens explains that outshooting explains 40% of the variance in EV scoring. Almost half. That's regardless of things like goaltending ability or the percentage of shots a team has blocked versus what they get on net. It also excludes the randomness we discussed above.
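The two numbers above are related by simple arithmetic, assuming the 0.65 is an ordinary (Pearson) correlation coefficient: squaring it gives the share of variance explained. A quick sketch:

```python
r = 0.65                   # correlation between outshooting and outscoring, from the text
r_squared = r ** 2         # share of variance explained, assuming a Pearson correlation
unexplained = 1.0 - r_squared

print(f"explained:   {r_squared:.2f}")    # 0.42
print(f"unexplained: {unexplained:.2f}")  # 0.58
```

That works out to roughly the 40% figure; the remainder is where percentages, goaltending and luck live.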
Corsi is a long range stat. A team can outshoot the bad guys in a single game or even a series of games and still lose. The hockey gods can be arbitrary. But, eventually, outshooting teams will win more than they lose. And the more time they spend in the offensive zone, the better they are, the more they'll win.
Evaluating individual players with corsi is a little trickier, because circumstances can elevate or sink a skater's numbers. The checking center or shutdown defender who starts every shift in his own zone against superstars is bound to have a lousy rate, for example. But that's probably a discussion for another time.
I hope this helped clarify things.
8 comments:
Nice article Kent. But, um, r^2 = 0.55!
Seriously, I despise the argument (popularized by JibbleScribbits) that if it doesn't correlate perfectly then fuck it.
Of course these brilliant minds fail to address the fact that current outscoring's correlation to future outscoring is shit fuckin' all, over the time span of one season (so where the fuck's that r^2 now bitches!). And in a completely unexpected twist, most of those who ppf-shaw at Corsi are fans of teams that are winning without outplaying (Colorado I'm looking at you Jibble, Montreal, etc.)
The whole point of this exercise, which you capture nicely here, is to get predictive value. What is sustainable over the course of one season in hockey and what is not.
If the season were five times as long as it is now then we probably wouldn't even be talking about Corsi since outscoring would shed a lot of its noise.
Addendum: since variance is basically noise^2, lower values of r^2 are actually a lot more powerful than they appear at first glance.
haha, crap. I'll fix that now.
Whoa whoa Kent hold up!
I was making a witty remark :-)
r^2 = 0.55 is that stupid-ass lame-ass dumb-ass argument that Jibble and friends keep trying to throw at you regarding Corsi. That was based off Gabe's work earlier this season.
r^2 = 0.40 is correct I think (from JL's work)
haha, I thought so. Corrected again.
Verrry nice post.
I kinda started using a variation of corsi, that is I give two numbers, the total shots while a player is on ice and the shot differential.
So instead of writing "Scott Gomez is -33 and Laraque is -68", I say "Scott Gomez is 1330/-38 and Laraque 200/-68". Corsi % is another way of saying that but it shaves away the ice time proxy that is inherent to Corsi's.
The Habs, as a team, were on ice for 5821 ES shots this year and are -396 (jeebus...). Putting those numbers at the end of the table already takes the reader some way toward Gabe's CorsiREL numbers.
Corsi will remain controversial unless the MSM start using it for what it means. Such is life.
I think the real reason corsi is controversial (beyond the fact that it's new) is that it doesn't immediately line up with results in the short term. The Flames, for example, were horrible in terms of outshooting to start the year, but the percentages had them winning anyways. People see bad corsi, but a good record and immediately think 'pffft...useless'.
Fair enough, I guess.
Great summary that you've put together here, Kent.
I would have liked to put together a similar article myself but I'm not nearly as good at explaining things.