Back for more punishment.

Neostats: Quality Shots Project - INTRO

I wanted to write an intro to the Quality Shots data I have been tracking for the 2015/16 Chicago Blackhawks season. I started this project because I was disappointed by the lack of quality analysis from those covering the Blackhawks. I didn’t feel we were getting accurate information on the actual performance of players. And in the blog world, people start memes about certain players that take on a life of their own, regardless of whether there are any facts to back them up. So I wanted to get a little more actual data involved in the discussion.
What has always interested me is the creativity aspect of the game. I know there are a lot of people who like the Big Boy hockey aspects, but I have always been more interested in why routine plays suddenly turn into actual scoring opportunities. And that goes for both offensive and defensive situations, not just the offensive side of the rink.

3 Areas Tracked
One of the oldest axioms in hockey is the coach saying, “try to shoot from the middle and try to keep the other guys shooting from the outside.” Coaches say this because a shot is more likely to go in when it originates from the center of the ice. So what I am looking for is where the events that generate Quality Shots in the good scoring areas come from, and in particular who creates those opportunities. On the other side, I am also interested in what breakdowns happen defensively to create those same chances for the other team.
Chris Boyle has started a project tracking the quality of shots for the NHL.  His work generated a heat chart of shots from all areas of the ice and the likelihood that those shots would score a goal.  Below is his map:
War-on-ice has produced similar information concerning shooting percentages from various locations on the ice. From work like this we can see that there are really three distinct areas a shot can come from in terms of the chance that it will produce a goal. The first area is the box in front of the net. The second is the rest of the so-called “home plate” area. And the third is everything else. I have a diagram showing those three areas.
I have designated those areas as:
  • Primary [note: I have increased the slot to the second hash for the 2015/16 season.]
  • Secondary
  • Outside
Data & Methodology
  • NHL Game Play By Play
  • Excel Spreadsheet
  • Tape of the Game
So what I am doing is putting the NHL-generated Play by Play data for each game into a spreadsheet. Then I review tape of the game to analyze the play at the time of each shot attempt, recording the location of the shots classified as under 40 ft. to see whether they are in the home plate area. Finally, I analyze the game tape to see who was primarily responsible for those plays.
What I am looking for is the breakdown of the defense. What causes the defense to go from having the numbers to defend the play to a situation where the offensive team gets an advantage and generates a quality shot? This could be a pass into the home plate area or an earlier one that creates an odd man rush. It could be a drive into the middle of the ice, a player in front of the net tipping an outside puck on net, or anyone winning a rebound for another shot into the home plate area. And a relatively common play is for a Dman to join the rush, creating an odd man advantage by beating the opposing team’s forwards back into their zone.

Expected Results
From Boyle’s work you can see that shots have a dramatically different chance of success depending on where they originate. I am using war-on-ice data to generate League Wide Expected Shooting and Save percentages from each of these three locations. That data is generated from the original NHL shot locations, which, as Boyle and others have also stated, are notoriously inaccurate. My locations are slightly different from War-on-ice’s, so I have had to massage this information to get a close approximation for my data, since mine only involves the Hawks and their opponents. I have worked on these numbers for over a year now and the above are the ones I am currently using.
The last rule-of-thumb data coming from war-on-ice is the expected ratio of shots from each zone I am tracking. I found that information to be extremely inaccurate. Play-by-Play data tends to place shots much closer to the net than where they actually originate. There are clear biases from stadium to stadium, and even a single stadium can vary greatly as different scorers are used over the course of the season. So right now I am using last season’s numbers as the norm, especially the opposing team’s numbers.

Small Print
  • 5-on-5 events
  • Tracking Shots and Goals
  • Both goalies need to be on the ice
  • Only investigating attempts classified as from 40 feet or less.
For those interested, I have included the fine print of what I am doing. I think this information should mostly be self-explanatory. Most people vi... And that is why I treat stick blocks as regular blocks (which the league should be doing anyway). I only look at situations where both goalies are on the ice, which removes empty net situations, since 6-on-5 situations build in a breakdown of the defense.
One last thing: I am immediately classifying all shots beyond 40 feet as coming from outside the home plate area. I know the NHL data is notoriously inaccurate, but I am really not willing to spend the extra time looking at many additional shots for very little increase in accuracy. I’m sure there are occasional shots in the home plate area that are mislabeled as being further out, but I don’t believe there are many of those. And eliminating shots that are recorded as closer but are really not in the home plate area is time consuming enough. To be honest, I am really not looking for shots on the edges. In fact, the bigger problem is having to eliminate shots from the points that are still marked as under 40 feet.
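The small-print rules above could be sketched as a simple triage step before any tape review. This is only an illustration under stated assumptions, not the actual spreadsheet logic: the `PbpEvent` record, its field names, the event type strings, and the `triage` function are all hypothetical, and it assumes each play-by-play row has already been reduced to strength state, goalie presence, event type, and the NHL-recorded distance.

```python
from dataclasses import dataclass

# Hypothetical, simplified play-by-play row (field names are assumptions).
@dataclass
class PbpEvent:
    event_type: str            # assumed labels, e.g. "SHOT", "GOAL", "BLOCK"
    skaters_for: int
    skaters_against: int
    both_goalies_on_ice: bool
    distance_ft: float         # distance as recorded by the NHL scorer

TRACKED_TYPES = {"SHOT", "GOAL"}   # only shots and goals are tracked
MAX_REVIEW_DISTANCE_FT = 40.0      # beyond this, no tape review


def is_tracked(event):
    """True for 5-on-5 shots and goals with both goalies on the ice."""
    return (
        event.event_type in TRACKED_TYPES
        and event.skaters_for == 5
        and event.skaters_against == 5
        and event.both_goalies_on_ice
    )


def triage(event):
    """Return 'skip', 'outside', or 'review' for one event."""
    if not is_tracked(event):
        return "skip"
    if event.distance_ft > MAX_REVIEW_DISTANCE_FT:
        # Classified as Outside immediately, without checking tape.
        return "outside"
    # 40 feet or less: the tape decides Primary, Secondary, or Outside.
    return "review"
```

Note that "review" does not mean the shot ends up in the home plate area. Point shots marked as under 40 feet still have to be thrown out by hand.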

So what to expect from this data
Once I have this data I can look at team and individual performances. I now have line and D-pair analysis as well. For the team, I have Shot +/- data for each area I am tracking, and that data is also tracked by period. So you can see this game’s data compared to this season’s and last season’s results. This data also means I can track the ratio of quality shots to overall shots, for and against. It also lets me compare expected shooting and save percentages based on shot location to the actual results. This is probably the best data available for determining the “impact of team defense” on the goalie’s save percentage and GAA numbers.
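As a rough sketch of the team-level numbers described here, the snippet below computes Shot +/- by period and area, plus the ratio of quality shots to overall shots. The tuple layout and function names are hypothetical, and treating Primary plus Secondary (the home plate area) as "quality" is an assumption made for illustration.

```python
from collections import Counter

# Assumption: "quality" means anything inside the home plate area.
QUALITY_AREAS = {"Primary", "Secondary"}


def shot_plus_minus(shots):
    """Shot differential keyed by (period, area).

    shots: list of (period, area, is_for) tuples, where is_for is True
    for the team being tracked and False for the opponent.
    """
    diff = Counter()
    for period, area, is_for in shots:
        diff[(period, area)] += 1 if is_for else -1
    return diff


def quality_ratio(shots, is_for):
    """Share of one side's shots that came from the quality areas."""
    areas = [area for _, area, side in shots if side == is_for]
    if not areas:
        return None  # no shots recorded for that side
    quality = sum(1 for area in areas if area in QUALITY_AREAS)
    return quality / len(areas)


# Made-up example data, not real game results.
example = [
    (1, "Primary", True),
    (1, "Primary", False),
    (1, "Outside", True),
    (2, "Secondary", True),
    (2, "Outside", False),
    (2, "Outside", False),
]
by_period_and_area = shot_plus_minus(example)
ratio_for = quality_ratio(example, True)
ratio_against = quality_ratio(example, False)
```

Keeping the period in the key is what allows the same tally to be rolled up per game, per season, or per line and D-pair by changing only the grouping.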
I can also look at shooting percentages for each area to show whether the differences in teams’ shooting percentages come from shot quality or something else. The same can be done for the goalies, since I can generate an expected score based on the shooting percentages and the number of shots for each area I am tracking.
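The expected score idea amounts to multiplying the shot count in each area by that area's expected shooting percentage and summing. A minimal sketch follows; the function names are hypothetical, and the percentages are made-up placeholders, not the league-wide numbers used in this project.

```python
# Placeholder values for illustration only (NOT the project's numbers).
PLACEHOLDER_SH_PCT = {"Primary": 0.20, "Secondary": 0.10, "Outside": 0.03}


def expected_goals(shot_counts, expected_sh_pct):
    """Sum of shots in each area times that area's expected shooting pct."""
    return sum(
        count * expected_sh_pct[area] for area, count in shot_counts.items()
    )


def goals_above_expected(actual_goals, shot_counts, expected_sh_pct):
    """Positive means more goals were scored than shot location predicts."""
    return actual_goals - expected_goals(shot_counts, expected_sh_pct)


def expected_save_pct(shot_counts, expected_sh_pct):
    """Save percentage a goalie would post if every area hit its average."""
    total = sum(shot_counts.values())
    if total == 0:
        return None  # no shots faced
    return 1 - expected_goals(shot_counts, expected_sh_pct) / total


# Made-up shot counts for one game.
counts = {"Primary": 5, "Secondary": 10, "Outside": 15}
xg = expected_goals(counts, PLACEHOLDER_SH_PCT)
delta = goals_above_expected(3, counts, PLACEHOLDER_SH_PCT)
xsv = expected_save_pct(counts, PLACEHOLDER_SH_PCT)
```

Comparing a goalie's actual save percentage against `expected_save_pct` for the shots he faced is one way to separate the shot quality allowed by the team from the goalie's own performance.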
As for individual players, I can now look at the expected performance while an individual player was on the ice and compare that to the number of positive and negative plays that player generates. I can then look at whether certain memes that exist for individuals are justified, or whether they are simply the result of observational biases and of how the coach uses that particular player.
Anyway, that is the introduction to my project. If you have any questions, please feel free to ask them in the comments section.