Welcome to scrapenhl2’s documentation!

Introduction

scrapenhl2 is a Python package for scraping and manipulating NHL data pulled from the NHL website.

Installation

You need Python 3 and the Python scientific stack (numpy, matplotlib, pandas, and so on). The easiest way to get these is to install Anaconda. To be safe, make sure you have Python 3.5+, matplotlib 2.0+, and pandas 0.20+.
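
If you are not sure what you already have installed, a short check along these lines can help. This is only a sketch: the helper names (`version_tuple`, `check_environment`) are made up for illustration and are not part of scrapenhl2, and the version parsing is deliberately simple.

```python
import importlib
import sys

# Minimum versions quoted in the installation notes.
MINIMUMS = {"matplotlib": (2, 0), "pandas": (0, 20)}


def version_tuple(version_string):
    """Turn '0.20.3' into (0, 20, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_environment():
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if sys.version_info < (3, 5):
        problems.append("Python 3.5+ is required")
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append(name + " is not installed")
            continue
        installed = version_tuple(getattr(module, "__version__", "0"))
        if installed < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems.append("{} {}+ is required".format(name, wanted))
    return problems


print(check_environment() or "Environment looks fine")
```

If this prints a list of problems, updating through Anaconda is probably the simplest fix.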

Next, if you are on Windows, you need to get python-Levenshtein. You can find it here. Download the appropriate .whl file: match the “cp” number in the filename to your version of Python (for example, “cp36” for Python 3.6), and use the “amd64” file if your Python installation is 64-bit (this applies to Intel as well as AMD processors). Then navigate to your downloads folder in command line. For example:

cd
cd muneebalam
cd Downloads
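
If you are unsure which .whl file matches your interpreter, the following sketch prints the two pieces of the filename to look for. The function name `wheel_tags` is hypothetical, and the platform tags shown are the ones Windows wheels generally use; the script reports them even if you run it on another operating system.

```python
import struct
import sys


def wheel_tags():
    """Return the (python_tag, platform_tag) pair to look for in a Windows .whl name."""
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # A pointer is 8 bytes on a 64-bit interpreter and 4 bytes on a 32-bit one.
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


print(wheel_tags())
```

Run it with the same Python you plan to use for scrapenhl2, since different installations on one machine can give different answers.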

Next, install the whl file using pip:

pip install [insert filename here].whl

(Sometimes, this errors out and says you need Visual Studio C++ tools. You can download and install the 2015 version from here.)

Now, all users can open up terminal or command line and enter:

pip install scrapenhl2

(If you have multiple versions of python installed, you may need to alter that command slightly.)
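
One common way to make sure the package lands in the right installation is to run pip as a module of the specific Python you intend to use. The sketch below only builds and prints that command; `pip_install_command` is a made-up helper name, and nothing is installed when you run it.

```python
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script."""
    # sys.executable is the full path of the current Python, so "-m pip"
    # installs into this installation rather than whichever pip is first on PATH.
    return [sys.executable, "-m", "pip", "install", package]


print(" ".join(pip_install_command("scrapenhl2")))
```

Copy the printed line into terminal or command line to run the install.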

For now, installation should be pretty quick, but in the future it may take a while (depending on how many past years’ files I make part of the package).

As far as coding environments go, I recommend Jupyter Notebook or PyCharm Community. Some folks also like the PyDev plugin in Eclipse. The latter two are full-scale applications, while the former launches in your browser. Open up terminal or command line and run:

jupyter notebook

Then navigate to your coding folder, start a new Python notebook, and you’re good to go.

Use

Note that because this package is in pre-alpha/alpha, it may be buggy, and its syntax and usage are subject to change.

When you start up, if you have an internet connection and some games have gone final since you last used the package, open up your Python environment and update:

from scrapenhl2.scrape import autoupdate
autoupdate.autoupdate()

Autoupdate should report on its progress regularly; be patient.

To get a game head-to-head (H2H) chart, use:

from scrapenhl2.plot import game_h2h
season = 2016
game = 30136
game_h2h.game_h2h(season, game)
_images/WSH-TOR_G6.png
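
The season and game numbers can look cryptic. As a rough guide, the sketch below decodes them under two assumptions that come from how the NHL numbers its games rather than from anything stated in this documentation: the season is given by its starting year, and the first digit of the five-digit game number marks the game type (2 for regular season, 3 for playoffs, with playoff numbers ending in round, series, and game digits). `describe_game` is a hypothetical helper, not a scrapenhl2 function.

```python
def describe_game(season, game):
    """Decode a (season, game) pair under the assumed NHL numbering scheme."""
    # Assumption: season 2016 means the 2016-17 season.
    info = {"season": "{}-{}".format(season, str(season + 1)[-2:])}
    game_type = game // 10000
    if game_type == 2:
        info.update({"type": "regular season", "number": game % 10000})
    elif game_type == 3:
        # Assumption: the last three digits are round, series, and game in series.
        info.update({
            "type": "playoffs",
            "round": game // 100 % 10,
            "series": game // 10 % 10,
            "game": game % 10,
        })
    else:
        info.update({"type": "other", "number": game % 10000})
    return info


print(describe_game(2016, 30136))
```

Under these assumptions, 30136 is game 6 of a first-round playoff series, which matches the WSH-TOR_G6 image name above.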

To get a game timeline, use:

from scrapenhl2.plot import game_timeline
season = 2016
game = 30136
game_timeline.game_timeline(season, game)
_images/WSH-TOR_G6_timeline.png

To get a player rolling CF% graph, use:

from scrapenhl2.plot import rolling_cf_gf
player = 'Ovechkin'
rolling_games = 25
start_year = 2015
end_year = 2017
rolling_cf_gf.rolling_player_cf(player, rolling_games, start_year, end_year)
_images/Ovechkin_rolling_cf.png
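
For anyone unfamiliar with the metric, CF% is Corsi for divided by total Corsi (for plus against), and a rolling version recomputes it over the most recent N games. The sketch below shows one way to do that in plain Python. It is an illustration with made-up numbers, not the package's implementation: `rolling_cf_pct` is a hypothetical name, and summing totals over the window (rather than averaging per-game percentages) is an assumption.

```python
from collections import deque


def rolling_cf_pct(cf, ca, window):
    """Rolling CF% over the last `window` games; None until the window is full."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` games
    result = []
    for shots_for, shots_against in zip(cf, ca):
        recent.append((shots_for, shots_against))
        if len(recent) < window:
            result.append(None)
            continue
        total_for = sum(f for f, _ in recent)
        total_all = total_for + sum(a for _, a in recent)
        # Guard against a window with no shot attempts at all.
        result.append(100.0 * total_for / total_all if total_all else None)
    return result


# Made-up per-game Corsi for and against, with a 2-game window.
print(rolling_cf_pct([10, 20, 30], [10, 20, 10], 2))  # [None, 50.0, 62.5]
```

A larger window, such as the 25 games used above, gives a smoother line at the cost of reacting more slowly to recent play.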

This package is targeted at script use, so I recommend familiarizing yourself with Python. (It is not intended to be a replacement for a site like Corsica.)

Look through the documentation at Read the Docs and the examples on GitHub. Also, always feel free to contact me with questions or suggestions.

Contact

Twitter.

Collaboration

I’m happy to partner with you in development efforts; just shoot me a message or submit a pull request. Please also let me know if you’d like to alpha- or beta-test my code.

Donations

If you would like to support my work, please donate money to a charity of your choice. Many large charities do great work all around the world (e.g. Médecins Sans Frontières), but don’t forget that your support is often more critical for local/small charities. Also consider that small regular donations are sometimes better than one large donation.

You can vet a charity you’re targeting using a charity rating website.

If you do make a donation, make me happy and leave a record here. (It’s anonymous.)

Change log

1/13/18: Various bug fixes, some charts added.

11/10/17: Switched from Flask to Dash, bug fixes.

11/5/17: Bug fixes and method to add on-ice players to file. More refactoring.

10/28/17: Major refactoring. Docs up and running.

10/21/17: Added basic front end. Committed early versions of 2017 logs.

10/16/17: Added initial versions of game timelines, player rolling Corsi, and game H2H graphs.

10/10/17: Bug fixes on scraping and team logs. Started methods to aggregate 5v5 game-by-game data for players.

10/7/17: Committed code to scrape 2010 onward and create team logs; still bugs to fix.

9/24/17: Committed minimal structure.

Major outstanding to-dos

  • Bring in old play-by-play and shifts from HTML
  • More examples
  • More graphs
  • More graphs in Dash app
