This repository contains Jupyter notebook worksheets to accompany Data Science: A First Introduction (Python Edition) by Trevor Campbell, Joel Ostblom, and Lindsey Heagy. The book was adapted from the original R textbook, Data Science: A First Introduction, by Tiffany Timbers, Trevor Campbell, and Melissa Lee. To use these worksheets, you can either:

  1. Click on a “launch binder” button to open an interactive, but non-persistent, version of the notebook.

  2. Download this repository by clicking here, then follow our computer setup instructions here. Following the setup instructions ensures that your software environment is compatible with the worksheets.
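
If you take the download route, a quick check like the one below can confirm that key packages are visible to your Python environment before you open a worksheet. This is only a minimal sketch: the package list is a hypothetical placeholder (the source names only pandas and Jupyter), and `installed_versions` is an illustrative helper, not part of the worksheets or the official setup instructions. The setup instructions remain the authoritative list of requirements.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical list: replace with the packages named in the setup instructions.
PACKAGES = ["pandas", "notebook"]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed suggests that the setup instructions have not been fully applied in the environment running the check.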

Whichever method you choose, we recommend reading our Combining code and text with Jupyter chapter before you start.

| Book chapter | View worksheet on GitHub | Launch worksheet on myBinder.org |
| --- | --- | --- |
| Python and pandas | view worksheet | Binder |
| Reading in data locally and from the web | view worksheet | Binder |
| Cleaning and wrangling data | view worksheet | Binder |
| Effective data visualization | view worksheet | Binder |
| Classification I: training & predicting | view worksheet | Binder |
| Classification II: evaluation & tuning | view worksheet | Binder |
| Regression I: K-nearest neighbors | view worksheet | Binder |
| Regression II: linear regression | view worksheet | Binder |
| Clustering | view worksheet | Binder |
| Statistical inference (sampling) | view worksheet | Binder |
| Statistical inference (bootstrapping) | view worksheet | Binder |
| Collaboration with version control | view worksheet | Binder |


Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)


We would like to thank the BinderHub Federation for their kind and generous support of mybinder.org. The interactive versions of these notebooks would not be possible without their efforts.


Jupyter et al., “Binder 2.0 - Reproducible, Interactive, Sharable Environments for Science at Scale.” Proceedings of the 17th Python in Science Conference. 2018. doi:10.25080/Majora-4af1f417-011