Data Science: A First Introduction (Python Edition) Worksheets

This repository contains Jupyter notebook worksheets that accompany Data Science: A First Introduction (Python Edition), by Trevor Campbell, Joel Ostblom, and Lindsey Heagy. The book was adapted from the original R textbook Data Science: A First Introduction, by Tiffany Timbers, Trevor Campbell, and Melissa Lee. To use these worksheets, you can either:

Click on a “launch binder” button to open an interactive, but non-persistent, version of the notebook.
Download this repository by clicking here and follow our computer setup instructions here. Following the setup instructions ensures that your software environment is compatible with the worksheets.
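
If you work from a local download, a quick check that the main packages can be found may catch setup problems early. The snippet below is only a sketch: the package list (`pandas`, `altair`, `sklearn`) and the helper `check_packages` are assumptions made for illustration, not part of the official setup, and the setup instructions remain the authoritative source for what the worksheets need.

```python
from importlib.util import find_spec

# Assumed package list, for illustration only; the computer setup
# instructions define what the worksheets actually require.
ASSUMED_PACKAGES = ["pandas", "altair", "sklearn"]


def check_packages(names):
    """Map each top-level module name to True if Python can locate it."""
    # find_spec returns None when a top-level module is not installed.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(ASSUMED_PACKAGES).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A package reported as missing suggests the environment was not set up as described in the setup instructions. This check does not verify package versions.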

Whichever method you choose to access the worksheets, we also recommend reading our Combining code and text with Jupyter chapter before starting out.

Book chapter	View worksheet on GitHub	Launch worksheet on myBinder.org
Python and pandas	view worksheet
Reading in data locally and from the web	view worksheet
Cleaning and wrangling data	view worksheet
Effective data visualization	view worksheet
Classification I: training & predicting	view worksheet
Classification II: evaluation & tuning	view worksheet
Regression I: K-nearest neighbors	view worksheet
Regression II: linear regression	view worksheet
Clustering	view worksheet
Statistical inference (sampling)	view worksheet
Statistical inference (bootstrapping)	view worksheet
Collaboration with version control	view worksheet

Licence

Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Acknowledgments

We would like to thank the BinderHub Federation for their kind and generous support of mybinder.org. The interactive versions of these notebooks would not be possible without their efforts.

References

Jupyter et al., “Binder 2.0 - Reproducible, Interactive, Sharable Environments for Science at Scale.” Proceedings of the 17th Python in Science Conference. 2018. doi:10.25080/Majora-4af1f417-011