Data Mining with Python

  • Why we need this book

    Data is everywhere, and it is growing at an unprecedented rate, but making sense of all that data is a challenge. Data Mining is the process of discovering patterns and knowledge from large data sets. This book takes a hands-on approach to learning Data Mining. It is designed to give you an understanding of Data Mining concepts in an applicable way, and its tutorials will help you gain the practical skills to implement Data Mining techniques in your work. Whether you are a student, a data scientist, or a business analyst, this book is a must-read for you.

    How to use this book

    This book serves as a complement to a theoretical Data Mining course. We keep the introductions brief and simple, and concentrate on detailed tutorials. The book is divided into two parts:

    • Part 1 covers the preparation of data, or Data Wrangling. It spans Chapters 1 to 5.
    • Part 2 covers the analysis of data, or Data Analysis. It spans Chapters 6 to 10.

    Please find the associated tutorials in each folder. Before you run an .ipynb file that loads data, make sure its data path is updated for your local or cloud environment. Ideally, you will not only run each tutorial but also change its parameters and observe the difference; that is the best way to learn. You can download NotebookTutorials.zip and run the .ipynb files on your device or via cloud services.
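
One way to keep the data path easy to update is to define it once at the top of a notebook and check it before loading anything. The following is only a minimal sketch under assumptions: the folder name `data`, the file name `example.csv`, and the helper `resolve_data_path` are hypothetical and do not come from the book's tutorials.

```python
from pathlib import Path

# Hypothetical folder and file names (assumptions for illustration only);
# replace them with the ones used by the tutorial you are running.
DATA_DIR = Path("data")
DATA_FILE = "example.csv"


def resolve_data_path(data_dir: Path, file_name: str) -> Path:
    """Return the full path to a data file, failing early if it is missing."""
    path = Path(data_dir).expanduser() / file_name
    if not path.is_file():
        raise FileNotFoundError(
            f"Data file not found: {path}. Update DATA_DIR for your environment."
        )
    return path
```

In a notebook cell you might then call `resolve_data_path(DATA_DIR, DATA_FILE)` and pass the result to whatever loading function the tutorial uses, so a wrong path is reported immediately instead of partway through the analysis.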


    The creation of this work received tremendous support from my graduate assistants, Ajay Sadananda and Bhawneet Singh, who shared many insightful ideas and developed many interesting and useful tutorials in support of my in-person Data Mining class at CU Boulder. Many of their ideas and tutorials are adopted in this book to benefit our readers. The creation of this work was also greatly influenced by the comments and suggestions of my students and colleagues at CU Boulder. I could not have accomplished this work without their input.

    The creation of this work was supported by Open CU Boulder 2022-2023, a grant funded by the Colorado Department of Higher Education with additional support from the CU Office of the President, CU Office of Academic Affairs, CU Boulder Office of the Provost, and CU Boulder University Libraries.


Date Issued
  • 2023
Last Modified
  • 2023-03-15