UC Irvine, Math 10, Spring 2023


Introduction to Programming for Data Science

About the class. The goal of this course is to introduce programming in Python, with an emphasis on some of the tools that are most relevant to data science. There are two primary parts of the course:

  • Part 1. Exploratory Data Analysis

  • Part 2. Introduction to Machine Learning

Learning Outcomes. The primary learning outcomes for Math 10 are that students will be able to:

  • select appropriate data types (both built-in Python types and types defined in external libraries) when performing computations;

  • write code that is Pythonic (for example, avoiding unnecessary for loops) and adheres to the DRY (Don’t Repeat Yourself) principle;

  • manipulate structured data using NumPy and pandas;

  • given unfamiliar data, use techniques specific to the pandas library to gain a deeper understanding of the data;

  • produce interactive visualizations conveying significant aspects of datasets using Altair;

  • select a suitable machine learning algorithm for a given task and implement it using scikit-learn;

  • assess the performance of a machine learning algorithm using a loss function;

  • recognize the potential for overfitting when evaluating a machine learning algorithm, and detect overfitting using a test set;

  • write a data-focused Deepnote notebook using a combination of code cells and explanatory markdown cells.
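
As a small illustration of the outcome about Pythonic code and the DRY principle listed above, the sketch below contrasts an explicit for loop with a vectorized NumPy expression. The temperature values and the helper name `to_fahrenheit` are made up for this example and are not taken from the course materials.

```python
import numpy as np

# Hypothetical data: temperatures in Celsius (invented for illustration).
celsius = np.array([12.5, 18.0, 21.3, 9.8])

# Loop-based version: correct, but verbose and slow on large arrays.
fahrenheit_loop = []
for c in celsius:
    fahrenheit_loop.append(c * 9 / 5 + 32)

# Vectorized version: NumPy applies the arithmetic to every element at once.
fahrenheit_vec = celsius * 9 / 5 + 32


def to_fahrenheit(values):
    """Convert Celsius values (a scalar, list, or array) to Fahrenheit."""
    # Defining the formula once and reusing it follows the DRY principle.
    return np.asarray(values) * 9 / 5 + 32
```

Both versions compute the same numbers; the vectorized form is shorter and usually faster, and wrapping the formula in a function avoids repeating it wherever a conversion is needed.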

Earlier versions of these notes