UC Irvine, Math 10, Spring 2022
Introduction to Programming for Data Science
Use the Navigation menu on the left to find the course content. More material will be posted throughout the course.
Course-level Learning Outcomes
The goal of this course is to introduce programming in Python, with an emphasis on some of the tools that are most relevant to data science. The primary learning outcomes for Math 10 are that students will be able to:
select appropriate data types (both built-in Python types as well as types defined in external libraries) when performing computations;
write code which is Pythonic (for example, avoiding unnecessary for loops) and adheres to the DRY (Don’t Repeat Yourself) principle;
given an unfamiliar dataset, apply techniques of Exploratory Data Analysis (EDA) to gain a rapid first impression of the dataset’s contents;
manipulate structured data using NumPy and pandas;
produce interactive visualizations conveying significant aspects of datasets using Altair;
select a suitable machine learning algorithm for a given task and implement it using scikit-learn;
assess the performance of a machine learning algorithm using a loss function;
recognize the potential for overfitting when evaluating a machine learning algorithm, and detect overfitting using a test set;
write a data-focused Deepnote notebook using a combination of code cells and explanatory markdown cells.
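As a rough illustration of the "Pythonic" and DRY outcome listed above, the following minimal sketch contrasts an index-based loop with a built-in that does the looping for you. The scores and the `curve` helper are hypothetical and chosen only for illustration; they are not taken from the course materials.

```python
from statistics import mean

# Hypothetical data: exam scores for a small class (illustrative values only).
scores = [72, 88, 95, 61, 79]

# Loop-heavy version: a manual accumulator and index-based iteration.
total = 0
for i in range(len(scores)):
    total += scores[i]
loop_average = total / len(scores)

# More Pythonic version: let a standard-library function do the looping.
pythonic_average = mean(scores)


def curve(score, points=5, cap=100):
    """Add `points` to a score without exceeding `cap` (hypothetical helper)."""
    return min(score + points, cap)


# DRY: the curving rule is written once and reused for every score.
curved_scores = [curve(s) for s in scores]
```

Both averages agree; the second version is shorter and leaves less room for indexing mistakes. Writing the curving rule as a single function means a later change to the rule happens in one place.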
Earlier versions of these notes