Overview of Machine Learning Concepts

Broadly speaking, Machine Learning (ML) can be categorized into two main areas: supervised and unsupervised learning. This chapter provides a brief introduction to these concepts and introduces the notation used throughout the book.

In supervised learning, we have labeled data, meaning each data point comes with an associated “label” or “target” that we aim to predict. The goal is to learn a mathematical model that accurately predicts these labels for new, unseen data. Supervised learning tasks include:

Classification: Predict discrete categories or classes, such as distinguishing benign from malignant tumors.
Regression: Predict continuous numeric outcomes, such as temperatures or house prices.
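To make the two supervised tasks concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the toy data, the variable names, and the choice of methods (ordinary least squares for regression, a nearest-centroid rule for classification). It is not the notation or the method the book develops later, only one simple instance of each task.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Regression: predict a continuous target with ordinary least squares ---
# Hypothetical toy data: y is roughly 3*x + 2 plus a little noise.
x = np.linspace(0.0, 1.0, 50)
y = 3.0 * x + 2.0 + 0.05 * rng.standard_normal(50)

# Design matrix with a column of ones so the model can learn an intercept.
A = np.column_stack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

# --- Classification: predict a discrete label with a nearest-centroid rule ---
# Two hypothetical classes drawn around different centers in the plane.
X0 = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(40, 2))
X1 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(40, 2))
X_train = np.vstack([X0, X1])
y_train = np.array([0] * 40 + [1] * 40)

# "Training" here is just computing one mean vector per class.
centroids = np.stack([X_train[y_train == k].mean(axis=0) for k in (0, 1)])


def predict_class(points):
    """Assign each point the label of its closest class centroid."""
    points = np.atleast_2d(points)
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


# New, unseen points: one near each class center.
new_points = np.array([[-1.5, -2.5], [2.2, 1.8]])
predicted_labels = predict_class(new_points)

print("fitted slope and intercept:", slope, intercept)
print("predicted labels:", predicted_labels)
```

In both halves the pattern is the same: fit parameters on labeled examples, then apply the fitted model to inputs it has not seen. The only difference is whether the predicted quantity is a real number or a class index.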

Both classification and regression rely heavily on linear algebra, optimization, and probabilistic methods, all of which are explored in subsequent chapters.

In unsupervised learning, our dataset lacks explicit labels. Instead, the objective is to identify underlying patterns or structures directly from the data. Key areas include:

Clustering: Group similar data points together without labels, using notions of distance drawn from metric spaces.
Representation Learning (Dimensionality Reduction): Reduce the complexity of high-dimensional data while preserving important structural information, using linear algebra techniques such as Principal Component Analysis (PCA) and Singular Value Decomposition (SVD).
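The two unsupervised tasks above can be sketched in a few lines of NumPy. The data, the function name `kmeans`, and the specific algorithms are assumptions chosen for illustration: a basic k-means loop stands in for clustering, and PCA computed through the SVD of the centered data stands in for dimensionality reduction. Note that no labels appear anywhere in the code; the structure is recovered from the points alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unlabeled data: two well-separated blobs in 3 dimensions.
blob_a = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.3, size=(50, 3))
blob_b = rng.normal(loc=[5.0, 5.0, 0.0], scale=0.3, size=(50, 3))
X = np.vstack([blob_a, blob_b])


def kmeans(X, init_centers, n_iter=20):
    """Basic k-means: alternate nearest-center assignment and mean updates."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        # Euclidean distance from every point to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members) > 0:  # leave a center unchanged if it has no points
                centers[k] = members.mean(axis=0)
    return labels, centers


# Clustering: start from two data points (the first and the last row).
labels, centers = kmeans(X, init_centers=X[[0, -1]])

# Dimensionality reduction: PCA via the SVD of the centered data matrix.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_2d = X_centered @ Vt[:2].T         # project onto the top two directions
explained = S**2 / np.sum(S**2)      # fraction of variance per component

print("cluster sizes:", np.bincount(labels))
print("variance explained by each component:", explained)
```

The clustering step depends only on a distance function, which is where metric spaces enter, while the PCA step depends only on a matrix factorization, which is where linear algebra enters.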

By the end of this chapter, you will have a clear overview of these fundamental machine learning tasks and be prepared for the mathematical foundations developed throughout the rest of the book.