
Vectors and Vector Operations

Learn what vectors are, why they are the fundamental building blocks of machine learning, and how to manipulate them.

50 XP ~24 min Lesson 1 / 10

Why This Matters for AI

Every piece of data that goes into an AI model — every image pixel, every word in a sentence, every user preference — gets turned into a vector. When a model like GPT-3 reads your prompt, it converts each token into a vector of 12,288 numbers. When Spotify recommends a song, it compares vectors representing your taste to vectors representing songs. Vectors aren't just a math concept — they are literally the language that AI speaks. If you don't understand vectors, you can't understand AI. Let's fix that.

The Intuition (No Math Yet)

Think of a vector as a list of numbers that describes something. That's it. A list of numbers. If you describe a house with 3 numbers — [square feet, number of bedrooms, price] — that's a 3-dimensional vector. If you describe a movie with 100 attributes, that's a 100-dimensional vector.

The beautiful thing about vectors is that once you represent things as lists of numbers, you can do math on them. You can measure how similar two things are (dot product). You can add them together (vector addition). You can stretch or shrink them (scalar multiplication). An arrow on a piece of paper is just a visual way to think about a 2D vector. But real AI vectors live in hundreds or thousands of dimensions. Don't worry about visualizing those — just think of them as lists of numbers where each number means something.

Here's the key intuition: vectors that are "close" to each other (measured by operations we'll learn) represent things that are similar. The word "king" and the word "queen" have vectors that are close to each other in a language model. A photo of a cat and a photo of a dog have vectors that are closer to each other than either is to a photo of a car. This is the foundation of everything in AI.
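As a quick sketch of this idea, here is how "closeness" between 3-dimensional house vectors could be measured in NumPy. The house numbers below are made up for illustration:

```python
import numpy as np

# Hypothetical houses described as [square feet, bedrooms, price in $1000s]
house_a = np.array([1500, 3, 300])
house_b = np.array([1550, 3, 310])
house_c = np.array([4000, 6, 900])

# One way to measure "closeness": the length of the difference vector
print(np.linalg.norm(house_a - house_b))  # small number: similar houses
print(np.linalg.norm(house_a - house_c))  # large number: very different houses
```

Raw features like these would normally be scaled first (price dominates here), but the idea is the same in any number of dimensions.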

The Formal Math

What is a vector?

A vector is an ordered list of numbers. We usually write vectors in column notation: components stacked vertically inside brackets. A vector in n-dimensional space has n components.
\vec{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \in \mathbb{R}^n

Vector Addition

To add two vectors, add their corresponding components. Both vectors must have the same dimension. Geometrically, this is the "tip-to-tail" method — place the second vector at the tip of the first.
\vec{a} + \vec{b} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_n + b_n \end{bmatrix}

Scalar Multiplication

Multiplying a vector by a number (scalar) scales every component. If the scalar is 2, the vector doubles in length. If it is -1, the vector flips direction. If it is 0.5, the vector shrinks by half.
c \cdot \vec{v} = \begin{bmatrix} c \cdot v_1 \\ c \cdot v_2 \\ \vdots \\ c \cdot v_n \end{bmatrix}
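A minimal NumPy check of the three cases described above — each scalar multiplies every component:

```python
import numpy as np

v = np.array([3, 7, 1])

print(2 * v)     # every component doubles: the vector is twice as long
print(-1 * v)    # every component flips sign: the vector points the other way
print(0.5 * v)   # every component halves: the vector shrinks by half
```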

Dot Product

The dot product of two vectors gives a single number that measures how aligned they are. If the dot product is large and positive, the vectors point in similar directions. If it is zero, they are perpendicular (completely unrelated). If negative, they point in opposite directions. This is how AI measures similarity.
\vec{a} \cdot \vec{b} = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n

Vector Magnitude (Length)

The magnitude or norm of a vector is its length. It is calculated using the Pythagorean theorem generalized to n dimensions. This is also called the L2 norm or Euclidean norm.
\|\vec{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} = \sqrt{\sum_{i=1}^{n} v_i^2}
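The formula maps directly to NumPy. A quick sketch using the classic 3-4-5 right triangle:

```python
import numpy as np

v = np.array([3, 4])

# Magnitude from the formula: square root of the sum of squared components
mag_manual = np.sqrt(np.sum(v ** 2))

# The same thing with NumPy's built-in L2 (Euclidean) norm
mag_norm = np.linalg.norm(v)

print(mag_manual)  # 5.0
print(mag_norm)    # 5.0
```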

Cosine Similarity

Cosine similarity uses the dot product to measure the angle between vectors, normalized by their lengths. The result is between -1 (opposite) and 1 (identical direction). This is the single most important similarity measure in AI — used in search engines, recommendation systems, and language models.
\cos(\theta) = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\| \cdot \|\vec{b}\|}

Interactive Visualization

Interactive: Vector Operations

Drag the sliders to change the vectors and see how operations work visually.


Interactive: Dot Product

The dot product measures how much two vectors point in the same direction. Change the angle and see how it affects the value.

Projection (yellow dashed) shows how much of b aligns with a

Positive dot = same direction

Negative dot = opposite direction

Zero dot = perpendicular (90°)
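The three sign cases above can be verified with a small NumPy sketch (the example vectors here are chosen purely for illustration):

```python
import numpy as np

a = np.array([1, 0])           # points right along the x-axis

same     = np.array([2, 1])    # roughly the same direction as a
perp     = np.array([0, 3])    # straight up: perpendicular to a
opposite = np.array([-4, 0])   # points the opposite way

print(a @ same)      # positive: similar direction
print(a @ perp)      # zero: perpendicular (90°)
print(a @ opposite)  # negative: opposite direction
```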

Math → Code Bridge


See the math and its Python equivalent side by side. Same concept, two languages.

Creating Vectors

In NumPy, vectors are 1D arrays. Each element is a component of the vector.

Math
\vec{v} = \begin{bmatrix} 3 \\ 7 \\ 1 \end{bmatrix}
Python / NumPy
import numpy as np

# A vector is just a NumPy array
v = np.array([3, 7, 1])
print(v)        # [3 7 1]
print(v.shape)  # (3,)  — a 3-dimensional vector

Vector Addition

NumPy adds vectors element-by-element, exactly like the math formula.

Math
\vec{a} + \vec{b} = \begin{bmatrix} 1+4 \\ 2+5 \\ 3+6 \end{bmatrix} = \begin{bmatrix} 5 \\ 7 \\ 9 \end{bmatrix}
Python / NumPy
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Addition is element-wise
result = a + b
print(result)  # [5 7 9]

Dot Product

The @ operator is the modern way to compute dot products in Python. It maps directly to the mathematical dot product.

Math
\vec{a} \cdot \vec{b} = (1)(4) + (2)(5) + (3)(6) = 32
Python / NumPy
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Two ways to compute dot product
dot1 = np.dot(a, b)    # 32
dot2 = a @ b            # 32 (@ operator)
dot3 = sum(a * b)       # 32 (manual)

print(dot1)  # 32

Cosine Similarity

np.linalg.norm computes the vector magnitude. This pattern is used everywhere in AI for comparing embeddings.

Math
\cos(\theta) = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\| \|\vec{b}\|} = \frac{32}{\sqrt{14} \cdot \sqrt{77}} \approx 0.974
Python / NumPy
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Cosine similarity
cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"{cos_sim:.3f}")  # 0.974

# These vectors are very similar in direction!
# (cos_sim close to 1 = similar)

Practice

Practice Problems

Apply what you learned to real AI/ML scenarios.

Problem 1

In a movie recommendation system, a user is represented as [0.8, 0.2, 0.9] (loves action, dislikes romance, loves sci-fi) and a movie is represented as [0.7, 0.1, 0.95] (action, romance, sci-fi scores).

Compute the dot product of the user preference vector and the movie feature vector. What does the result tell you about whether this user would like this movie?

Problem 2

In a word embedding space, the word "neural" is represented as [0.5, 0.8, 0.3, 0.9] and the word "network" is [0.4, 0.7, 0.2, 0.85].

Compute the cosine similarity between the two word embedding vectors. Are these words semantically similar?

Problem 3

You have a data point represented as the vector [3, -1, 4, 2]. You want to normalize and then scale the data.

If you scale this feature vector by 2, what is the new vector? Does the direction change? Does the magnitude change?
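Try the problems on paper first. If you then want to check your answers, here is one possible NumPy sketch of the three computations:

```python
import numpy as np

# Problem 1: user-movie recommendation score via dot product
user  = np.array([0.8, 0.2, 0.9])
movie = np.array([0.7, 0.1, 0.95])
print(user @ movie)  # ≈ 1.435 — a high score, suggesting a good match

# Problem 2: cosine similarity of two word embeddings
neural  = np.array([0.5, 0.8, 0.3, 0.9])
network = np.array([0.4, 0.7, 0.2, 0.85])
cos_sim = neural @ network / (np.linalg.norm(neural) * np.linalg.norm(network))
print(round(cos_sim, 3))  # ≈ 0.997 — very similar directions, semantically related

# Problem 3: scaling a feature vector by 2
x = np.array([3, -1, 4, 2])
print(2 * x)                   # every component doubles
print(np.linalg.norm(x))       # original magnitude
print(np.linalg.norm(2 * x))   # magnitude doubles; direction is unchanged
```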

Summary

Summary Card

Key Formulas

  • Vector Addition: \vec{a} + \vec{b} = [a_1+b_1,\, a_2+b_2,\, \ldots]
  • Dot Product: \vec{a} \cdot \vec{b} = \sum a_i b_i
  • Magnitude: \|\vec{v}\| = \sqrt{\sum v_i^2}
  • Cosine Similarity: \cos\theta = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|\|\vec{b}\|}

Key Intuitions

  • A vector is a list of numbers that describes something — every data point in AI is a vector.
  • The dot product measures similarity: positive = similar direction, zero = unrelated, negative = opposite.
  • Cosine similarity normalizes the dot product so vector length does not matter — only direction counts.
  • In AI, "close" vectors = similar things (words, images, users, etc.).

AI/ML Connections

  • Word embeddings (Word2Vec, GPT): every word is a vector; similar words have similar vectors.
  • Recommendation systems: users and items are vectors; dot product predicts ratings.
  • Image recognition: images become feature vectors; cosine similarity finds similar images.
  • Transformers use dot products in attention mechanisms to decide which words to focus on.
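The attention connection above can be sketched with toy numbers. Everything here — the 4-dimensional size, the query and key values — is made up for illustration, not real model weights:

```python
import numpy as np

# One "query" vector (the word doing the looking) and three "key" vectors
# (candidate words it might attend to); all values are invented for this sketch
query = np.array([0.5, 0.8, 0.3, 0.9])
keys  = np.array([[0.4, 0.7, 0.2, 0.85],
                  [0.9, -0.2, 0.1, -0.5],
                  [0.0, 0.0, 1.0, 0.0]])

scores  = keys @ query                            # one dot product per candidate
weights = np.exp(scores) / np.exp(scores).sum()   # softmax turns scores into weights

print(weights)  # larger weight = more attention paid to that word
```

Real transformers add learned projections and a scaling factor, but the core similarity measure is exactly the dot product from this lesson.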
