ex1.ai

Example 65: Vector spaces + subspace projection (embeddings → zero-mean)

Example 66: Span in ML: Xw and attention as weighted sums

Example 67: Basis + coordinates: one-hot lookup and PCA change-of-basis

Example 68: Linear maps: forward/backward as matrix products

Example 69: Similarity geometry: dot vs cosine in embedding retrieval

Example 6: Least squares residual is orthogonal to column space

Example 70: Orthogonality in regression: residual ⊥ column space

Example 71: Rank/null space: many parameters, same predictor

Example 72: Eigenvectors in ML: power iteration for dominant PCA direction

Example 73: PSD in ML: covariance and kernel Gram matrices

Example 74: SVD and conditioning: why normal equations are risky

Example 75: PCA bookkeeping: EVR and reconstruction error

Example 76: Least squares: lstsq vs normal equations

Example 77: Solving systems: Cholesky factor reuse

Example 78: Conditioning: small perturbations → big solution changes

Example 79: Sparse matrices: adjacency COO + O(nnz) matvec

Example 7: Null space explains non-identifiability (overparameterized linear model)

Example 80: Transformers: attention as QK^T then AV

Example 81: Vector spaces + subspace projection (embeddings → zero-mean)

Example 82: Span in ML: Xw and attention as weighted sums