ex1.ai

Example 47: Sparse matrices: adjacency COO + O(nnz) matvec

Example 48: Transformers: attention as QK^T then AV

Example 49: Vector spaces + subspace projection (embeddings → zero-mean)

Example 4: Linear layers and backprop are linear maps + adjoints

Example 50: Span in ML: Xw and attention as weighted sums

Example 51: Basis + coordinates: one-hot lookup and PCA change-of-basis

Example 52: Linear maps: forward/backward as matrix products

Example 53: Similarity geometry: dot vs cosine in embedding retrieval

Example 54: Orthogonality in regression: residual ⊥ column space

Example 55: Rank/null space: many parameters, same predictor

Example 56: Eigenvectors in ML: power iteration for dominant PCA direction

Example 57: PSD in ML: covariance and kernel Gram matrices

Example 58: SVD and conditioning: why normal equations are risky

Example 59: PCA bookkeeping: EVR and reconstruction error

Example 5: Dot products, norms, and cosine similarity (retrieval)

Example 60: Least squares: lstsq vs normal equations

Example 61: Solving systems: Cholesky factor reuse

Example 62: Conditioning: small perturbations → big solution changes

Example 63: Sparse matrices: adjacency COO + O(nnz) matvec

Example 64: Transformers: attention as QK^T then AV