ex1

Example 65: Vector spaces + subspace projection (embeddings → zero-mean)

Example 66: Span in ML: Xw and attention as weighted sums

Example 67: Basis + coordinates: one-hot lookup and PCA change-of-basis

Example 68: Linear maps: forward/backward as matrix products

Example 69: Similarity geometry: dot vs cosine in embedding retrieval

Example 6: Least squares residual is orthogonal to column space

Example 70: Orthogonality in regression: residual ⟂ column space

Example 71: Rank/null space: many parameters, same predictor

Example 72: Eigenvectors in ML: power iteration for dominant PCA direction

Example 73: PSD in ML: covariance and kernel Gram matrices

Example 74: SVD and conditioning: why normal equations are risky

Example 75: PCA bookkeeping: EVR and reconstruction error

Example 76: Least squares: lstsq vs normal equations

Example 77: Solving systems: Cholesky factor reuse

Example 78: Conditioning: small perturbations → big solution changes

Example 79: Sparse matrices: adjacency COO + O(nnz) matvec

Example 7: Null space explains non-identifiability (overparameterized linear model)

Example 80: Transformers: attention as QK^T then AV

Example 81: Vector spaces + subspace projection (embeddings → zero-mean)

Example 82: Span in ML: Xw and attention as weighted sums
