ex1

Example 47: Sparse matrices: adjacency COO + O(nnz) matvec

Example 48: Transformers: attention as QK^T then AV

Example 49: Vector spaces + subspace projection (embeddings → zero-mean)

Example 4: Linear layers and backprop are linear maps + adjoints

Example 50: Span in ML: Xw and attention as weighted sums

Example 51: Basis + coordinates: one-hot lookup and PCA change-of-basis

Example 52: Linear maps: forward/backward as matrix products

Example 53: Similarity geometry: dot vs cosine in embedding retrieval

Example 54: Orthogonality in regression: residual ⟂ column space

Example 55: Rank/null space: many parameters, same predictor

Example 56: Eigenvectors in ML: power iteration for dominant PCA direction

Example 57: PSD in ML: covariance and kernel Gram matrices

Example 58: SVD and conditioning: why normal equations are risky

Example 59: PCA bookkeeping: EVR and reconstruction error

Example 5: Dot products, norms, and cosine similarity (retrieval)

Example 60: Least squares: lstsq vs normal equations

Example 61: Solving systems: Cholesky factor reuse

Example 62: Conditioning: small perturbations → big solution changes

Example 63: Sparse matrices: adjacency COO + O(nnz) matvec

Example 64: Transformers: attention as QK^T then AV
