Fulltext search
Example 65: Vector spaces + subspace projection (embeddings → zero-mean)
Example 66: Span in ML: Xw and attention as weighted sums
Example 67: Basis + coordinates: one-hot lookup and PCA change-of-basis
Example 68: Linear maps: forward/backward as matrix products
Example 69: Similarity geometry: dot vs cosine in embedding retrieval
Example 6: Least squares residual is orthogonal to column space
Example 70: Orthogonality in regression: residual ⊥ column space
Example 71: Rank/null space: many parameters, same predictor
Example 72: Eigenvectors in ML: power iteration for dominant PCA direction
Example 73: PSD in ML: covariance and kernel Gram matrices
Example 74: SVD and conditioning: why normal equations are risky
Example 75: PCA bookkeeping: EVR and reconstruction error
Example 76: Least squares: lstsq vs normal equations
Example 77: Solving systems: Cholesky factor reuse
Example 78: Conditioning: small perturbations → big solution changes
Example 79: Sparse matrices: adjacency COO + O(nnz) matvec
Example 7: Null space explains non-identifiability (overparameterized linear model)
Example 80: Transformers: attention as QK^T then AV
Example 81: Vector spaces + subspace projection (embeddings → zero-mean)
Example 82: Span in ML: Xw and attention as weighted sums