Example 47: Sparse matrices: adjacency COO + O(nnz) matvec
Example 48: Transformers: attention as QK^T then AV
Example 49: Vector spaces + subspace projection (embeddings → zero-mean)
Example 4: Linear layers and backprop are linear maps + adjoints
Example 50: Span in ML: Xw and attention as weighted sums
Example 51: Basis + coordinates: one-hot lookup and PCA change-of-basis
Example 52: Linear maps: forward/backward as matrix products
Example 53: Similarity geometry: dot vs cosine in embedding retrieval
Example 54: Orthogonality in regression: residual ⊥ column space
Example 55: Rank/null space: many parameters, same predictor
Example 56: Eigenvectors in ML: power iteration for dominant PCA direction
Example 57: PSD in ML: covariance and kernel Gram matrices
Example 58: SVD and conditioning: why normal equations are risky
Example 59: PCA bookkeeping: EVR and reconstruction error
Example 5: Dot products, norms, and cosine similarity (retrieval)
Example 60: Least squares: lstsq vs normal equations
Example 61: Solving systems: Cholesky factor reuse
Example 62: Conditioning: small perturbations → big solution changes
Example 63: Sparse matrices: adjacency COO + O(nnz) matvec
Example 64: Transformers: attention as QK^T then AV