Vector spaces provide the language for embeddings: addition and scalar multiplication encode interpolation, extrapolation, and mixing of features. Subspaces capture structure (constraints, invariances) and projections implement common preprocessing steps such as centering, de-meaning, and removing nuisance components.
Centering via $P = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$ appears in classical statistics (sample mean removal), PCA (covariance computation and principal directions of centered data), and modern deep learning (batch normalization conceptually centers activations). The closure of embeddings under linear combination underlies convex combinations, attention-weighted sums, and residual connections.
Comments