Matrix plus scalar

If $a$ is a scalar and $M$ is a square matrix, it is very convenient to be able to write $a + M$. Usually people know immediately what this means, but are uneasy about “abusing” notation, so here’s the detailed justification for why this is perfectly legitimate:

Matrices should be considered first and foremost as linear transformations. You know what a matrix is if you know what it does to vectors. A scalar also acts linearly on vectors: multiplying a vector by a scalar is a linear operation. So scalars, too, can be thought of as linear transformations, and therefore as matrices. Which matrix a scalar should be is immediate: multiplying by a scalar $k$ scales every component by $k$, and the matrix that does this is just $kI$.
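To make the identification concrete, here is a minimal NumPy sketch (the particular $a$ and $M$ are arbitrary examples, not from the text) showing that reading $a + M$ as $aI + M$ gives the transformation $v \mapsto av + Mv$:

```python
import numpy as np

# An arbitrary example: a scalar and a square matrix.
a = 2.5
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# The convention "a + M" means aI + M: the scalar is identified
# with the scalar matrix a*I before adding.
I = np.eye(2)
a_plus_M = a * I + M

# Check the defining property on a vector: (aI + M) v = a*v + M*v.
v = np.array([1.0, -1.0])
assert np.allclose(a_plus_M @ v, a * v + M @ v)

# Note: NumPy's own broadcasting rule for `a + M` adds a to *every*
# entry, which is a different operation from aI + M.
print(a_plus_M)
print(a + M)
```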

Here’s another way of saying that. Let $A$ be an associative algebra with unity over a field $K$ (with $1_A \ne 0$). We can define a homomorphism of algebras $f : K \to A$ via $f(k) = k1_A$. $f$ is injective (an isomorphism onto its image): if $j \ne k$ and $f(j) = f(k)$, then $(j-k)1_A = 0$, so $1_A = (j-k)^{-1}(j-k)1_A = 0$, a contradiction. $f$ is also the unique unital homomorphism from $K$ to $A$, since such a homomorphism is $K$-linear and sends $1_K$ to $1_A$, forcing $k \mapsto k1_A$. Therefore, there is a unique copy of $K$ inside $A$, and we can identify $K$ with that copy.
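For the case at hand, $A$ is the algebra of $n \times n$ matrices and $1_A = I$. A quick numerical sanity check of the homomorphism properties, purely as an illustrative sketch with arbitrarily chosen scalars (not part of the argument):

```python
import numpy as np

n = 3
I = np.eye(n)                      # 1_A in the algebra of n x n matrices

def f(k: float) -> np.ndarray:
    """The embedding f(k) = k * 1_A of scalars into the matrix algebra."""
    return k * I

j, k = 0.7, -3.0                   # arbitrary scalars for the check

# f respects addition, multiplication, and sends 1 to 1_A.
assert np.allclose(f(j + k), f(j) + f(k))
assert np.allclose(f(j * k), f(j) @ f(k))
assert np.allclose(f(1.0), I)

# f is injective: distinct scalars give distinct scalar matrices.
assert not np.allclose(f(j), f(k))
```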

This is a general principle: no one writes $(a + 0i) + z$ when $a$ is real and $z$ is complex, because the complex numbers are considered a superset of the reals. The same is true of matrices: once we identify each scalar $k$ with the scalar matrix $kI$, the square matrices are just another superset of the scalars.
