Matrix plus scalar

If a is a scalar and M is a square matrix, it is very convenient to be able to write a+M. Usually people know immediately what this means, but are uneasy about “abusing” notation, so here’s the detailed justification for why this is perfectly legitimate:

Matrices should be considered first and foremost as linear transformations: you know what a matrix is if you know what it does to vectors. Multiplying a vector by a scalar is also a linear operation, so a scalar can be thought of as a linear transformation too, and therefore as a matrix. It is immediate which matrix the scalar k should be: multiplying by k scales every component by k, and the matrix that does this is just kI, where I is the identity matrix.
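To make this concrete, here is a minimal sketch in plain Python (no libraries) of the convention that a+M means aI+M. The helper names `scalar_as_matrix`, `mat_add`, `mat_vec` and `scalar_plus_matrix` are made up for this illustration, and matrices are assumed to be square lists of rows.

```python
def scalar_as_matrix(k, n):
    """Return the n x n matrix kI, which acts on vectors like multiplication by k."""
    return [[k if i == j else 0 for j in range(n)] for i in range(n)]


def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_vec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def scalar_plus_matrix(a, M):
    """Interpret a + M as aI + M (M is assumed to be square)."""
    return mat_add(scalar_as_matrix(a, len(M)), M)


M = [[1, 2], [3, 4]]
v = [5, 7]
a = 10

# Applying (a + M) to v should agree with a*v + M v, component by component.
lhs = mat_vec(scalar_plus_matrix(a, M), v)
rhs = [a * x + y for x, y in zip(v, mat_vec(M, v))]
print(scalar_plus_matrix(a, M))  # [[11, 2], [3, 14]]
print(lhs == rhs)                # True
```

One caveat if you try this with an array library: in NumPy, `a + M` broadcasts and adds a to every entry of M, which is not the convention described here. There you would have to write the aI term explicitly.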

Here’s another way of saying that. Let $A$ be an associative algebra with unity over a field $K$. We can define a homomorphism of algebras $f: K \to A$ via $f(k) = k\,1_A$. $f$ is injective, hence an isomorphism onto its image: if $j \neq k$, then $f(j) = f(k)$ implies $0 = (j-k)^{-1}(j-k)\,1_A = 1_A$, a contradiction (assuming $A$ is not the zero algebra). $f$ is also the unique unital homomorphism of $K$-algebras from $K$ to $A$, since such a homomorphism must send $1$ to $1_A$, and $K$-linearity then determines it everywhere. Therefore, there is a canonical copy of $K$ inside $A$, and we can identify $K$ with that copy.
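For readers who want the details, here is a short check that the map sending a scalar $k$ to $k\,1_A$ (called $f$ here) respects addition and multiplication, and that its image commutes with everything in $A$. It uses only the algebra axioms, in particular $(k\,x)\,y = k\,(x\,y) = x\,(k\,y)$ for $k \in K$ and $x, y \in A$.

```latex
\begin{align*}
f(j) + f(k) &= j\,1_A + k\,1_A = (j+k)\,1_A = f(j+k),\\
f(j)\,f(k) &= (j\,1_A)(k\,1_A) = jk\,(1_A 1_A) = jk\,1_A = f(jk),\\
f(k)\,x &= (k\,1_A)\,x = k\,x = x\,(k\,1_A) = x\,f(k) \quad \text{for all } x \in A.
\end{align*}
```

The last line is why writing a scalar in place of the corresponding element of $A$ causes no trouble in products: it never matters on which side of a product the scalar sits.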

This is a general principle: no one writes (a+0i)+z when a is real and z is complex, because the complex numbers are considered a superset of the reals. The same is true of matrices: once we identify each scalar k with the scalar matrix kI, square matrices are just another superset of the scalars.
