Matrix Norm

\( \renewcommand{\vec}[1]{\boldsymbol{#1}} \DeclareMathOperator*{\E}{\mathbb{E}} \DeclareMathOperator*{\Var}{\mathrm{Var}} \DeclareMathOperator*{\Cov}{\mathrm{Cov}} \DeclareMathOperator*{\argmin}{\mathrm{arg\,min\;}} \DeclareMathOperator*{\argmax}{\mathrm{arg\,max\;}} \def\ZZ{{\mathbb Z}} \def\NN{{\mathbb N}} \def\RR{{\mathbb R}} \def\CC{{\mathbb C}} \def\QQ{{\mathbb Q}} \def\FF{{\mathbb FF}} \def\EE{{\mathbb E}} \newcommand{\tr}{{\rm tr}} \newcommand{\sign}{{\rm sign}} \newcommand{\1}{𝟙} \newcommand{\inprod}[2]{\left\langle #1, #2 \right\rangle} \newcommand{\set}[1]{\left\{#1\right\}} \require{physics} \)

1 Definition

Let \(K\) be the field of real or complex numbers. A matrix norm is a function \(\| \cdot \| : K^{m \times n} \to \RR \) that satisfies the following properties:

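For all scalars \(\alpha \in K\) and all matrices \(A, B \in K^{m \times n}\):

- \(\|A\| \geq 0\) (non-negativity)
- \(\|A\| = 0\) if and only if \(A = 0\) (definiteness)
- \(\|\alpha A\| = |\alpha| \|A\|\) (absolute homogeneity)
- \(\|A + B\| \leq \|A\| + \|B\|\) (triangle inequality)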

Additionally, in the case of square matrices, some (but not all) matrix norms satisfy the following sub-multiplicative condition:

\[\|AB\| \leq \|A\|\|B\|\]

A matrix norm that satisfies this additional property is called a sub-multiplicative norm.
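To make the "some (but not all)" remark concrete, here is a minimal sketch in plain Python. It compares the entrywise max norm (largest absolute entry), which is a matrix norm but is not sub-multiplicative, against the Frobenius norm on one small example. The helper names `matmul`, `max_norm`, and `frobenius_norm` are illustrative choices, not part of the note.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def max_norm(A):
    """Entrywise max norm: the largest absolute value of any entry."""
    return max(abs(x) for row in A for x in row)

def frobenius_norm(A):
    """Frobenius norm: square root of the sum of squared absolute entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))

A = [[1, 1], [1, 1]]
AA = matmul(A, A)  # [[2, 2], [2, 2]]

# Max norm: ||AA|| = 2 but ||A|| * ||A|| = 1, so the condition fails here.
max_is_submultiplicative_here = max_norm(AA) <= max_norm(A) * max_norm(A)

# Frobenius norm: ||AA||_F = 4 and ||A||_F * ||A||_F = 4, so the condition holds here.
# The small tolerance guards against floating-point rounding.
frobenius_holds_here = frobenius_norm(AA) <= frobenius_norm(A) ** 2 + 1e-12
```

A single counterexample is enough to show that the max norm is not sub-multiplicative. The Frobenius check only illustrates the property on this one pair and is not a proof.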

2 Operator Norm

Suppose vector norms \(\| \cdot \|\) on \(K^m\) and \(K^n\) are given. The corresponding induced norm, or operator norm, on the space \(K^{m\times n}\) is defined as follows:

\[\begin{align} \|A\| &=\sup \left\{ \|Ax\|: x\in K^n, \|x\|=1 \right\}\\ &=\sup \left\{ \|Ax\|: x\in K^n, \|x\|\leq 1 \right\}\\ &=\sup \left\{ \frac{\|Ax\|}{\|x\|}: x\in K^n, x\neq 0 \right\} \end{align}\]

The last equality is usually rearranged and used as an inequality, valid for every \(x \in K^n\):

\[\|Ax\| \leq \|A\|\|x\|\]
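As a numerical illustration, the sketch below assumes the vector 1-norm on both spaces. For that choice the induced norm has a known closed form, the maximum absolute column sum. The code checks the inequality on random vectors and shows the supremum being attained at a standard basis vector. The matrix `A` and all function names are made up for this example.

```python
import random

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def vec_norm_1(x):
    """Vector 1-norm: sum of absolute values."""
    return sum(abs(xi) for xi in x)

def induced_norm_1(A):
    """Operator norm induced by the vector 1-norm: maximum absolute column sum."""
    return max(sum(abs(a) for a in col) for col in zip(*A))

A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]

norm_A = induced_norm_1(A)  # absolute column sums are 4, 2 and 1.5

# Sample random nonzero vectors and record the ratio ||Ax|| / ||x||.
rng = random.Random(0)
ratios = []
for _ in range(1000):
    x = [rng.uniform(-1, 1) for _ in range(3)]
    if vec_norm_1(x) == 0:
        continue
    ratios.append(vec_norm_1(matvec(A, x)) / vec_norm_1(x))

# ||Ax|| <= ||A|| ||x|| for every sampled x (up to rounding).
bound_holds = all(r <= norm_A + 1e-12 for r in ratios)

# The supremum is attained at the basis vector picking out the largest column.
e1 = [1.0, 0.0, 0.0]
attained = vec_norm_1(matvec(A, e1))
```

Random sampling only ever gives a lower estimate of the supremum, which is why the closed-form column-sum expression is used for `norm_A` itself.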

Any induced operator norm is a sub-multiplicative matrix norm. This follows from:

\[\|ABx\|\leq \|A\|\|Bx\|\leq \|A\|\|B\|\|x\|\]

and

\[\max_{\|x\|=1} \|ABx\| = \|AB\|\]
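Spelled out, the two facts combine as follows. The chain of inequalities holds for every \(x\), so taking the supremum over unit vectors gives

\[\|AB\| = \sup_{\|x\|=1} \|ABx\| \leq \sup_{\|x\|=1} \|A\|\|B\|\|x\| = \|A\|\|B\|\]

This assumes the dimensions are compatible (for instance \(A \in K^{m\times n}\) and \(B \in K^{n\times p}\)) and that each operator norm is induced by the vector norms on the corresponding spaces.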

3 Frobenius Norm

The Frobenius norm treats an \(m \times n\) matrix as a vector of size \(m \cdot n\) and applies the Euclidean norm to it:

\[\|A\|_F = \sqrt{\langle A,A\rangle _{F}}\]

where \(\langle \cdot,\cdot\rangle_{F}\) is the Frobenius inner product, defined as

\[\langle A,B\rangle_{F} = \sum_{i,j} \overline{A_{ij}}B_{ij} = \tr \left( \overline{A^T}B \right) = \tr \left( A^{\dagger}B \right)\]
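The following sketch checks numerically that the entrywise sum and the trace form of the Frobenius inner product agree, and computes the norm from it. It uses plain Python lists of complex numbers. The matrices `A` and `B` and the helper names are arbitrary choices for illustration.

```python
import math

def conj_transpose(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[complex(x).conjugate() for x in col] for col in zip(*A)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def frobenius_inner(A, B):
    """Entrywise form: sum over i, j of conj(A_ij) * B_ij."""
    return sum(complex(a).conjugate() * b
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

def frobenius_norm(A):
    """Square root of <A, A>_F, which is always real and non-negative."""
    return math.sqrt(frobenius_inner(A, A).real)

A = [[1 + 2j, 0], [3, -1j]]
B = [[2, 1j], [1 - 1j, 4]]

entrywise = frobenius_inner(A, B)                # sum of conj(A_ij) * B_ij
via_trace = trace(matmul(conj_transpose(A), B))  # tr(A^dagger B)
norm_A = frobenius_norm(A)                       # sqrt(5 + 0 + 9 + 1) = sqrt(15)
```

Both `entrywise` and `via_trace` evaluate to `5 - 3j` for these matrices. For large matrices the entrywise form is the cheaper of the two, since it avoids forming the full product \(A^{\dagger}B\).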

