In signal processing, a classic problem consists in estimating a signal, in the form of a complex column vector $x \in \mathbb{C}^N$, by observing a related signal $y \in \mathbb{C}^M$, which has been produced by multiplying the unknown $x$ by a known matrix $H \in \mathbb{C}^{M \times N}$ and adding noise $n \in \mathbb{C}^M$:

$$y = Hx + n. \qquad (1)$$
1. Assumptions
The random signal $x$ and noise $n$ are independent of each other, Gaussian distributed with zero mean and covariance matrices $C_x = \mathbb{E}[xx^H]$ and $C_n = \mathbb{E}[nn^H]$, respectively, where $(\cdot)^H$ is the conjugate transpose. In symbols, $x \sim \mathcal{CN}(0, C_x)$, $n \sim \mathcal{CN}(0, C_n)$. To simplify computations, we further assume the signal to be uncorrelated, i.e., $C_x = \sigma_x^2 I$.
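As a quick numerical illustration, here is a minimal sketch of the model and assumptions in Python (the sizes $M$, $N$, the matrix $H$ and the noise covariance are arbitrary choices for the example, not part of the original setup; later snippets reuse these variables):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 8, 4                       # observation and signal sizes (arbitrary)
sigma_x, sigma_n = 1.0, 0.3       # signal power and noise power scale

# known mixing matrix H (arbitrary example)
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

# x ~ CN(0, sigma_x^2 I): zero-mean, uncorrelated complex Gaussian signal
x = sigma_x * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# n ~ CN(0, C_n): zero-mean colored noise, independent of x
A = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2)
C_n = sigma_n**2 * (A @ A.conj().T / M + np.eye(M))   # positive definite by construction
n = np.linalg.cholesky(C_n) @ (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

y = H @ x + n                     # the observation model (1)
```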
2. MMSE estimate
The minimum mean squared error (MMSE) estimate is the function $\hat{x}(\cdot)$ that guesses the unknown $x$ upon observing $y$, while minimizing the squared estimation error $\|x - \hat{x}(y)\|^2$, computed in expectation across all possible realizations of $x$ and $n$:

$$\hat{x}(\cdot) = \arg\min_{g(\cdot)} \mathbb{E}\left[\|x - g(y)\|^2\right]. \qquad (2)$$
As already discussed in another post, the MMSE estimate corresponds to the mean of the unknown given the observation:

$$\hat{x}(y) = \mathbb{E}[x \mid y]. \qquad (3)$$
Intuitively, the best (in the MMSE sense) guess for the unknown is its expected value, conditioned on the observation.
2.1. A more specific formula
Since $x$ and $y$ are jointly Gaussian with zero mean, the MMSE estimate (3) can be computed explicitly as:

$$\hat{x}(y) = C_{xy} C_y^{-1} y, \qquad (4)$$

where $C_{xy} = \mathbb{E}[xy^H]$ is the cross-covariance between $x$ and $y$, and $C_y = \mathbb{E}[yy^H]$ is the covariance of $y$. We can further specialize the expression above by observing that:

$$C_{xy} = \mathbb{E}\left[x (Hx + n)^H\right] = C_x H^H = \sigma_x^2 H^H, \qquad (5)$$

$$C_y = \mathbb{E}\left[(Hx + n)(Hx + n)^H\right] = H C_x H^H + C_n = \sigma_x^2 H H^H + C_n. \qquad (6)$$

Note that in both expressions above we exploited the fact that noise and signal are uncorrelated. By plugging (5) and (6) into (4), we obtain:

$$\hat{x}(y) = \sigma_x^2 H^H \left(\sigma_x^2 H H^H + C_n\right)^{-1} y. \qquad (7)$$
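Continuing the sketch above (same variables), the closed form (7) is a one-liner:

```python
# MMSE estimate via (7): sigma_x^2 H^H (sigma_x^2 H H^H + C_n)^{-1} y
x_hat = sigma_x**2 * H.conj().T @ np.linalg.solve(sigma_x**2 * (H @ H.conj().T) + C_n, y)
print(np.linalg.norm(x - x_hat))  # estimation error on this realization
```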
3. Let’s simplify our life by whitening the noise
To further simplify (7) it is convenient to whiten the noise, i.e., to deal with an equivalent model where the noise covariance matrix is diagonal. This can be simply achieved by pre-multiplying the observation by the matrix $\sigma_n C_n^{-1/2}$ (which exists as long as $C_n$ is positive definite; recall that any covariance matrix is at least positive semi-definite), where $\sigma_n^2$ is the noise power. Then, we obtain:

$$\tilde{y} = \tilde{H} x + \tilde{n},$$

where $\tilde{y} = \sigma_n C_n^{-1/2} y$, $\tilde{H} = \sigma_n C_n^{-1/2} H$, $\tilde{n} = \sigma_n C_n^{-1/2} n$. We can check that the equivalent noise $\tilde{n}$ is white by computing its covariance matrix:

$$\mathbb{E}\left[\tilde{n}\tilde{n}^H\right] = \sigma_n^2\, C_n^{-1/2}\, \mathbb{E}\left[n n^H\right] C_n^{-1/2} = \sigma_n^2\, C_n^{-1/2} C_n C_n^{-1/2} = \sigma_n^2 I.$$

We can then rewrite (7) in simpler terms as:

$$\hat{x}(\tilde{y}) = \sigma_x^2 \tilde{H}^H \left(\sigma_x^2 \tilde{H}\tilde{H}^H + \sigma_n^2 I\right)^{-1} \tilde{y}. \qquad (8)$$
By invoking the matrix inversion lemma, we can equivalently write:

$$\hat{x}(\tilde{y}) = \left(\tilde{H}^H \tilde{H} + \frac{\sigma_n^2}{\sigma_x^2} I\right)^{-1} \tilde{H}^H \tilde{y}. \qquad (9)$$
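A sketch of the whitening step, reusing the variables above; it also checks numerically that the whitened noise covariance is indeed $\sigma_n^2 I$ and that (9) agrees with (7):

```python
# Hermitian inverse square root of C_n via an eigendecomposition (numpy only)
evals, evecs = np.linalg.eigh(C_n)
W = sigma_n * (evecs @ np.diag(evals**-0.5) @ evecs.conj().T)    # sigma_n * C_n^{-1/2}
y_t, H_t = W @ y, W @ H                                          # whitened model

# the equivalent noise is white: W C_n W^H = sigma_n^2 I
print(np.allclose(W @ C_n @ W.conj().T, sigma_n**2 * np.eye(M)))  # True

# MMSE estimate via (9); it must coincide with x_hat from (7)
x_hat_9 = np.linalg.solve(H_t.conj().T @ H_t + (sigma_n**2 / sigma_x**2) * np.eye(N),
                          H_t.conj().T @ y_t)
print(np.allclose(x_hat, x_hat_9))  # True, up to numerical precision
```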
4. Corner (but illuminating) cases
To develop a better understanding of how the MMSE estimate works, it is interesting to investigate its behavior in a few extreme but important cases.
4.1. Negligible noise and $M = N$: Inverse
We start from the simplest sub-case: in the absence of noise ($\sigma_n^2 \to 0$), and assuming that signal and observation have the same size ($M = N$) with $\tilde{H}$ invertible, the MMSE estimate simply inverts $\tilde{H}$:

$$\hat{x} \to \left(\tilde{H}^H \tilde{H}\right)^{-1} \tilde{H}^H \tilde{y} = \tilde{H}^{-1} \tilde{H}^{-H} \tilde{H}^H \tilde{y} = \tilde{H}^{-1} \tilde{y}. \qquad (10)$$
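A quick numerical check of this limit (square invertible matrix, vanishing noise; `Hs` and `eps` are ad-hoc names for the example):

```python
# with M == N and sigma_n^2/sigma_x^2 -> 0, (9) tends to the plain inverse
Hs = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
ys = Hs @ x                       # noiseless square observation
eps = 1e-12                       # vanishing noise-to-signal ratio
x_10 = np.linalg.solve(Hs.conj().T @ Hs + eps * np.eye(N), Hs.conj().T @ ys)
print(np.allclose(x_10, np.linalg.solve(Hs, ys)))  # True: the estimate inverts Hs
```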
4.2. Negligible noise: Pseudoinverse
A natural question arises: what if instead $M > N$ (more observations than unknowns), while the noise is still negligible? In this case, (9) tends to:

$$\hat{x} \to \left(\tilde{H}^H \tilde{H}\right)^{-1} \tilde{H}^H \tilde{y} = \tilde{H}^+ \tilde{y}, \qquad (11)$$

where $\tilde{H}^+ = (\tilde{H}^H \tilde{H})^{-1} \tilde{H}^H$ is the pseudoinverse of $\tilde{H}$.
In communication theory, this estimator is also called zero-forcer. In fact, if we pre-multiply the received signal $\tilde{y}$ by $\tilde{H}^+$, then the resulting $k$-th component only depends on the $k$-th input signal, hence eliminating any interference among different signal components:

$$\tilde{H}^+ \tilde{y} = \tilde{H}^+ \left(\tilde{H} x + \tilde{n}\right) = x + \tilde{H}^+ \tilde{n}. \qquad (12)$$
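In code, the zero-forcing limit is just numpy's pseudoinverse, applied to the whitened quantities from before:

```python
# zero-forcing estimate: pinv computes H_t^+ = (H_t^H H_t)^{-1} H_t^H for full-rank tall H_t
x_zf = np.linalg.pinv(H_t) @ y_t
# interference is eliminated as in (12), but the residual noise term H_t^+ n remains
print(np.linalg.norm(x - x_zf), np.linalg.norm(x - x_hat))  # ZF vs. MMSE error
```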
4.3. Uni-dimensional signal: Matched filter
Let us turn to another simple corner case: we assume the signal to be uni-dimensional (while the observation is still multi-dimensional). In this case, we write $\tilde{h}$ instead of $\tilde{H}$, as it is a column vector. The MMSE estimate then boils down to:

$$\hat{x} = \frac{\tilde{h}^H \tilde{y}}{\|\tilde{h}\|^2 + \sigma_n^2/\sigma_x^2}. \qquad (13)$$

The received signal $\tilde{y}$ is projected onto $\tilde{h}$ and appropriately scaled. This is called a matched filter.
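A minimal sketch of the scalar case, with white noise (so no whitening is needed); the names here are ad-hoc for the example:

```python
# scalar signal x0 observed through a column vector h in white noise
h = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
x0 = sigma_x * (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
w = sigma_n * (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
yv = h * x0 + w

# matched filter (13): project the observation onto h, then scale
x0_hat = (h.conj() @ yv) / (h.conj() @ h + sigma_n**2 / sigma_x**2)
print(abs(x0 - x0_hat))
```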
4.4. Orthogonal $\tilde{H}$: Matched filter
Let us now expand the case above by returning to the original multi-dimensional signal $x$, while assuming the matrix $\tilde{H}$ to be orthogonal: $\tilde{H}^H \tilde{H} = \operatorname{diag}\left(\|\tilde{h}_1\|^2, \dots, \|\tilde{h}_N\|^2\right)$, i.e., its columns $\tilde{h}_1, \dots, \tilde{h}_N$ are pairwise orthogonal. In this case, the MMSE estimate becomes:

$$\hat{x}_k = \frac{\tilde{h}_k^H \tilde{y}}{\|\tilde{h}_k\|^2 + \sigma_n^2/\sigma_x^2}, \qquad k = 1, \dots, N. \qquad (14)$$

In words, to estimate the $k$-th signal component $x_k$, the MMSE estimate projects the observation $\tilde{y}$ onto the $k$-th column of $\tilde{H}$ and scales the result appropriately. Notice that (13) is indeed the special case of (14) when $N = 1$.
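A sketch of this case: the columns are made pairwise orthogonal via a QR factorization (an arbitrary construction for the example), and the per-component matched filter (14) is checked against the general formula (9):

```python
# build a matrix with pairwise-orthogonal columns, rescaled by arbitrary gains
Q, _ = np.linalg.qr(rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
H_o = Q * np.array([0.5, 1.0, 2.0, 3.0])      # N = 4 per-column gains (arbitrary)
w_o = sigma_n * (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
y_o = H_o @ x + w_o                           # white-noise model, already whitened

# (14): one matched filter per signal component
norms2 = np.sum(np.abs(H_o)**2, axis=0)       # squared column norms
x_mf = (H_o.conj().T @ y_o) / (norms2 + sigma_n**2 / sigma_x**2)

# coincides with the general MMSE formula (9)
x_gen = np.linalg.solve(H_o.conj().T @ H_o + (sigma_n**2 / sigma_x**2) * np.eye(N),
                        H_o.conj().T @ y_o)
print(np.allclose(x_mf, x_gen))  # True
```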
4.5. Noise is overwhelming: No info
From (13) and (14) it is apparent that, when the noise drowns the signal, i.e., $\sigma_n^2/\sigma_x^2 \to \infty$, the MMSE estimate tends to zero, whatever the observed signal is. This behavior is supported by intuition: when $\tilde{y}$ does not bring any useful information on the signal $x$, the best estimate for $x$ is our prior (unconditioned) information: $\hat{x} = \mathbb{E}[x] = 0$.
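Finally, a short demonstration of the vanishing estimate, again reusing the whitened variables:

```python
# as sigma_n^2/sigma_x^2 grows, the MMSE estimate (9) shrinks toward E[x] = 0
for ratio in [1e0, 1e2, 1e4]:                 # noise-to-signal ratios
    x_hi = np.linalg.solve(H_t.conj().T @ H_t + ratio * np.eye(N),
                           H_t.conj().T @ y_t)
    print(ratio, np.linalg.norm(x_hi))        # norm decreases toward 0
```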
