16

Denoising signals (in particular, 2D arrays, such as images) can be done by removing the high frequency components of the discrete Fourier transform (which is related to convolution with a Gaussian kernel) or by removing the smallest singular values. I was wondering if there is a known, specific mathematical connection between these two approaches.

I've seen a little discussion on the topic here and here, but I didn't really glean much specifically except the mention of circulant matrices.

user3658307
  • 10,843

1 Answers1

14

There is an entire textbook written about this "Introduction to Inverse Problems and Imaging" by Bertaro and Boccacci. Chapter 4 sets up the formalism which intertwines inverse problems and fourier transforms. Chapter 5 shows how tikhonov regularization is related to windowing functions in fourier transforms. Here, I'll flush out one connection, that may appease your curiosity.

--- The Big Picture ---

(assertion 1), both the SVD and the Fourier transform are characterized by some unitary transform and express our data as an expansion in some basis. (assertion 2) Once the vector is expanded in this basis, a "windowing function" is introduced to devalue the basis vectors which are sensitive to small changes in the data. In practice, a windowing function reduces the expansion coefficient on certain basis vectors based on that vector's associated singular value or frequency. In the extreme case where we set the basis vector's expansion coefficient to zero, we call this "basis truncation".

Explanation of assertion (1): For the SVD a matrix $A=U \Sigma V^T$ where $U$ and $V$ are unitary transforms which rotate the vector into either the row or column space of $A$. The Fourier transform is also a unitary transform into a new basis, see that for $U = e^{i (t \cdot f)}$, then $U U^* = I$.

Explanation of assertion (2): For the SVD, "parts of the basis sensitive to small changes" means the singular vectors (basis vectors) which correspond to small singular values. The small singular values raise the condition number of the matrix and increase the upper bound on the signal to noise ratio. Truncating the basis (eliminating small singular value basis vectors) is often used to fix the problem. For Fourier Transform, "parts of the basis sensitive to small changes" means basis vectors $\exp[i ( t \cdot f)]$ with large frequency, $f$. A well known example is aliasing and a low pass filter or 'frequency truncation' is needed to fix the problem.

--- The math for truncation in SVD ---

Consider the a integral transform with kernel $K$ $$g(t) = \int dt' K(t,t') h(t') $$

We will first consider SVD windowing functions. Start by discretizing the integral domain into $N \times N$ intervals. In particular, our integrals becomes matrix-vector multiplication. $$\vec{g} = \Delta t K \vec{h} $$ Via singular value decomposition, we can write the ordinary least squares (OLS) solution for $h$ as, $$\vec{h} = V \Sigma^{-1} U^T \vec{g}. $$ Which is equivalent to an expansion in the column space of K, $$\vec{h} = \sum_{i=1}^N \frac{1}{\sigma_i} \vec{v}_i \vec{u}_i^T \vec{g} = \sum_{i=1}^N a_i \vec{v}_i $$ Here $a_i \equiv \frac{\vec{u}_i^T \vec{g}}{\sigma_i}$.

In the above solution, all basis vectors are on equal footing. However, a windowing function can devalue certain basis vectors based on $\sigma_i$. Take for example, $$ W^{cut}(\sigma_i | \vec{u}_i, \vec{g}, \epsilon) = a_i \Theta(\sigma_i - \epsilon)$$ where $\Theta$ is the step function. In this windowing function, if $\sigma_i > \epsilon$ then $W_i= a_i$ and $0$ otherwise. Thus, this windowing function is a basis truncation based on $\sigma_i$. The truncated solution is expressed as, $$\vec{h}^{cut} = \sum_{i=1}^N W^{cut}(\sigma_i | \vec{u}_i, \vec{g}, \epsilon) \vec{v}_i $$ Many different windowing functions can be concocted, for example: $$ W^{Tik}(\sigma_i | \vec{u}_i, \vec{g},\epsilon) = a_i \frac{\sigma_i^2}{\sigma_i^2+\epsilon}$$ This is the Tikhonov regularization solution.

Notice that for both windowing functions the $\epsilon \rightarrow 0$ recovers the original the OLS solution.

--- The math for truncation in Fourier transforms ---

If we Fourier transform this problem into frequency space the convolution turns into a multiplication (See E.O. Brigham's ``Introduction to FFT'').

Thus, $g(f) = K(f,f') h(f') $ and we can compute the solution through division. $$h(f') = g(f)/K(f,f')$$ We can Fourier transform over frequency space $\mathcal{F}$ to arrive at a solution. $$h(t') = \int_{\mathcal{F}} df' \exp[i ( t' \cdot f')] h(f')= \int_{\mathcal{F}} df' \exp[i ( t' \cdot f')] \frac{g(f)}{K(f,f')}$$ If you want to introduce some cutoff on high frequency data then you can introduce a windowing function as before: $$ W^{cut}( f' | \epsilon) = \frac{g(f)}{K(f,f')} \Theta(f' - \epsilon)$$ So that $$h(t') = \int_{\mathcal{F}} df \: W^{cut}( f' | \epsilon) \exp[i ( t' \cdot f') $$ Notice that I have left the Fourier bases outside the weight function. I have done this to emphasize that a Fourier transform is really a linear expansion of the solution in the frequency basis.

As with SVD, you could as use the Tikhonov regularization for the frequency basis vectors, $$ W^{Tik}(f' | \epsilon) = \frac{g(f)}{K(f,f')} \frac{\vert K(f,f') \vert^2}{\vert K(f,f') \vert^2 + \epsilon}$$ Or you could select some other windowing function more broadly used by the signal processing community (e.g. Hamming function)


ThomasTuna
  • 439
  • 6
  • 13
  • 2
    Removing the frequencies of small amplitude in the FFT is the same as removing the small eigenvalues of the SVD of the circulant matrix built from the signal (for an image it is a $4d$ circulant tensor) – reuns Apr 24 '19 at 15:51
  • Thanks for the answer. Basically you are saying that either can be written as (loosely speaking) a convolution with a windowing function, removing pieces that are sensitive to small changes. I'm still not clear what the relation between those sensitive pieces are though, or how the remaining signals after the filterings are related. @reuns would you care to expand that to an answer? It's not clear to me what circulant matrices have to do with images. – user3658307 Jun 08 '19 at 19:19
  • Ah I think you're hung up on the linear algebra I assumed. – ThomasTuna Jun 09 '19 at 15:16
  • That assumption was my bad, I have expanded on this in my original answer. Hopefully it helps! – ThomasTuna Jun 09 '19 at 15:39
  • 1
    Thanks, @ThomasTuna. I think the piece that is missing for me is the relation between the information we are cutting out. For instance, is there any clear relation between the basis vectors $v_i$ associated to higher order singular values and the high frequency Fourier basis functions (vectors in the discrete case)? I am interested in how the information (i.e. weighted basis functions/vectors) removed by denoising (using windows as you've noted) in each case is related. It might just be that my signal proc background is a little weak. :) – user3658307 Jun 09 '19 at 20:07
  • 2
    On the wiki for PCA it says ""This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components."

    What this means is that the data set itself determines the SVD's basis. Oppositely, the fourier basis decomposition is a priori know. Thus there's no one equation to write down other than some generic similarity/basis transformation

    – ThomasTuna Jun 10 '19 at 00:52
  • Alternatively, this could be said- The SVD and fourier decomposition are both unitary basis transformation, so the effects of the windowing function can be seen by doing a SVD and a fourier decomposition with respective windowing functions and then mapping one set of basis coordinates to the other and comparing the respective vectors. – ThomasTuna Jun 10 '19 at 01:02
  • @ThomasTuna That's a great point. Thanks for the comment (and the answer)! – user3658307 Jul 10 '19 at 02:00