In SVMs, we have a kernel function that maps the raw input data space into a higher-dimensional feature space.

In CNNs, we also have a 'kernel': a mask that travels across the raw input space (an image as a matrix) and maps it to another space.

Given that both of these methods involve something called a 'kernel', I am wondering what the connection between them is from a mathematical perspective.

My guess is that it might have something to do with functional analysis.

eight3

1 Answer

There is no direct relationship between these two concepts. However, we can find some indirect ones.

According to Merriam-Webster,

kernel means a central or essential part

which hints at why both are called "kernel". Specifically, deciding "how to measure point-to-point similarity (a.k.a. the kernel function)" is the central part of kernel methods, and deciding "which array, matrix, or tensor (a.k.a. the kernel matrix) to convolve with a data point" is the central part of convolutional neural networks.

A kernel function receives two data points, implicitly maps them into a higher- (possibly infinite-) dimensional space, and then calculates their inner product there.
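
For concreteness, here is a minimal NumPy sketch (the degree-2 polynomial kernel is chosen purely for illustration) showing that a kernel evaluation equals an inner product in a higher-dimensional space that is never constructed inside the kernel call:

```python
import numpy as np

# Degree-2 polynomial kernel: k(x, y) = (x.y + 1)^2. It computes an inner
# product in a 6-D space without ever building that space explicitly.
def poly_kernel(x, y):
    return (x @ y + 1) ** 2

def explicit_map(x):
    # The explicit feature map for 2-D inputs: the kernel above equals the
    # inner product of these 6-D vectors (the "implicit" space made visible).
    x1, x2 = x
    return np.array([x1**2, x2**2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     1.0])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
print(poly_kernel(x, y))                  # 144.0
print(explicit_map(x) @ explicit_map(y))  # ~144.0, the same value up to rounding
```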

A kernel matrix (or array, or tensor) is convolved with a data point to map it explicitly into an often lower-dimensional space. Here, we are ignoring a subtle difference between a filter and a kernel (a filter is composed of one kernel per channel).
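
As a sketch of this explicit, dimension-reducing map (assuming a "valid" convolution with no padding or stride, and computing cross-correlation, as most deep-learning libraries do):

```python
import numpy as np

# A 3x3 kernel matrix explicitly maps a 5x5 input to a smaller 3x3 output.
def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Elementwise product with the patch, then sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "image"
kernel = np.ones((3, 3)) / 9.0                    # 3x3 averaging kernel
print(conv2d_valid(image, kernel).shape)          # (3, 3): a lower dimension
```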

Therefore, these two concepts are indirectly related through the idea of mapping data to a new representation. However,

  • Kernel functions map implicitly, but kernel matrices map explicitly,
  • Kernel functions cannot be stacked on top of each other (a shallow representation), but kernel matrices can be, since their input and output (explicit representations) share the same structure (a deep representation),
  • The non-linearity of the map is built into kernel functions, whereas for kernel matrices we must apply a non-linear activation function after the (input, kernel) convolution to achieve a comparable non-linearity,
  • Implicit representations cannot be learned for kernel functions, since a specific function implies a specific representation. For kernel matrices, however, representations can be learned by adjusting (learning) the kernel weights, and can be further enriched by stacking kernels on top of each other, as the sketch below illustrates.
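
Here is a minimal sketch of the last two points; the shapes and helper names are illustrative assumptions, not a reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Valid" convolution (cross-correlation) of an image with a kernel matrix.
def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

relu = lambda x: np.maximum(x, 0.0)  # the explicit non-linearity

image = rng.standard_normal((8, 8))
k1 = rng.standard_normal((3, 3))   # learnable weights of layer 1
k2 = rng.standard_normal((3, 3))   # learnable weights of layer 2

h = relu(conv2d_valid(image, k1))  # 8x8 -> 6x6, non-linearity applied
out = relu(conv2d_valid(h, k2))    # 6x6 -> 4x4: stacking works because the
print(out.shape)                   # input and output share the same structure
```
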
Esmailian