6

It is easy to find simple distance measures for equal-dimension vectors, such as Euclidean distance or correlation. What about unequal-dimension vectors, such as $(a,b,c)$ and $(d,e)$? Are there any known approaches in math for that?

For example, one approach would be to treat the smaller-dimension vector as a projection: compare $(a,b,c)$ with $(d,e,0)$. But then I must choose which coordinate plane to project onto, and I might equally well use $(d,0,e)$. So there is ambiguity.

Are there any other practically used, especially non-ambiguous measures?

Of course generalization to unequal-dimension, even ragged, data arrays is interesting, so please elaborate if you can.

But some simple computation with $(a,b,c)$ and $(d,e)$ would be very instructive.

iLie
  • What is wrong with $||\vec a-\vec b||$? (The lengths of these vectors are different....) – zoli Oct 22 '15 at 12:12
  • Thanks @zoli but how do you compute that say for (a,b,c) and (d,e) ? – iLie Oct 22 '15 at 12:14
  • 2
    Oh! Then the dimension of these vectors is different. – zoli Oct 22 '15 at 12:15
  • @zoli thanks, corrected. – iLie Oct 22 '15 at 12:18
  • I think you really want to give more information about the application. As stated the question makes little mathematical sense. You could compare it to asking how to find the difference in size between a square and line. There is a couple trivial methods but whether any of them make sense depends a huge amount on what you are really trying to do. – DRF Oct 22 '15 at 12:21
  • @DRF I am interested in any known method. For example for "a square and line" I would use orthogonal from the line to the square's geometric center or center of mass or anything else - all of them. I just need a few ideas that were already practiced in math. – iLie Oct 22 '15 at 12:26
  • ILie: The way you interpreted the example with the line and the square masks the real problem, since in your head you immediately embedded the 1D line in 2D space, which means that in the end you just need to compare two 2D objects. What you really are asking for are appropriate projections/embeddings, I guess. (Plus, DRF meant to compare the "size" of a line and a square and was not explicitly talking about a distance. It's just meant to emphasize how hard it is to find relations between non-(/hardly-)related objects.) – Piwi Oct 22 '15 at 14:16

2 Answers

2

I will just point out the obvious mathematical approach, which is probably not very useful on its own. Assuming your vectors are in $\mathbb{R}^n$ and $\mathbb{R}^m$ for $m<n$, you can always take some injection from $\mathbb{R}^m$ to $\mathbb{R}^n$. The obvious one just pads with $0$'s.
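A minimal sketch of the zero-padding injection in Python (the helper name `padded_distance` is mine, not from the post):

```python
import math

def padded_distance(u, v):
    """Euclidean distance after right-padding the shorter vector
    with zeros -- one of many possible embeddings, as discussed."""
    n = max(len(u), len(v))
    u = list(u) + [0.0] * (n - len(u))
    v = list(v) + [0.0] * (n - len(v))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Treats (4, 5) as (4, 5, 0):
print(padded_distance((1.0, 2.0, 3.0), (4.0, 5.0)))  # sqrt(27) ~ 5.196
```

Note that padding on the left instead would give a different answer, which is exactly the ambiguity raised in the question.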

Depending on what kind of analysis you are performing, you might have a more natural subspace to which you want to map the smaller vector space. For example, you might consider looking at some sort of partial projection. To visualize this, consider the case where you have two-dimensional and one-dimensional vectors. To make it even more trivial, take $(1,0)$, $(0,1)$ as your two-dimensional sample and say $(2)$ as your one-dimensional vector.

Depending on the type of data, you could map the one-dimensional vector so that its direction is the average of the directions of the two-dimensional ones (in this case $(1,1)$), and then either conserve the length (so you would actually get $(\sqrt{2},\sqrt{2})$) or preserve the first coordinate(s), getting $(2,2)$.
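The two variants above can be checked numerically; this sketch hard-codes the toy data from the answer (the variable names are mine):

```python
import math

# Toy data from the answer: 2-D samples (1,0), (0,1) and the 1-D vector (2).
samples = [(1.0, 0.0), (0.0, 1.0)]

# Average direction of the samples: (0.5, 0.5), i.e. the direction (1, 1).
avg_dir = [sum(c) / len(samples) for c in zip(*samples)]
norm = math.sqrt(sum(c * c for c in avg_dir))
unit = [c / norm for c in avg_dir]               # (1/sqrt(2), 1/sqrt(2))

x = 2.0
length_preserving = [x * c for c in unit]        # (sqrt(2), sqrt(2)): norm stays 2
coordinate_preserving = [x, x]                   # (2, 2): first coordinate kept

print(length_preserving, coordinate_preserving)
```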

For more dimensions the distortion becomes smaller, and for larger datasets you might consider looking, for example, at averages for the undefined dimensions (possibly weighting them depending on similarity of defined dimensions). But altogether this will hugely depend on what each dimension represents. If you expect that there is a correlation between the dimensions, weighting makes more sense than if you don't, etc.

One other approach is, instead of expanding the dimension of the shorter vector, to project the longer one down: for example, just throw away the extra dimensions. Again, a lot of this depends on what those extra dimensions represent and how they tie in to the ones you always have.
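The throw-away-the-extra-dimensions variant is equally short to sketch (again, the helper name is mine):

```python
def truncated_distance(u, v):
    """Project the longer vector down by discarding its trailing
    coordinates, then compare in the lower dimension."""
    n = min(len(u), len(v))
    return sum((a - b) ** 2 for a, b in zip(u[:n], v[:n])) ** 0.5

# Compares (1, 2) with (4, 5), discarding the 3:
print(truncated_distance((1.0, 2.0, 3.0), (4.0, 5.0)))  # sqrt(18) ~ 4.243
```

Which trailing coordinates get discarded is itself a choice, so this suffers from the same ambiguity as padding.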

DRF
0

If the projection is orthogonal to the projection plane, you can just use Pythagoras' theorem. Recall that any point can be identified with a vector. Let $\textbf{q}$ be the higher-dimensional point, $\text{proj}(\textbf{q})$ be the projection of $\textbf{q}$ onto the lower-dimensional plane, and $d$ be the distance between these two points (i.e., vectors). Since $\textbf{q}-\text{proj}(\textbf{q})$ is orthogonal to $\text{proj}(\textbf{q})$, $d$ is: $$ d = \sqrt{ \left\lVert \textbf{q} \right\rVert^2 - \left\lVert \text{proj}(\textbf{q}) \right\rVert^2 } $$ where $\left\lVert \textbf{x} \right\rVert$ is the norm of $\textbf{x}$, i.e., the square root of the sum of the squares of its components.
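A quick numerical sanity check of that formula, using the simplest orthogonal projection (dropping the last coordinate); the example point is mine:

```python
import math

q = (3.0, 4.0, 12.0)
proj_q = (3.0, 4.0, 0.0)   # orthogonal projection onto the xy-plane

# d via Pythagoras: sqrt(||q||^2 - ||proj(q)||^2)
d_pythagoras = math.sqrt(sum(c * c for c in q) - sum(c * c for c in proj_q))

# d computed directly as the norm of the difference
d_direct = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, proj_q)))

print(d_pythagoras, d_direct)  # both 12.0
```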

$\text{proj}(\textbf{q})$ could be found, for example, using a projection matrix $P$ whose rows are however many eigenvectors you choose of a covariance matrix $\Sigma$. To find $P$ this way, I refer you to instructions on how to perform Principal Component Analysis (PCA).
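A sketch of that PCA-style construction with numpy, assuming a hypothetical 3-D data set and keeping the top two principal directions (the data and dimensions here are illustrative, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # hypothetical 3-D data set

Sigma = np.cov(X, rowvar=False)        # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh: ascending eigenvalues

# Rows of P are the two eigenvectors with the largest eigenvalues.
P = eigvecs[:, -2:].T

q = np.array([1.0, 2.0, 3.0])
proj_q = P @ q                         # coordinates of q in the 2-D subspace

# Because the rows of P are orthonormal, the Pythagorean formula applies:
d = np.sqrt(q @ q - proj_q @ proj_q)
print(d)
```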