
I'm coming from more of a coding background and don't have much of a math background beyond linear algebra. I generated plots of the distribution of dot products between random vectors, with each element drawn from a uniform distribution between -1 and 1 and each vector then normalized to unit length (see the code below). I noticed the variance follows a $1/n$ trend, where $n$ is the vector dimension.
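In symbols, with $\mathbf{u}, \mathbf{v}$ the normalized random vectors, what I'm observing is roughly

$$\mathrm{Var}(\mathbf{u}\cdot\mathbf{v}) \approx \frac{1}{n}, \qquad \mathrm{SD}(\mathbf{u}\cdot\mathbf{v}) \approx \frac{1}{\sqrt{n}}.$$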

While trying to figure out why that's the case, I came across this post giving a general explanation. I wanted a more formal treatment, though, so I kept looking and found another post giving the full proof here.

However, that proof shows the variance is $n$ rather than $1/n$, and in the case of sampling from a uniform distribution, the variance would then grow to $\infty$ as $n$ increases. So I'm not sure what's going on here.
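Here's my attempt at reconciling the two (I'm not sure this is right). If the linked proof is about the raw, unnormalized dot product, then with $x_i, y_i \sim \mathrm{Uniform}(-1,1)$ i.i.d., we have $E[x_i] = 0$ and $E[x_i^2] = 1/3$, so

$$\mathrm{Var}\Big(\sum_{i=1}^n x_i y_i\Big) = \sum_{i=1}^n E[x_i^2]\,E[y_i^2] = \frac{n}{9},$$

which does grow like $n$. But my code normalizes each vector to unit length, and $\|\mathbf{x}\|^2 \approx n\,E[x_i^2] = n/3$ for large $n$, so the normalized dot product would have

$$\mathrm{Var}\left(\frac{\mathbf{x}\cdot\mathbf{y}}{\|\mathbf{x}\|\,\|\mathbf{y}\|}\right) \approx \frac{n/9}{(n/3)(n/3)} = \frac{1}{n}.$$

Is that the resolution?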

Variance simulations

Code

import numpy as np
import matplotlib.pyplot as plt

def random_vector(dim, elements):
    # Draw `elements` vectors of dimension `dim` with entries uniform in [-1, 1),
    # then normalize each row to unit length.
    rand_vec = np.random.uniform(-1, 1, size=(elements, dim))
    norms = np.transpose(np.tile(np.linalg.norm(rand_vec, axis=1), (dim, 1)))
    return rand_vec / norms

elms = 10000
dims = [10, 20, 50, 100, 250, 500, 1000, 2000]

fig, axs = plt.subplots(len(dims), 1, figsize=(5, len(dims) * 5))
for idx, dim in enumerate(dims):
    vec1 = random_vector(dim, elms)
    vec2 = random_vector(dim, elms)
    # Row-wise dot products between pairs of independent unit vectors.
    dot_prod = np.einsum("ij,ij->i", vec1, vec2)
    dot_mean = np.mean(dot_prod)
    dot_std = np.std(dot_prod)

    axs[idx].hist(dot_prod, bins=64)
    axs[idx].set_title("dim=" + str(dim) + ", mean=" + str(round(dot_mean, 3))
                       + ", std=" + str(round(dot_std, 3)))
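To make the $1/n$ trend explicit, here's a small check (a sketch reusing the `random_vector` helper and the `dims`/`elms` values from above) that prints the measured variance next to $1/\text{dim}$:

# Compare the sample variance of the dot products against 1/dim.
for dim in dims:
    v1 = random_vector(dim, elms)
    v2 = random_vector(dim, elms)
    var = np.var(np.einsum("ij,ij->i", v1, v2))
    print(f"dim={dim:5d}  var={var:.6f}  1/dim={1/dim:.6f}")

The two columns track each other closely for every dimension I tried, which is what led me to the $1/n$ claim.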

