
Let $f$ be a function such that $0\le f\left(x\right)\le c$ for all $x\in \left[a,b\right]$.
We want to approximate the definite integral of $f$ over $[a,b]$. Suppose the exact integral is very difficult to compute, but $f(x)$ is easy to evaluate for every $x$.
We can sample many random points $\left\{(X_i,Y_i)\right\}_{i=1}^N$ from the rectangle $x\in[a,b],\ y\in[0,c]$.

  1. First, we need a way to approximate the integral over $[a,b]$.

My way:
I managed to find an estimate, which I call $s'$ ($s$ is the exact integral, $s'$ the approximate one):
$s = \int _a^b f\left(x\right)dx$
$s'=c\left(b-a\right)\cdot \frac{1}{N}\sum _{j=1}^N I_j$, where $I_j=1$ if $Y_j\le f(X_j)$ and $I_j=0$ otherwise.

We need to determine, in terms of $a,b,c,s,\epsilon,\delta$, how many points we must sample so that $P(|s'-s|>\epsilon)<\delta$. We are supposed to use Chebyshev's inequality, but I have no idea how to proceed.
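The estimator $s'$ described above can be sketched in Python roughly as follows. This is only an illustration, assuming the points are drawn independently and uniformly from the rectangle; the function name `hit_or_miss` and the test function $f(x)=x^2$ are hypothetical choices, not part of the problem statement.

```python
import random

def hit_or_miss(f, a, b, c, n_points, rng=None):
    """Hit-or-miss estimate of the integral of f over [a, b].

    Assumes 0 <= f(x) <= c on [a, b] and uniform, independent sampling.
    """
    rng = rng if rng is not None else random.Random()
    hits = 0
    for _ in range(n_points):
        x = rng.uniform(a, b)    # X_i uniform on [a, b]
        y = rng.uniform(0.0, c)  # Y_i uniform on [0, c]
        if y <= f(x):            # indicator I_i: point lies under the graph
            hits += 1
    # Rectangle area times the fraction of points under the graph.
    return c * (b - a) * hits / n_points

# Hypothetical example: f(x) = x^2 on [0, 1] with c = 1 (exact integral 1/3).
estimate = hit_or_miss(lambda x: x * x, 0.0, 1.0, 1.0, 100_000, random.Random(0))
print(estimate)
```

The fixed seed only makes the run reproducible; any other source of uniform random numbers would do.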

  • How did you transform the points $(X_i,Y_i)$ into $s'$, and how did you draw $(X_i,Y_i)$ in the first place? (I suspect you probably did $s'=\frac{n}{N} (b-a)c$ where $N$ is the number of points and $n$ is the number of times that $Y_i \leq f(X_i)$.) – Ian Feb 03 '21 at 22:57
  • Yes, that is exactly what I do, and then we get $\int _a^b f\left(x\right)dx\approx c\left(b-a\right)\cdot \frac{1}{N}\sum _{j=1}^N I_j$, where the indicator $I_j$ equals $1$ when the point lies under the graph of $f$. –  Feb 03 '21 at 23:06
  • Then you have a sum of $N$ iid random variables which are equal to either $\frac{(b-a)c}{N}$ with some probability or $0$. What are those two probabilities? What is the resulting variance of each variable? Once you have that, you can just add the variances and apply Chebyshev's inequality (though the result you will get from Chebyshev is extremely non-optimal). – Ian Feb 03 '21 at 23:07
  • @Ian Sorry, but I really don't know how to calculate this formally –  Feb 03 '21 at 23:11
  • Well, the first step is to calculate those two probabilities. If you're stuck, try drawing a picture. – Ian Feb 03 '21 at 23:12
  • @Ian Which two probabilities do you mean? That the point falls in the region of the integral, and that it does not? If so, how do you calculate them? –  Feb 03 '21 at 23:22
  • Yes. As I said, if you draw a picture it should be pretty clear. Note that in the prompt the answer is specifically allowed to depend on the unknown quantity $s$. – Ian Feb 03 '21 at 23:33
  • @Ian It may be something like $\frac{s}{c\left(b-a\right)}$? –  Feb 04 '21 at 06:46
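
For reference, the steps hinted at in the comments above can be written out as a short derivation. This is a sketch, assuming the $N$ points are independent and uniform on the rectangle, and writing $p$ for the probability that a single point lands under the graph of $f$ (area under the curve divided by area of the rectangle):

$p = P\left(Y_j \le f(X_j)\right) = \frac{s}{c(b-a)}, \qquad I_j \sim \text{Bernoulli}(p)$

$E[s'] = c(b-a)\,p = s, \qquad \operatorname{Var}(s') = \frac{c^2(b-a)^2\,p(1-p)}{N} = \frac{s\left(c(b-a)-s\right)}{N}$

$P\left(|s'-s| > \epsilon\right) \le \frac{\operatorname{Var}(s')}{\epsilon^2} = \frac{s\left(c(b-a)-s\right)}{N\epsilon^2}$

So any $N > \frac{s\left(c(b-a)-s\right)}{\epsilon^2\delta}$ makes the right-hand side smaller than $\delta$. If $s$ may not appear in the answer, $p(1-p)\le\frac14$ gives the cruder sufficient condition $N > \frac{c^2(b-a)^2}{4\epsilon^2\delta}$.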

1 Answer


Since $f$ is bounded, you can bound the variance of your Monte-Carlo estimate, $ I_N = (1/N) \sum_i f(x_i) $:

$ \sigma^2 = E\big[(I_N - Ef)^2\big] \le c^2(b-a)^2/N $

To see this, use the i.i.d. zero-mean variables $y_i = f(x_i) - Ef$, with $|y_i| \le c$:

$E\big[(I_N - Ef)^2\big] = (1/N^2)E\Big[ \Big( \sum_i y_i \Big)^2 \Big] = (1/N^2)E\Big[\sum_i y_i^2 + 2 \sum_{i, j > i} y_iy_j \Big] = (1/N)E[y_1^2]$

You then use this in Chebyshev's inequality:

$ P(|I_N - Ef| > \epsilon) < \sigma^2/\epsilon^2 \le \frac{c^2(b-a)^2}{\epsilon^2N} = \delta$

Jim
  • I found that the approximate integral is $c\left(b-a\right)\cdot \frac{1}{N}\sum _{j=1}^N I_j$. Where did the $c(b-a)$ go when you calculate $I_N$? –  Feb 04 '21 at 11:31
  • @Xavi Jim is doing something else, where you sample only $X_i$ and say that the contribution to integration from each sample is $f(X_i)\frac{b-a}{N}$. This actually converges a lot faster than the "throwing darts" method that we talked about in the comments. – Ian Feb 04 '21 at 14:13
  • @Ian Can you write it formally? I'm a bit confused. –  Feb 04 '21 at 15:18
  • Please correct me if I am wrong, but it seems that union bound for $\sigma^2$ is $c^2(b-a)^2$. That $c^2(b-a)^2/N$ is probably an upper bound, but I cannot see where it came from. – Arash Feb 04 '21 at 17:06
  • @Arash To start, $\operatorname{Var}(f(X_i))=E[f(X_i)^2]-E[f(X_i)]^2=\frac{1}{b-a} \int_a^b f(x)^2 dx - \frac{1}{(b-a)^2} \left ( \int_a^b f(x) dx \right )^2$. Now assuming you don't want your bound to involve $s$, you simply have to invoke the bounds $0 \le f \le c$ to say that this is at most $c^2$. (If you can use $s$, then you get the better bound of $c^2-\frac{s^2}{(b-a)^2}$.) – Ian Feb 04 '21 at 19:03
  • (Cont.) Then the value for the integral will be $E[(b-a)f(X_i)]$ so the variance of those summands is $(b-a)^2 c^2$. Finally the division by $N$ comes when you average. (Of course this isn't what the OP is actually trying to implement anyway...) – Ian Feb 04 '21 at 19:04
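
The sample-mean approach discussed in this answer and its comments can be sketched in Python along the following lines. This is only an illustration, assuming $X_i$ is uniform on $[a,b]$ and that the estimate of the integral is $(b-a)$ times the average of $f(X_i)$; the function names `chebyshev_sample_size` and `sample_mean_integral` are hypothetical, as are the example values.

```python
import math
import random

def chebyshev_sample_size(a, b, c, eps, delta):
    """Smallest N with c^2 (b-a)^2 / (eps^2 N) <= delta (Chebyshev bound)."""
    return math.ceil((c * (b - a)) ** 2 / (eps ** 2 * delta))

def sample_mean_integral(f, a, b, n_points, rng=None):
    """Estimate the integral of f over [a, b] as (b - a) * mean of f(X_i)."""
    rng = rng if rng is not None else random.Random()
    total = sum(f(rng.uniform(a, b)) for _ in range(n_points))
    return (b - a) * total / n_points

# Hypothetical example: f(x) = x^2 on [0, 1], bounded by c = 1.
n = chebyshev_sample_size(0.0, 1.0, 1.0, eps=0.05, delta=0.1)
approx = sample_mean_integral(lambda x: x * x, 0.0, 1.0, n, random.Random(1))
print(n, approx)
```

Because Chebyshev's inequality is loose, the sample size returned here is typically far larger than what is needed in practice.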