
Problem:

Given $n$, calculate $\sum_{k=1}^{n} k\cdot \mu(k)$.

This sequence is listed in the OEIS.

My Thoughts:

I need a sublinear algorithm, ideally running in $O(n^{3/4})$ or $O(n^{2/3})$ time. Any algorithm with better than $O(n)$ time complexity is welcome. If you can prove that sublinear time is not possible, that is a valid answer as well.

You can assume $O(n^{2/3})$ memory is available.

My thinking is along the lines of generalising the square-free counting function. I know efficient ways of counting square-free numbers, i.e. $\sum_{k=1}^{n}[\mu(k)\ne 0]$, where $[P]$ is $1$ if $P$ holds and $0$ otherwise. They use a trick similar to the one behind prime counting functions. Since the prime counting function can be generalised to a prime summing function, the same should be possible here. Possibly we can generalise in the same way to $\sum_{k=1}^{n}[\mu(k)=1]$ and $\sum_{k=1}^{n}[\mu(k)=-1]$ as well. My ideas are still vague and need refining.

Here is a method to sum square-free numbers in $O(n^{1/2})$ time. We would then somehow have to split them into two classes, i.e. $[\mu(k)=-1]$ and $[\mu(k)=1]$.
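For concreteness, one standard way to sum square-free numbers in roughly $\sqrt n$ steps uses $\mu(k)^2=\sum_{d^2\mid k}\mu(d)$, which gives $\sum_{k\le n} k\,\mu(k)^2=\sum_{d\le\sqrt n}\mu(d)\,d^2\,T(\lfloor n/d^2\rfloor)$ with $T(q)=q(q+1)/2$. The sketch below is only an illustration of that identity and is not necessarily the linked method; the names `mobius_sieve` and `squarefree_sum` are made up for this example.

```python
from math import isqrt


def mobius_sieve(limit: int) -> list:
    """Return a list mu with mu[k] = Moebius function of k for 1 <= k <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each prime factor flips the sign once.
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] = -mu[multiple]
            # A squared prime factor forces mu to zero.
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
    return mu


def squarefree_sum(n: int) -> int:
    """Sum of all square-free k with 1 <= k <= n."""
    r = isqrt(n)
    mu = mobius_sieve(r)
    total = 0
    for d in range(1, r + 1):
        if mu[d]:
            q = n // (d * d)
            # d^2 * (1 + 2 + ... + q) is the sum of the multiples of d^2 up to n.
            total += mu[d] * d * d * (q * (q + 1) // 2)
    return total
```

For example, the square-free numbers up to 10 are 1, 2, 3, 5, 6, 7, 10, and `squarefree_sum(10)` returns their sum, 34.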

To summarise:

  • Expected Time complexity: $O(n^{3/4})$ or better.
  • Expected Space complexity: $O(n^{2/3})$ or better.
sibillalazzerini

1 Answer


Let $$f(x)=\sum_{k\le x} k\, \mu(k), \qquad \sum_{d\le x} d\, f(x/d)=1 \quad (x\ge 1).$$ Splitting the second sum at $d=1$ and at $d=\sqrt x$ gives $$f(x) + \sum_{2\le d \le \sqrt x} d\, f(x/d) + \sum_{\sqrt x \lt d\le x} d\, f(x/d) =1.$$
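As a side check, the identity $\sum_{d\le x} d\, f(x/d)=1$ follows from $\sum_{k\mid n}\mu(k)=[n=1]$ by collecting the terms with $dk=n$ (here $[n=1]$ is $1$ when $n=1$ and $0$ otherwise, a notation introduced only for this derivation):

$$\sum_{d\le x} d\, f(x/d)=\sum_{d\le x} d\sum_{k\le x/d} k\,\mu(k)=\sum_{n\le x}\ \sum_{dk=n} dk\,\mu(k)=\sum_{n\le x} n\sum_{k\mid n}\mu(k)=\sum_{n\le x} n\,[n=1]=1.$$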

When $d \in (\sqrt x, x]$, several values of $d$ give the same $\lfloor x/d\rfloor$. To make use of that, set $m=\lfloor x/d\rfloor$, so that $1\le m< \sqrt{x}$.

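Grouping the large divisors by $m$ gives $$f(x)= 1-\sum_{2\le d\le\sqrt{x}} d\, f(x/d)- \sum_{1\le m\le M} f(m) \sum_{\sqrt x< d\le x,\ \lfloor x/d\rfloor=m} d, \qquad M=\left\lfloor \frac{x}{\lfloor \sqrt x\rfloor+1}\right\rfloor.$$ The upper limit $M$, rather than every $m<\sqrt x$, keeps divisors $d\le\sqrt x$ from being counted twice: for $x=10$ the divisor $d=3$ has $\lfloor x/d\rfloor=3<\sqrt{10}$ but already belongs to the first sum. Now $x/d\in [m,m+1)$ iff $d\in (\frac{x}{m+1},\frac{x}m]$, and for $m\le M$ every such $d$ exceeds $\sqrt x$, i.e. $ \sum_{\sqrt x< d\le x,\ \lfloor x/d\rfloor=m} d=\sum_{d=\lfloor x/(m+1) \rfloor+1}^{\lfloor x/m \rfloor} d$, so that $$f(x)=1-\sum_{2\le d\le \sqrt{x}} d\, f(x/d)- \sum_{1\le m\le M} f(m) \frac{(\lfloor \frac{x}m \rfloor - \lfloor \frac{x}{m+1}\rfloor) (\lfloor \frac{x}m \rfloor + \lfloor \frac{x}{m+1}\rfloor + 1)}2$$ Calling "compute $f$" recursively and storing the intermediate results $r\mapsto f(r)$ gives an algorithm that uses $O(x^{1/2})$ space and $\sum_{m\le \sqrt{x}} O(\sqrt{m})+\sum_{d\le \sqrt{x}} O((x/d)^{1/2})= O(x^{3/4})$ calls to "compute $f$" and memory accesses to the stored intermediate values. The bound comes from comparing the two sums with $\int_1^{\sqrt{x}} \sqrt{t}\,dt+\int_1^{\sqrt{x}} \sqrt{x/t}\,dt=\tfrac23\left(x^{3/4}-1\right)+2\left(x^{3/4}-x^{1/2}\right)$.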
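The recurrence can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: the name `weighted_mertens` is made up here, memoisation is done with `functools.lru_cache`, and the loop over large divisors stops at `m = x // (s + 1)` with `s = isqrt(x)` so that divisors $d\le\sqrt x$ are not counted twice.

```python
from functools import lru_cache
from math import isqrt


def weighted_mertens(n: int) -> int:
    """Return sum of k * mu(k) for 1 <= k <= n via the memoised recurrence."""

    @lru_cache(maxsize=None)
    def f(x: int) -> int:
        s = isqrt(x)
        total = 1
        # Small divisors: 2 <= d <= floor(sqrt(x)).
        for d in range(2, s + 1):
            total -= d * f(x // d)
        # Large divisors d > s, grouped by their common value m = x // d.
        for m in range(1, x // (s + 1) + 1):
            lo = x // (m + 1) + 1  # smallest d with x // d == m
            hi = x // m            # largest d with x // d == m
            # Sum of the integers lo..hi, written as a difference of triangular numbers.
            total -= f(m) * ((hi - lo + 1) * (lo + hi) // 2)
        return total

    return f(n) if n >= 1 else 0
```

As a small check, $k\mu(k)$ for $k=1,\dots,10$ is $1,-2,-3,0,-5,6,-7,0,0,10$, so `weighted_mertens(10)` returns 0. Only arguments of the form $\lfloor n/d\rfloor$ are ever cached, which is what keeps the memory at $O(\sqrt n)$ entries.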

reuns
  • I can't follow the 2nd line ie how you expand $\sum_{d\le x} d f(x/d)=1$ – sibillalazzerini Mar 19 '23 at 22:42
  • I just separate the $d\le \sqrt{x}$ and the $d>\sqrt{x}$ and for the latter I set $m=\lfloor x/d \rfloor$ – reuns Mar 19 '23 at 23:19
  • Oh yeah now got it, I was confused with the second part, for the second part for same $x/d$ there are multiple $d$ and those are continuous so we can club them together and can be summed as difference of triangular number, Essentially by inverting variable domain from $x$ to $x/d$ we convert an $O(x - \sqrt x)$ traversal into an $O(\sqrt x)$ traversal. – sibillalazzerini Mar 19 '23 at 23:39
  • But I don't get your time complexity calculation, first part is fine, but for second part how do you reach at $x^{3/4}$ directly? it should come from solving $T(N)=O(\sqrt N) + T(\sqrt N)$, solving we get $T(N) = O(N)$ – sibillalazzerini Mar 19 '23 at 23:53
  • Use $\lfloor \lfloor x/d\rfloor/d' \rfloor=\lfloor x/(dd')\rfloor$ to get that it will evaluate $f(r)$ only for $r$ in the $O(\sqrt{x})$ values of $\lfloor x/d\rfloor$. – reuns Mar 20 '23 at 00:12
  • I am still not convinced, I can't follow how can some recursive function's time complexity be calculated at one go. Tell me what's wrong with my way of time complexity calculation, that's the standard procedure to calculate time complexity of any recursive function. – sibillalazzerini Mar 20 '23 at 00:24
  • We store the intermediate results $f(r)$; if it's already stored then $f(r)$ doesn't call $f(r/2),f(r/3),\ldots$. If you prefer, call $f(2),f(3)$ and so on up to $f(\lfloor \sqrt x \rfloor)$, then call $f(\lfloor x/\lfloor \sqrt x \rfloor \rfloor),f(\lfloor x/(\lfloor \sqrt x \rfloor-1) \rfloor), \ldots,f(\lfloor x/2 \rfloor),f(x)$ in this order. Each call to $f(r)$ does $O(\sqrt{r})$ sums and memory lookups. This is the meaning of my $\sum_{m\le \sqrt{x}} O(\sqrt{m})+\sum_{d\le \sqrt{x}} O((x/d)^{1/2})$. – reuns Mar 20 '23 at 00:27
  • ok got it somewhat even though I haven't reached the number $x^{3/4}$ yet, basically after the first recursive call there is no need for any deeper recursion as all the required f() are ready after the first level call itself. So time complexity is governed primarily by the sums within each f(). So total operations should be $\text{total sums} = \sqrt{1} + \sqrt{2} + \sqrt{3} + \sqrt{4} + \dots + \sqrt{\sqrt x} + \sqrt{x/1} + \sqrt{x/2} + \sqrt{x/3} + \sqrt{x/4} + \dots + \sqrt{x/(\sqrt x - 1)}$

    How did you calculate this sum?

    – sibillalazzerini Mar 20 '23 at 01:06
  • $\approx \int_1^{\sqrt{x}} \sqrt{t}dt+\int_1^{\sqrt{x}} \sqrt{x/t}dt$ – reuns Mar 20 '23 at 01:06
  • Of note the computations of $\lfloor x/ d\rfloor$ probably add some $\log^2 x$ term (so $O(x^{3/4}\log^2 x)$) hope you don't care too much – reuns Mar 20 '23 at 01:10
  • yeah I don't care about logs much, but I'm curious how a division and a floor add $\log$ terms; I have been treating division and floor as $O(1)$ for decades. – sibillalazzerini Mar 20 '23 at 01:14
  • Masterstroke of your approach was starting with $\sum_{d\le x} d f(x/d)=1$, This seems to be a very important identity that I have never known. – sibillalazzerini Mar 20 '23 at 01:18
  • I played with the similar thing for $\sum_{k\le x} \mu(k)$ many years ago, hoping that the $O(\sqrt{x})$ may be related to the RH, but I got convinced that it wasn't the case because such identities work as well for the Dirichlet inverse of $(-1)^{n+1}$ (which has many poles on $\Re(s)=1$). – reuns Mar 20 '23 at 01:20
  • This rings a bell for me with many similar algorithms where we split the domain into two $\sqrt{n}$ parts, one inversely mapped to the other. If you notice, there is some redundancy in the sums as well; for example say x=100, $f(100)$ adds $\sum_{i\le 10}f(i)$ and again $f(50)$ adds $\sum_{i\le 7}f(i)$. The f() values are cached, and the sums can be cached as well, such that we only need to add f(8)+f(9)+f(10) to the $\sum_{i\le 7}f(i)$ already computed earlier. I have seen these kinds of algorithms improve the time further to $O(n^{2/3})$ using a bit more space, like $O(n^{2/3})$. – sibillalazzerini Mar 20 '23 at 02:40
  • I know there is a weight $d$ multiplied with $f()$ at each step, which makes things a bit more complicated than my oversimplified example above. Had it been a mere $\sum f(i)$, caching the sums would have made it $O(n^{1/2})$. Never mind, we can keep that discussion for some other day, as for the time being $O(n^{3/4})$ suffices for me. By the way, please add that integral way of calculating the time complexity to the answer body; it's much more intuitive to understand. As the chat thread grows too long, StackExchange may clear it. – sibillalazzerini Mar 20 '23 at 03:09