6

Given an array $A$, sum the number of unique elements for each sub-array of $A$. If $A = \{1, 2, 1, 3\}$ the desired sum is $18$.

Subarrays:

{1} - 1 unique element
{2} - 1
{1} - 1
{3} - 1
{1, 2} - 2
{2, 1} - 2
{1, 3} - 2
{1, 2, 1} - 2
{2, 1, 3} - 3
{1, 2, 1, 3} - 3

I have a working solution which sums the unique elements for all sub-arrays starting at index $0$, then repeats that process at index $1$, etc. I have noticed that for an array of size $n$ consisting of only unique elements, the sum I desire can be found by summing $i(i + 1) / 2$ from $i = 1$ to $n$, but I haven't been able to extend that to cases where there are non-unique elements. I thought I could use that fact on each sub-array, but my control logic becomes unwieldy. I've spent considerable time trying to devise a solution better than $O(n^2)$ to no avail. Is there one?
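For reference, a minimal sketch of the quadratic approach described above, together with the closed form for the all-distinct case. The function names are illustrative and not taken from the original code.

```python
def sum_distinct_bruteforce(a):
    """Sum the number of distinct values over all contiguous subarrays in O(n^2)."""
    total = 0
    for start in range(len(a)):
        seen = set()  # distinct values in a[start:end + 1]
        for end in range(start, len(a)):
            seen.add(a[end])
            total += len(seen)
    return total


def sum_all_distinct(n):
    """Closed form when all n elements are distinct: sum of i(i+1)/2 for i = 1..n."""
    return sum(i * (i + 1) // 2 for i in range(1, n + 1))


print(sum_distinct_bruteforce([1, 2, 1, 3]))  # 18
```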

Secondary question: if there is no better solution, are there any general guidelines or hints to recognize that fact?

xskxzr

4 Answers

1

This is easy with the technique of counting each element's individual contribution.

In any subarray, each distinct value contributes 1 to the answer; if a value occurs several times, only one of its occurrences should count.

So rather than counting the distinct elements of each subarray, we take each element and count the subarrays in which it contributes to the answer.

An element must be counted only once per subarray, so we need to fix which occurrence is counted. Counting only the first occurrence of each distinct value avoids overcounting.

For each element, we count the subarrays in which it is the first occurrence of its value, i.e. the subarrays that contain it and start after the previous occurrence of the same value.

Algorithm is as follows:

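  1. Store the last occurrence of each value; initially it is $-1$.
  2. Traverse the array from left to right.
  3. The contribution of the element at (0-based) index $i$ is $(i - last[arr[i]]) \times (n - i)$: there are $i - last[arr[i]]$ valid start positions and $n - i$ valid end positions. Add it to the answer.
  4. Update the last occurrence index of the current value to $i$.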

This runs in $O(N)$ time and $O(N)$ space (expected time, if the last-occurrence table is a hash table).
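The steps above can be sketched in a few lines of Python. This is a minimal sketch, assuming 0-based indices and hashable elements; the function name is illustrative.

```python
def sum_distinct_contrib(a):
    """Sum of distinct-value counts over all subarrays via per-element contribution."""
    n = len(a)
    last = {}  # value -> index of its most recent occurrence
    total = 0
    for i, v in enumerate(a):
        prev = last.get(v, -1)  # -1 if v has not been seen yet
        # (i - prev) choices of start, (n - i) choices of end
        total += (i - prev) * (n - i)
        last[v] = i
    return total


print(sum_distinct_contrib([1, 2, 1, 3]))  # 18
```

For $A = \{1, 2, 1, 3\}$ the four contributions are $4 + 6 + 4 + 4 = 18$.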

Here's an implementation along with a test generator in Python:

https://ideone.com/vNIJMG

0

Hint: Use an extra array of size $O(n)$ that stores, for each distinct value, the last (largest) index at which it has occurred so far. This array is very helpful when you run your algorithm above.

With it, you should end up with an $O(n)$ algorithm.

Thinh D. Nguyen
0

Let $a_1,\ldots,a_m$ be the distinct values in $A$. Now consider the positions of the $a_i$'s in $A$. Assume $a_i$ occurs $b_i$ times, and the occurrences are separated by runs of other elements as follows:

$$(x_{i0}\text{ non-}a_i\text{'s})\ a_i\ (x_{i1}\text{ non-}a_i\text{'s})\ a_i\ \cdots\ a_i\ (x_{ib_i}\text{ non-}a_i\text{'s})$$

In your example $A=\{1,2,1,3\}$, when considering the value $a_1=1$, we have $x_{10}=0,x_{11}=1,x_{12}=1$ because the $1$s are positioned like 1 * 1 *: there are $0$ elements before the first $1$, $1$ element between the first $1$ and the second $1$, and $1$ element after the second $1$.

Then there are \begin{align} &\sum_{j=0}^{b_i} \frac{x_{ij}(x_{ij}+1)}{2}\\ =\ &\frac{1}{2}\sum_{j=0}^{b_i}x_{ij}^2+\frac{1}{2}\sum_{j=0}^{b_i}x_{ij}\\ =\ &\frac{1}{2}\sum_{j=0}^{b_i}x_{ij}^2+\frac{1}{2}(n-b_i) \end{align} subarrays that do not contain $a_i$. Note there are $n(n+1)/2$ subarrays in total, so the final result we want is \begin{align} &\sum_{i=1}^m\left(\frac{n(n+1)}{2}-\frac{1}{2}\sum_{j=0}^{b_i}x_{ij}^2-\frac{1}{2}(n-b_i)\right)\\ =\ &\frac{1}{2}\left(mn^2+n-\sum_{i=1}^m\sum_{j=0}^{b_i}x_{ij}^2\right). \end{align}

To calculate $\sum_{i=1}^m\sum_{j=0}^{b_i}x_{ij}^2$, you can scan the array and maintain a lookup table that, for each distinct value, keeps the position of the last element with this value. With this table, when the $(j+1)$-th $a_i$ is scanned, you can compute $x_{ij}$ easily; the trailing gaps $x_{ib_i}$ follow from the table once the scan is finished. This leads to an $O(n\log n)$ solution (or $O(n)$ on average if you use a good hash table to implement the lookup table).
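A minimal sketch of this computation, assuming a Python `dict` as the lookup table and 0-based positions; the function and variable names are illustrative.

```python
def sum_distinct_gaps(a):
    """Evaluate (m*n^2 + n - sum of squared gap lengths) / 2."""
    n = len(a)
    last = {}     # value -> position of its most recent occurrence
    sq_sum = 0    # running sum of x_ij^2
    for i, v in enumerate(a):
        gap = i - last.get(v, -1) - 1  # non-v elements since the previous v
        sq_sum += gap * gap
        last[v] = i
    for pos in last.values():
        gap = n - 1 - pos              # trailing gap after the final occurrence
        sq_sum += gap * gap
    m = len(last)                      # number of distinct values
    return (m * n * n + n - sq_sum) // 2


print(sum_distinct_gaps([1, 2, 1, 3]))  # 18
```

For $A=\{1,2,1,3\}$ the squared gaps sum to $16$, so the result is $(3\cdot 16 + 4 - 16)/2 = 18$.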

xskxzr
0

If the maximum value $k$ in $A$ is not too large, a solution based on counting sort can be used.

In the first step, we count the occurrences of each value in an array $C$:

for i = 1 to k
    C[i] = 0
for j = 1 to length(A)
    C[A[j]] = C[A[j]] + 1

Now the array $C$ contains all the information we need for the sum:

sum = 0
for i = 1 to k
    if C[i] > 1
        sum = sum + C[i] * i

Complexity:

$\mathcal{O}(n)$ additions (increments) for the counting step

$\mathcal{O}(k)$ additions for the summation step

$\mathcal{O}(k)$ multiplications for the summation step

Result: $\mathcal{O}(n + k)$.

Note: the multiplications are not actually necessary, since they can be replaced by at most $n-1$ additions.

kelalaka