The problem is (integer) duplicate removal, which can also be viewed as computing the image of a function (over the integers) from its evaluations:
Given a sequence $S_\text{in}$ of $n$ integers, produce a sequence $S_\text{out}$ such that every element of $S_\text{in}$ appears in $S_\text{out}$, every element of $S_\text{out}$ appears in $S_\text{in}$, and the elements of $S_\text{out}$ are pairwise distinct.
Ignoring some details regarding the elements' size in bits, this can be done by sorting in $O(n \log n)$ time, or by hashing in expected $O(n)$ time.
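For concreteness, here is a minimal sketch of the two baselines I have in mind (Python; the function names and the choice of keeping the first occurrence are just for illustration):

```python
def dedup_sort(xs):
    """Comparison-based baseline: sort, then drop adjacent duplicates.
    Worst-case O(n log n)."""
    xs = sorted(xs)
    out = []
    for x in xs:
        if not out or out[-1] != x:
            out.append(x)
    return out

def dedup_hash(xs):
    """Hashing baseline: O(n) time, but only in expectation."""
    seen = set()
    out = []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out
```

For example, `dedup_hash([3, 1, 3, 2, 1])` returns `[3, 1, 2]`.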
Given that we don't require $S_\text{out}$ to be sorted, is it possible to beat the $O(n \log n)$ sorting bound in the worst case?
Notes:
- A bound that depends on the output size (call it $m$) rather than only on $n$ would already be an improvement, but I'm mostly interested in getting closer to linear time in $n$.
- The bit-size details I've ignored may not be so insignificant.
- Algorithms need not be restricted to algebraic computation; i.e., you can tear into the bit representation if it helps (see the sketch after these notes).
- We cannot make any assumptions regarding the input distribution.
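To illustrate the third note, this is the kind of bit-level, non-algebraic approach I mean: a hedged sketch assuming non-negative fixed-width $w$-bit integers, using LSD radix sort (base $2^r$) followed by a scan that drops adjacent duplicates. The sort phase only extracts bits; only the final scan tests equality of adjacent elements. This is not obviously better than $O(n \log n)$ unless $w$ is small relative to $n$.

```python
def dedup_radix(xs, w=64, r=8):
    """Deduplicate non-negative w-bit integers via LSD radix sort
    with radix 2**r, then keep one element per run of equal values.
    Time: O((w/r) * (n + 2**r))."""
    mask = (1 << r) - 1
    for shift in range(0, w, r):
        # Stable bucketing pass on bits [shift, shift + r).
        buckets = [[] for _ in range(1 << r)]
        for x in xs:
            buckets[(x >> shift) & mask].append(x)
        xs = [x for b in buckets for x in b]
    # After sorting, duplicates are adjacent.
    out = []
    for x in xs:
        if not out or out[-1] != x:
            out.append(x)
    return out
```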