- Is it necessary to have 50 000 different reduction functions to avoid collisions?
Yes. If you do not have different functions it is not a rainbow table, just a table of hash chains, which is less efficient, especially for high-coverage tables. However, the functions do not have to be unrelated, just different.
- How do I create those reduction functions (rather than just taking the first n letters of the hash, for instance)?
Any way you want, as long as it leads to an even distribution over the search set and the functions differ from one another. The simplest approach is to:

1. Use the size of the search set (of passwords, keys or whatever you are targeting) and take the hash value modulo that size, i.e. just the last bits/bytes if the set size is a power of two.
2. To make the reduction functions differ, add the index of the function to the value (again modulo the set size): add nothing for the first function, 1 for the second, and so on. This is sufficient, because the next hashing step turns these small changes into completely different outputs again.
3. Finally, turn the value into an element of the search set, e.g. encode it as lowercase letters if that is what you are looking for.
The simplest case is a binary key. Say you are building a rainbow table for DES, which has 56-bit keys (ignoring parity) and 64-bit output.
You can use the last 56 bits of the output (i.e. the whole output modulo $2^{56}$) and add $i$ (again modulo $2^{56}$) for the $i$th reduction function. Pseudocode:
def R(i, x):  # i'th reduction function: 64-bit cipher output -> 56-bit key
    return (x + i) % (2**56)
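As a quick sanity check, here is the same function applied to an arbitrary 64-bit value (the constant below is made up for illustration, not taken from a real DES computation):

x = 0x0123456789ABCDEF      # an arbitrary 64-bit cipher output
print(hex(R(0, x)))         # 0x23456789abcdef: just the low 56 bits
print(hex(R(1, x)))         # 0x23456789abcdf0: a different 56-bit key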
In the case of passwords, suppose you build a table for eight-letter lowercase passwords hashed with MD5. There are $26^8$ such passwords, so you can take the hash value, interpreted as a 128-bit integer, and compute (x + i) % (26**8) instead.
(Or you could take the latter half of the hash as a 64-bit integer, then do the same.)
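For concreteness, here is a minimal Python sketch of that password reduction, assuming the search set is exactly the $26^8$ eight-letter lowercase strings; the base-26 encoding at the end is one arbitrary but evenly distributed way to do the final "turn the value into an element of the search set" step, and the name reduce_md5 is purely illustrative:

import hashlib
import string

CHARSET = string.ascii_lowercase   # the 26 lowercase letters
SET_SIZE = 26 ** 8                 # number of eight-letter lowercase passwords

def reduce_md5(i, digest):
    # i-th reduction function: 16-byte MD5 digest -> eight-letter password
    x = int.from_bytes(digest, "big")   # interpret the hash as a 128-bit integer
    n = (x + i) % SET_SIZE              # offset by the function index, then reduce
    letters = []
    for _ in range(8):                  # encode n in base 26
        n, r = divmod(n, 26)
        letters.append(CHARSET[r])
    return "".join(letters)

# One chain step: hash a password, then reduce with function i to get the next one.
digest = hashlib.md5(b"aaaaaaaa").digest()
print(reduce_md5(0, digest), reduce_md5(1, digest))   # two different successors

Since $2^{128}$ is not a multiple of $26^8$, the modulo is very slightly biased, but with a 128-bit hash the bias is negligible.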