Given a string $S$ of length $n$ characters, is it possible to calculate the hash of its substring $[i, j]$ (from index $i$ to $j$, both inclusive) in $O(1)$ using some form of precomputation? Can we use a modification of the rolling hash?
The following problem is similar to my question. In it, the string was given in compressed (run-length encoded) form. For example, the string "aaabccdeeee" has the compressed form:
3 a
1 b
2 c
1 d
4 e
The data was stored in an array str[] as:
str[] = [{'a','3'}, {'b','1'}, {'c','2'}, ...]
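For concreteness, the compressed form above could be produced by a small run-length encoder along these lines. This is only a sketch: the function name `compress` and the use of `std::pair<char, int>` are illustrative choices, and the count is kept as an `int` rather than as a character as in the `str[]` array above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: run-length encode s into (character, run length) pairs.
std::vector<std::pair<char, int>> compress(const std::string& s) {
    std::vector<std::pair<char, int>> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().first == c) {
            ++runs.back().second;    // same character: extend the current run
        } else {
            runs.push_back({c, 1});  // different character: start a new run
        }
    }
    return runs;
}
```

Calling `compress("aaabccdeeee")` yields the five pairs listed above, from `('a', 3)` to `('e', 4)`.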
Note that hashing was used in the solutions, which checked whether a given substring is a palindrome. Given a substring $(i, j)$ of string $S$, they computed the hash of substring $[i, (i + j) / 2]$ and the reverse hash of substring $[(i+j+2)/2, j]$ and checked whether the two were equal. So to check whether substring $[2, 5]$ (1-indexed, i.e. "aaba") of S = "daabac" is a palindrome, they computed the following:
h1 = forward_hash("aa")
h2 = reverse_hash("ba")
h1 == h2
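On an uncompressed string, the check described above might be sketched as follows. The struct `PalindromeHasher`, its member names, and the base `B` are illustrative and not taken from the original solutions. Hashes are polynomial hashes whose arithmetic wraps modulo $2^{64}$, positions are 1-indexed, and the middle character of an odd-length substring is skipped, which is an assumption about how the two halves are chosen.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Sketch: forward and reverse prefix hashes of s (illustrative names throughout).
struct PalindromeHasher {
    static constexpr std::uint64_t B = 1000003;  // arbitrary odd base (assumption)
    int n;
    std::vector<std::uint64_t> fwd, rev, pw;     // pw[k] = B^k

    explicit PalindromeHasher(const std::string& s)
        : n(static_cast<int>(s.size())), fwd(n + 1, 0), rev(n + 1, 0), pw(n + 1, 1) {
        for (int k = 1; k <= n; ++k) {
            fwd[k] = fwd[k - 1] * B + static_cast<unsigned char>(s[k - 1]);  // s left to right
            rev[k] = rev[k - 1] * B + static_cast<unsigned char>(s[n - k]);  // s right to left
            pw[k] = pw[k - 1] * B;
        }
    }

    // Hash of s[i..j] (1-indexed, inclusive) read left to right, in O(1).
    std::uint64_t forward_hash(int i, int j) const {
        return fwd[j] - fwd[i - 1] * pw[j - i + 1];
    }

    // Hash of s[i..j] read right to left, in O(1).
    std::uint64_t reverse_hash(int i, int j) const {
        // Position p of s is position n - p + 1 of the reversed string.
        int a = n - j + 1, b = n - i + 1;
        return rev[b] - rev[a - 1] * pw[b - a + 1];
    }

    // Compare the left half with the reversed right half.
    bool is_palindrome(int i, int j) const {
        int half = (j - i + 1) / 2;
        return forward_hash(i, i + half - 1) == reverse_hash(j - half + 1, j);
    }
};
```

With `PalindromeHasher h("daabac")`, the call `h.is_palindrome(2, 5)` compares the forward hash of "aa" with the reverse hash of "ba", exactly as in the example above. Equal hashes only indicate a palindrome with high probability, since distinct strings can collide.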
Code for the hashing concept
The hash precomputation was done as follows:
/* Computing the Prefix Hash Table (str is 1-indexed here) */
pre_hash[0] = 0;
for (int i = 1; i <= len(str); i++)
{
    /* Each pair contributes two symbols: the character, then its count. */
    pre_hash[i] = pre_hash[i - 1] * very_large_prime + str[i].first;
    pre_hash[i] = pre_hash[i] * very_large_prime + str[i].second;
}
/* Computing the Suffix Hash Table: the same hash over str read from the end.
   K is not defined in this snippet; presumably K = len(str). */
suff_hash[0] = 0;
for (int i = 1; i <= len(str); i++)
{
    suff_hash[i] = suff_hash[i - 1] * very_large_prime + str[K - i + 1].first;
    suff_hash[i] = suff_hash[i] * very_large_prime + str[K - i + 1].second;
}
The substring hashes were then computed using the following functions:
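/* Calculates the Forward hash of substring [i,j] */
unsigned long long CalculateHash(int i, int j)
{
    if (i > j)
        return -1; /* empty range: -1 converts to the largest unsigned long long value */
    /* Each pair adds two symbols, hence the exponent 2 * (j - i + 1). */
    unsigned long long ret = pre_hash[j] - POW(very_large_prime, 2 * (j - i + 1)) * pre_hash[i - 1];
    return ret;
}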
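/* Calculates the reverse hash of substring [i,j].
   suff_hash is built from the end of str, so as written i and j are positions
   counted from the end (position p in str corresponds to K - p + 1). */
unsigned long long CalculateHash_Reverse(int i, int j)
{
    unsigned long long ret = suff_hash[j] - POW(very_large_prime, 2 * (j - i + 1)) * suff_hash[i - 1];
    return ret;
}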
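To see why the subtraction in `CalculateHash` isolates the substring, it may help to unroll the recurrence. The notation here is introduced only for this derivation: $p$ stands for `very_large_prime`, and $c_1, c_2, \dots$ is the flattened sequence `str[1].first, str[1].second, str[2].first, ...`, so that pair $k$ occupies positions $2k-1$ and $2k$. Unrolling the precomputation loop gives

$$\mathrm{pre\_hash}[j] = \sum_{t=1}^{2j} c_t\, p^{2j-t},$$

and therefore

$$\mathrm{pre\_hash}[j] - p^{2(j-i+1)}\,\mathrm{pre\_hash}[i-1] = \sum_{t=1}^{2j} c_t\, p^{2j-t} - \sum_{t=1}^{2i-2} c_t\, p^{2j-t} = \sum_{t=2i-1}^{2j} c_t\, p^{2j-t}.$$

The right-hand side involves only pairs $i$ through $j$, so it is the value the loop would produce if run on those pairs alone. Assuming `pre_hash` holds `unsigned long long` values, all of this arithmetic is understood to wrap around on overflow, which does not affect the identity.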
What I am trying to do
I am looking for a general approach to the above concept. Given a pattern $P$ and a string $S$, I want to check whether $P$ occurs in $S$ at a known index $i$. I also know the length of the pattern, $|P|$. In short, I want to check in $O(1)$, using some form of precomputation on $S$, whether the hash of $S[i, i + |P| - 1]$ equals the hash of $P$. (This ignores the time taken to compute the hash of $P$, which is $O(|P|)$.)
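The kind of check described here might look like the following sketch, which adapts the prefix-hash idea from the compressed problem to a plain string. Everything named in it (`PrefixHash`, `full_hash`, `hash_matches_at`, and the constant `BASE`) is hypothetical. It assumes 1-indexed positions and a polynomial hash whose arithmetic wraps modulo $2^{64}$.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Arbitrary odd base chosen for illustration (assumption).
const std::uint64_t BASE = 1000003;

// Sketch: O(n) precomputation on S so that any substring hash costs O(1).
struct PrefixHash {
    std::vector<std::uint64_t> pre, pw;  // pre[k] = hash of first k chars, pw[k] = BASE^k

    explicit PrefixHash(const std::string& s) : pre(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (std::size_t k = 1; k <= s.size(); ++k) {
            pre[k] = pre[k - 1] * BASE + static_cast<unsigned char>(s[k - 1]);
            pw[k] = pw[k - 1] * BASE;
        }
    }

    // Hash of S[i..j], 1-indexed and inclusive, in O(1).
    std::uint64_t substring_hash(std::size_t i, std::size_t j) const {
        return pre[j] - pre[i - 1] * pw[j - i + 1];
    }
};

// Hash of the whole pattern, computed once in O(|P|).
std::uint64_t full_hash(const std::string& p) {
    std::uint64_t h = 0;
    for (unsigned char c : p) h = h * BASE + c;
    return h;
}

// True if the hash of S[i, i + |P| - 1] equals the hash of P.
// Equal hashes suggest, but do not strictly prove, that the strings are equal.
bool hash_matches_at(const PrefixHash& hs, std::size_t i,
                     std::uint64_t hash_p, std::size_t len_p) {
    std::size_t n = hs.pre.size() - 1;
    if (i < 1 || len_p == 0 || i + len_p - 1 > n) return false;  // window out of range
    return hs.substring_hash(i, i + len_p - 1) == hash_p;
}
```

For S = "daabac" and P = "aab", `hash_matches_at(PrefixHash(S), 2, full_hash(P), 3)` compares the hash of S[2, 4] with the hash of P. The powers of `BASE` are tabulated in `pw` so that no exponentiation is needed at query time.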