I have a table with header fields $P, Q, R, S$ (shown in blue in Table 1). Column $T$ is populated from the header values using some predefined logic.
Now, any of the headers $P, Q, R, S$ may be wildcarded to match all values. The aim is to compress the table using wildcards so that column $T$ is not affected.
As shown in Table 2, Table 3 and Table 4, choosing a combination of fields as key (shown in orange) and wildcarding the rest can be used to compress the table.
However, the choice of key (combination of header fields which are not wildcarded) affects the size of the compressed table.
Given a table, my aim is to automatically detect which combination of headers yields the smallest compressed table while not affecting column $T$.
So this problem can be split into two parts:
1. Given that the key must contain exactly $k$ header fields, which combination yields the smallest table?
2. What is the optimal value of $k$?
Note: In case of a conflict arising in column $T$ from using wildcards, a non-wildcarded row must be placed above the wildcarded row. (E.g. see Table 4, rows 3 and 4.)
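For concreteness, here is a minimal Python sketch of the brute-force baseline. It rests on an interpretation that the tables do not fix: for a given key, rows are grouped by their key values, each group costs one wildcarded row carrying its most common $T$ value, and every row in the group with a different $T$ costs one extra non-wildcarded row placed above it. The names `compressed_size` and `best_key` and the sample `rows` are hypothetical and are not taken from Tables 1 to 4.

```python
from collections import Counter, defaultdict
from itertools import combinations


def compressed_size(rows, key):
    """Size of the compressed table for one choice of key.

    rows: list of tuples (P, Q, R, S, T), assumed unique on the header part.
    key:  tuple of header indices that are kept (not wildcarded).
    Assumed cost model: one wildcarded row per group of equal key values,
    plus one non-wildcarded exception row per row whose T differs from the
    group's most common T.
    """
    groups = defaultdict(Counter)
    for *headers, t in rows:
        groups[tuple(headers[i] for i in key)][t] += 1

    size = 0
    for t_counts in groups.values():
        total = sum(t_counts.values())
        majority = max(t_counts.values())
        size += 1 + (total - majority)
    return size


def best_key(rows, n_headers):
    """Try every subset of headers (all 2**n_headers keys) and keep the smallest."""
    best = None
    for k in range(n_headers + 1):            # part 2: every key length k
        for key in combinations(range(n_headers), k):  # part 1: every key of length k
            size = compressed_size(rows, key)
            if best is None or size < best[1]:
                best = (key, size)
    return best


# Hypothetical sample table: columns are P, Q, R, S, T.
rows = [
    (0, 0, 0, 0, "a"),
    (0, 0, 0, 1, "a"),
    (0, 1, 0, 0, "a"),
    (0, 1, 0, 1, "a"),
    (1, 0, 0, 0, "b"),
    (1, 0, 0, 1, "b"),
    (1, 1, 0, 0, "b"),
    (1, 1, 0, 1, "c"),
]

print(best_key(rows, 4))  # ((0,), 3): key P alone, with one exception row
```

With $n$ headers this evaluates $2^n$ keys, each in time linear in the number of rows, which is fine for four headers but is the exponential cost I would like to avoid for larger $n$.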
What are some good approaches to solving this problem, other than iterating over all combinations and choosing the best? Any help in this regard is appreciated.