I have the following optimization problem:
I have a random, very noisy objective function $f(A, P)$, where $A$ is a vector of "observable" parameters of the input and $P$ is a vector of parameters that I can control.
I'd like to find $P(A)$ for every $A$, such that $f(A, P(A))$ is maximized.
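Written out, and assuming that "maximized" means maximized in expectation (my reading, since $f$ is random), the target is:

$$P^*(A) = \arg\max_{P} \; \mathbb{E}\left[f(A, P)\right]$$

Here $P^*$ is just notation for the mapping I am looking for, and the expectation is taken over the noise in $f$.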
For example, I'm writing a game; I know a bunch of facts about every user (their age, gender, level, various in-game statistics), and I can control the difficulty of monsters they face and the scarcity of resources they find on their journey. The objective function $f$ is, for example, how much money they spend.
I think a reasonable approach to this would be to:
1) cluster the users by $A$ in some way, such that within a single cluster, the shape of $f$ with respect to $P$ is approximately the same;
2) run some guided optimization/experimentation algorithm within every cluster, taking advantage of the fact that a single user's data point gives me information about the whole cluster, but avoiding meaningless comparisons between dissimilar users.
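To make the two steps concrete, here is a minimal Python sketch under strong assumptions. The cluster centres, the finite grid of candidate $P$ values, the quadratic shape of the simulated objective, and the choice of epsilon-greedy as the "guided experimentation" algorithm are all hypothetical stand-ins picked for illustration. In particular, the sketch assumes the clusters are already known, which is exactly the part I don't know how to do.

```python
import random


def nearest_cluster(a, centroids):
    """Step 1 (assumed solved): index of the centroid closest to a."""
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return min(range(len(centroids)), key=lambda k: sq_dist(a, centroids[k]))


class EpsilonGreedy:
    """Step 2: epsilon-greedy bandit over a finite set of candidate P values."""

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random()
        self.counts = [0] * len(self.arms)
        self.means = [0.0] * len(self.arms)

    def choose(self):
        # Try every arm once, then explore with probability epsilon.
        for i, count in enumerate(self.counts):
            if count == 0:
                return i
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))
        return max(range(len(self.arms)), key=lambda i: self.means[i])

    def update(self, i, reward):
        # Incremental running mean of the rewards observed for arm i.
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]

    def best(self):
        i = max(range(len(self.arms)), key=lambda j: self.means[j])
        return self.arms[i]


def simulate(n_users=4000, seed=0, noise_sd=1.0):
    """Simulate users from two hypothetical clusters; return the P found for each."""
    rng = random.Random(seed)
    centroids = [(0.0,), (10.0,)]      # hypothetical cluster centres in A space
    true_best = [1.0, 3.0]             # hypothetical optimal P for each cluster
    arms = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical candidate P values
    bandits = [EpsilonGreedy(arms, 0.1, rng) for _ in centroids]
    for _ in range(n_users):
        a = (rng.choice([0.0, 10.0]) + rng.gauss(0, 1),)
        k = nearest_cluster(a, centroids)
        i = bandits[k].choose()
        # Made-up noisy objective, peaked at the cluster's optimal P.
        reward = -(arms[i] - true_best[k]) ** 2 + rng.gauss(0, noise_sd)
        bandits[k].update(i, reward)
    return [b.best() for b in bandits]


if __name__ == "__main__":
    print(simulate())
```

Each user's outcome updates only the bandit of their own cluster, so data is shared within a cluster but never compared across dissimilar users. With well-separated clusters like these, the sketch tends to recover the planted optimum for each cluster.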
The problem here is that I cannot use a conventional clustering method "as is" to cluster by $A$, because I don't know which dimensions of $A$ are important, and perhaps some dimensions are important only in some regions of the $A$ space. So it's not clear how to formulate the distance function $D(A, A')$ between two users.
Is this indeed a reasonable approach? Does it have a name? What existing research is there in this area?