
I am using three different off-the-shelf classifiers on a three-class classification task. I want to calculate the optimal weights (c1weight, c2weight, c3weight) for each classifier (in the real task there are more classifiers, and also weights for each class).

Maybe a simple grid search approach or an sklearn ensemble classifier could do that.

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Soft-voting ensemble over the three classifiers
vc = VotingClassifier(estimators=[('gbc', GradientBoostingClassifier()),
                                  ('rf', RandomForestClassifier()),
                                  ('svc', SVC(probability=True))],
                      voting='soft', n_jobs=-1)

# Each candidate is one weight per classifier
params = {'weights': [[1, 2, 3], [2, 1, 3], [3, 2, 1]]}
grid_search = GridSearchCV(estimator=vc, param_grid=params)
grid_search.fit(X_new, y)
print(grid_search.best_score_)

I don't understand how to implement this for the following code.

def get_classification(text, c1weight, c2weight, c3weight):
    # classifier1/2/3 are assumed to be already-fitted models;
    # accumulate one weighted vote per classifier for each class
    class1 = class2 = class3 = 0

    prediction1 = classifier1.predict(text)
    if prediction1 == 1:
        class1 += c1weight
    elif prediction1 == 2:
        class2 += c1weight
    else:
        class3 += c1weight

    prediction2 = classifier2.predict(text)
    if prediction2 == 1:
        class1 += c2weight
    elif prediction2 == 2:
        class2 += c2weight
    else:
        class3 += c2weight

    prediction3 = classifier3.predict(text)
    if prediction3 == 1:
        class1 += c3weight
    elif prediction3 == 2:
        class2 += c3weight
    else:
        class3 += c3weight

    # Return the label with the highest weighted vote
    if class1 > class2 and class1 > class3:
        return ("class1", class1)
    elif class2 > class1 and class2 > class3:
        return ("class2", class2)
    else:
        return ("class3", class3)

c1weight = 0.5
c2weight = 0.7
c3weight = 0.4

for i, row in df_raw.iterrows():
    classification = get_classification(df_raw.at[i, 'text'], c1weight, c2weight, c3weight)
    df_raw.at[i, 'classification'] = classification

score = get_accuracy(df_raw['classification'], df_raw['label'])


2 Answers


GridSearchCV finds those optimal weights for you.

You can access these weights through the best_params_ attribute of the fitted GridSearchCV object, which returns all of the optimal parameters (including the weights):

optimal_weights = grid_search.best_params_
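
If you want to try more combinations than the few hand-picked lists in the question, you can generate the weights grid programmatically. A minimal sketch, where the candidate weight values are placeholder assumptions and vc, X_new and y come from the question:

import itertools
from sklearn.model_selection import GridSearchCV

# Every combination of candidate weights for the three classifiers (3^3 = 27)
candidate_weights = [1, 2, 3]
params = {'weights': [list(w) for w in itertools.product(candidate_weights, repeat=3)]}

grid_search = GridSearchCV(estimator=vc, param_grid=params)
grid_search.fit(X_new, y)
print(grid_search.best_params_['weights'])  # the best-scoring weight triple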

(This is the asker's solution, moved from question and comment to answer)


This sample code helped me to understand it:

from sklearn.model_selection import ParameterGrid

def your_function(number):
    print(number)

param_grid = {'param1': [1, 2, 3]}
grid = ParameterGrid(param_grid)

for params in grid:
    your_function(params['param1'])
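
Applied to the weight search from the question, the same pattern could look like the following sketch. It reuses the get_classification and get_accuracy helpers from the question, and the candidate weight values are just placeholder assumptions:

from sklearn.model_selection import ParameterGrid

param_grid = {'c1weight': [0.25, 0.5, 0.75, 1.0],
              'c2weight': [0.25, 0.5, 0.75, 1.0],
              'c3weight': [0.25, 0.5, 0.75, 1.0]}

best_score, best_params = 0, None
for params in ParameterGrid(param_grid):
    for i, row in df_raw.iterrows():
        # get_classification returns (label, vote total); keep only the label
        label, _ = get_classification(df_raw.at[i, 'text'],
                                      params['c1weight'], params['c2weight'], params['c3weight'])
        df_raw.at[i, 'classification'] = label
    score = get_accuracy(df_raw['classification'], df_raw['label'])
    if best_params is None or score > best_score:
        best_score, best_params = score, params

print(best_score, best_params)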

I had too many parameters for grid search; in my case it would have taken months to calculate all combinations. In the end I used hyperopt for the hyperparameter optimization. There are some nice basic tutorials out there; this one helped me a lot, and you can also find a Python notebook there: https://towardsdatascience.com/an-introductory-example-of-bayesian-optimization-in-python-with-hyperopt-aae40fff4ff0
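
For reference, a minimal sketch of what that could look like with hyperopt. The search-space bounds and max_evals are arbitrary assumptions, and the objective again reuses the question's get_classification and get_accuracy helpers; since hyperopt minimizes, the objective returns 1 minus the accuracy:

from hyperopt import fmin, tpe, hp

# One continuous weight per classifier (bounds are an assumption)
space = {'c1weight': hp.uniform('c1weight', 0, 1),
         'c2weight': hp.uniform('c2weight', 0, 1),
         'c3weight': hp.uniform('c3weight', 0, 1)}

def objective(params):
    for i, row in df_raw.iterrows():
        label, _ = get_classification(df_raw.at[i, 'text'],
                                      params['c1weight'], params['c2weight'], params['c3weight'])
        df_raw.at[i, 'classification'] = label
    # hyperopt minimizes, so turn accuracy into a loss
    return 1 - get_accuracy(df_raw['classification'], df_raw['label'])

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100)
print(best)  # dict of the best weights found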
