
Suppose that in a binary classification task I have three separate classifiers A, B, and C. If I use A alone, I get high precision but low recall. In other words, the positives it predicts are almost always correct, but it incorrectly tags many of the remaining positives as False. B and C have much lower precision, but used separately they may (or may not) achieve better recall. How can I define an ensemble classifier that gives precedence to classifier A in cases where it labels the data as True, and gives more weight to the predictions of the other classifiers when A predicts False?

The idea is that A already outperforms the others at catching true positives, and I only want to improve recall without hurting precision.

3 Answers


Feature-Weighted Linear Stacking might be what you are looking for.

FWLS combines model predictions linearly using coefficients that are themselves linear functions of meta-features.

In your example you can use the meta-feature "Does A label the example as True?"
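A minimal sketch of the idea (the coefficient matrix `v`, the helper name `fwls_predict`, and the 0.5 threshold are illustrative assumptions; in FWLS the coefficients are learned, e.g. by ridge regression on held-out base-model predictions):

```python
import numpy as np

def fwls_predict(p_a, p_b, p_c, v):
    """Blend three probability vectors with coefficients that are
    linear functions of one meta-feature: "does A predict True?"."""
    f = (p_a > 0.5).astype(float)         # meta-feature, 0 or 1 per example
    preds = np.stack([p_a, p_b, p_c])     # shape (3, n)
    weights = v[:, [0]] + v[:, [1]] * f   # model i's weight: v[i,0] + v[i,1]*f
    return (weights * preds).sum(axis=0)  # blended probability per example
```

For instance, `v = [[0.2, 0.4], [0.4, -0.2], [0.4, -0.2]]` gives weights (0.6, 0.2, 0.2) on examples where A says True and (0.2, 0.4, 0.4) elsewhere, which is exactly the precedence scheme asked about.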

Imran

Based on your description, it looks like your models have different biases, which is exactly when a stacking-based classifier is beneficial. Two important questions first: do you have a class-imbalance problem, and what kinds of models are you using? Try a simple stacking classifier: for the base (level-1) classifiers, use different models (e.g. SVM-L, SVM-NL, DT, RF, etc.); for the meta-data, use the predicted probabilities; and for the meta-classifier, use a random forest.

If you have a class-imbalance problem, using a stacking-based classifier is a little more challenging.
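Assuming scikit-learn (0.22+), a stacking setup along these lines might look like the sketch below; the synthetic dataset and the particular base models are placeholders, not a recommendation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Diverse base (level-1) models, as suggested above.
base = [
    ("svm_linear", SVC(kernel="linear", probability=True, random_state=0)),
    ("svm_rbf", SVC(kernel="rbf", probability=True, random_state=0)),
    ("tree", DecisionTreeClassifier(random_state=0)),
]

stack = StackingClassifier(
    estimators=base,
    final_estimator=RandomForestClassifier(random_state=0),
    stack_method="predict_proba",  # meta-features are the probabilities
)
stack.fit(X_tr, y_tr)
print(stack.score(X_te, y_te))
```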

Bashar Haddad

> How can I define an ensemble classifier that gives precedence to classifier A in cases where it labels the data as True, and gives more weight to the predictions of the other classifiers when A predicts False?

Since your weighting depends on A's prediction, not on the true label, the best and simplest approach is to do it manually:
1. Train the 3 models
2. Predict using all 3
3. Call a function to adjust the classifier weights
4. Call a weighted predict

### Pseudo-code

def cust_wt(p_a, p_b, p_c):
    if p_a > THRESHOLD:
        weights = [0.6, 0.2, 0.2]
    else:
        weights = [0.2, 0.4, 0.4]

    return my_actual_predict(weights)  # call the voting classifier


You can use sklearn's VotingClassifier with different weights.
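One caveat: `VotingClassifier` takes a single fixed `weights` vector for the whole dataset, so the per-sample switch in the pseudo-code above has to be done by hand on the predicted probabilities. A sketch, with illustrative thresholds and weight values:

```python
import numpy as np

def weighted_predict(p_a, p_b, p_c, threshold=0.5):
    """Soft vote with per-sample weights: trust A when it predicts
    True, shift weight to B and C when it does not."""
    probs = np.stack([p_a, p_b, p_c])            # shape (3, n)
    w_true = np.array([0.6, 0.2, 0.2])[:, None]  # A says True
    w_false = np.array([0.2, 0.4, 0.4])[:, None] # A says False
    weights = np.where(p_a > threshold, w_true, w_false)
    return ((weights * probs).sum(axis=0) > 0.5).astype(int)
```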

10xAI