I generated a dataset with a 95:5 class imbalance, trained a RandomForestClassifier model, and calculated the AUPRC, AUC-ROC, and Average Precision (AP) scores for the binary classification task:

My observations show that $\text{AP}$ is always slightly greater than $\text{AUPRC}$: $$\text{AUPRC} < \text{AP} \ll \text{AUC-ROC}$$

There could be cases where $\text{AUPRC} \ge \text{AP}$, which might be explained by the interpolation method, but scikit-learn doesn't explicitly document the interpolation used in the precision_recall_curve and average_precision_score functions.
The scikit-learn documentation states that:
The average_precision_score function calculates the area under the precision-recall curve using the trapezoidal rule.
However, it doesn't explicitly mention the interpolation method used for precision-recall points. ref.
precision_recall_curve computes precision-recall pairs for different probability thresholds and uses linear interpolation to estimate precision values at different recall levels.
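Before the full experiment, a small sketch on hand-made labels and scores (made up purely for illustration) shows what the functions actually return: the raw precision-recall points from precision_recall_curve, and the two summary numbers computed from them side by side.

import numpy as np
from sklearn.metrics import precision_recall_curve, auc, average_precision_score

# Toy labels and scores, invented only for illustration
y_true = np.array([0, 0, 1, 0, 0, 1, 0, 0, 1, 0])
y_score = np.array([0.10, 0.20, 0.80, 0.30, 0.40, 0.55, 0.45, 0.35, 0.70, 0.60])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print(np.c_[recall, precision])                                   # raw (recall, precision) points at each threshold
print('auc(recall, precision) :', auc(recall, precision))         # trapezoidal area over these points
print('average_precision_score:', average_precision_score(y_true, y_score))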
Python code:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve, auc, average_precision_score, roc_curve, roc_auc_score
# Generate imbalanced data with labels for the positive (1) and negative (0) classes
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2,
                           weights=[0.95, 0.05], random_state=42)

# Count the number of observations in each class
num_positive_class = np.sum(y == 1)
num_negative_class = np.sum(y == 0)

# Scatter plot of the first two features, coloured by class
plt.figure(figsize=(8, 8))
plt.scatter(X[y == 1, 0], X[y == 1, 1], c='blue', edgecolors='k', label=f'Positive Class (1): {num_positive_class}')
plt.scatter(X[y == 0, 0], X[y == 0, 1], c='red', edgecolors='k', label=f'Negative Class (0): {num_negative_class}')
plt.title('Binary Class Distribution')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend(loc='best')
plt.show()

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Random Forest classifier
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict probabilities for the positive class on the test set
y_proba = model.predict_proba(X_test)[:, 1]

# Calculate the precision-recall curve
precision, recall, _ = precision_recall_curve(y_test, y_proba)

# Calculate AUPRC as the area under the precision-recall points
auprc = auc(recall, precision)

# Calculate the ROC curve
fpr, tpr, _ = roc_curve(y_test, y_proba)

# Calculate AUC-ROC
roc_auc = roc_auc_score(y_test, y_proba)

# Calculate Average Precision (AP) for the Random Forest model
ap_random_forest = average_precision_score(y_test, y_proba)

# Calculate chance-level AP (the prevalence of the positive class in the test set)
ap_chance_level = np.sum(y_test) / len(y_test)

# Plot the precision-recall curve, the ROC curve, and the three scores in a 1x3 grid
plt.figure(figsize=(18, 6))

# Precision-recall curve
plt.subplot(1, 3, 1)
plt.plot(recall, precision, label=f'Random Forest Model (AP={ap_random_forest:.4f})', color='orange')
plt.xlabel('Recall (Positive class: 1)')
plt.ylabel('Precision (Positive class: 1)')
plt.title('Precision-Recall Curve')
plt.axhline(y=ap_chance_level, color='red', linestyle='--', label=f'Chance Level (AP={ap_chance_level:.4f})')
plt.legend(loc='best')

# ROC curve
plt.subplot(1, 3, 2)
plt.plot(fpr, tpr, label=f'AUC-ROC = {roc_auc:.4f}', color='green')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc='best')

# Bar chart of AUPRC, AUC-ROC, and AP
plt.subplot(1, 3, 3)
bars = plt.bar([0, 1, 2], [auprc, roc_auc, ap_random_forest], tick_label=['AUPRC', 'AUC-ROC', 'AP'], color=['orange', 'green', 'blue'])
plt.ylim(0, 1)
plt.title('AUPRC, AUC-ROC, and Average Precision (AP)')

# Annotate the value on top of each bar with a larger font size
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval, round(yval, 4), ha='center', va='bottom', fontsize=12)

plt.tight_layout()
plt.show()
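To probe where the small gap might come from, I also recomputed two candidate summaries from the same precision and recall arrays produced above: the trapezoidal area over the points, and a step-wise sum of precisions weighted by the increase in recall. Which of these (if either) average_precision_score uses internally is exactly what I am unsure about; this is only a sanity check:

# Sanity check on the arrays from the script above (two candidate aggregations)
auprc_trapezoid = auc(recall, precision)                   # trapezoidal rule over the PR points
ap_stepwise = -np.sum(np.diff(recall) * precision[:-1])    # precisions weighted by recall increments
print('trapezoidal area       :', auprc_trapezoid)
print('step-wise weighted sum :', ap_stepwise)
print('average_precision_score:', ap_random_forest)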
I could not reason about this any further, but I also tried LogisticRegression and varied the imbalance ratio; the results didn't change, and $\text{AUPRC} < \text{AP}$ still held (slightly).
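For reference, the LogisticRegression variant looked roughly like this (the 90:10 weights and the max_iter value are illustrative placeholders rather than my exact settings), reusing the imports from the script above:

from sklearn.linear_model import LogisticRegression

# Same experiment with a different model and imbalance ratio (illustrative settings)
X2, y2 = make_classification(n_samples=1000, n_features=20, n_classes=2,
                             weights=[0.90, 0.10], random_state=42)
X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y2, test_size=0.2, random_state=42)

log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X2_train, y2_train)
y2_proba = log_reg.predict_proba(X2_test)[:, 1]

p2, r2, _ = precision_recall_curve(y2_test, y2_proba)
print('AUPRC  :', auc(r2, p2))
print('AP     :', average_precision_score(y2_test, y2_proba))
print('AUC-ROC:', roc_auc_score(y2_test, y2_proba))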