I have an ML model (trained with scikit-learn). I have built a Flask web service around it and hosted it on a Windows IIS server.

What is the best practice for loading the model? Should I load it once when the API starts, or load it each time a request comes in?

Case 1

    from flask import Flask
    import joblib

    app = Flask(__name__)

    # load the models once, at startup
    MODELS = joblib.load(model_file)

    # endpoints
    @app.route("/predictions", methods=["GET", "POST"])
    def predictions():
        pass  # some code

Case 2

    from flask import Flask
    import joblib

    app = Flask(__name__)

    # endpoints
    @app.route("/predictions", methods=["GET", "POST"])
    def predictions():
        # load the model on every request
        model = joblib.load(model_file)

Sociopath

1 Answer

If you load the model every time a request comes in, you add the cost of reading and deserializing the model file to the latency of each request. You usually want to minimize request latency, so it is better to load the model once at startup (Case 1) and reuse it when you fulfil each request. This approach also avoids repeating the same loading work for every request.
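To make the difference concrete, here is a minimal sketch of the load-once pattern using only the standard library. It is an illustration under assumptions, not the asker's actual service: `pickle.load` stands in for `joblib.load`, the "model" is a plain dict, and `get_model`, `handle_request`, `LOAD_COUNT`, and `MODEL_PATH` are hypothetical names introduced here.

```python
import os
import pickle
import tempfile
from functools import lru_cache

LOAD_COUNT = 0  # counts how many times the file is actually read from disk


@lru_cache(maxsize=1)
def get_model(path):
    """Deserialize the model once; later calls return the cached object."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    with open(path, "rb") as f:
        return pickle.load(f)  # stand-in for joblib.load(path)


def handle_request(path, x):
    """Hypothetical request handler: reuses the already loaded model."""
    model = get_model(path)
    return model["weight"] * x


# Stand-in "model": a plain dict pickled to a temporary file (assumption).
_fd, MODEL_PATH = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(_fd, "wb") as f:
    pickle.dump({"weight": 2.0}, f)

get_model(MODEL_PATH)  # warm the cache at startup, before any request arrives
results = [handle_request(MODEL_PATH, x) for x in (1, 2, 3)]
os.remove(MODEL_PATH)  # the cached object stays in memory after the file is gone
```

After the three simulated requests, `LOAD_COUNT` is still 1, whereas calling the loader inside the handler without a cache (as in Case 2) would read the file once per request. Note that if the server runs several worker processes, each process would typically hold its own copy of the model.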

noe