I had also run into this problem several times, so I've created an open source Python library called modelstore, which aims to simplify the best practices around versioning, storing, and downloading models across different cloud storage providers.
The modelstore library unifies versioning and saving an ML model into a single upload() command, and provides a matching download() function to retrieve that model from storage. Here is (broadly) what it looks like; full documentation is available:
import os

from modelstore import ModelStore

# Create a model store that saves models to an s3 bucket
model_store = ModelStore.from_aws_s3(os.environ["AWS_BUCKET_NAME"])

model, optim = train()  # Replace with your code

# Here's a pytorch example - the library currently
# supports 9 different ML frameworks
model_store.pytorch.upload(
    "my-model-domain",
    model=model,
    optimizer=optim,
)
The upload() command creates a tar archive containing your model and some metadata about it, and uploads it to a specific path in your storage.
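To give a feel for what that archive contains, here is a minimal sketch using only the standard library. The file names ("model.pt", "metadata.json") and metadata fields are assumptions for illustration, not modelstore's actual internal layout:

import json
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for a serialized model and its metadata
    model_file = os.path.join(tmp, "model.pt")
    meta_file = os.path.join(tmp, "metadata.json")
    with open(model_file, "wb") as f:
        f.write(b"fake-model-bytes")  # placeholder for real model bytes
    with open(meta_file, "w") as f:
        json.dump({"framework": "pytorch", "domain": "my-model-domain"}, f)

    # Bundle both files into a single compressed tar archive
    archive = os.path.join(tmp, "artifacts.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(model_file, arcname="model.pt")
        tar.add(meta_file, arcname="metadata.json")

    # Reading it back: list the members and load the metadata
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        meta = json.load(tar.extractfile("metadata.json"))

print(names)             # ['model.pt', 'metadata.json']
print(meta["framework"])  # pytorch

Bundling the model and its metadata together like this means a single object in storage is self-describing: you can inspect what framework and domain produced it without any external database.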
You can later download the latest model in a domain by using:

model_path = model_store.download(
    local_path="/path/to/download/to",  # Replace with a path
    domain="my-model-domain",
)
Note: there are alternatives, such as MLflow's artifact storage, which is great if you can set up and maintain a tracking server.