3

I'm using AWS SageMaker to build my model, and I want to store the model in S3 for later use. How do you save a model to S3 with Amazon SageMaker? I know this seems trivial, but I didn't understand the sources/documentation I've read.

Pluviophile
Laurent

2 Answers

2

You can serialize your model with pickle (or any other serialization format) and use the boto3 library to upload it to S3.

To save your model as a pickle file you can use:

import pickle
import numpy as np

from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

model = LinearRegression().fit(X, y)

# save the model to disk
pkl_filename = 'pickle_model.pkl'
with open(pkl_filename, 'wb') as file:
    pickle.dump(model, file)

To save the pickled model to S3 instead of SageMaker's local storage:

# to save the model to s3
import boto3

# For aws credentials, if ~/.aws/credentials is missing,
# create the resource from an explicit session instead:
# access_key_id = '...'
# secret_access_key = '...'
# session = boto3.Session(
#     aws_access_key_id=access_key_id,
#     aws_secret_access_key=secret_access_key,
# )
# s3_resource = session.resource('s3')

# Otherwise, let boto3 use its default credential chain
s3_resource = boto3.resource('s3')

bucket = 'your_bucket'
key = 'pickle_model.pkl'

# Serialize the model in memory and upload the bytes directly
pickle_byte_obj = pickle.dumps(model)
s3_resource.Object(bucket, key).put(Body=pickle_byte_obj)
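
Since the goal is to reuse the model later, the upload can be paired with a matching download. The following is only a minimal sketch, not part of the original snippet: the helper names `save_model_to_s3` and `load_model_from_s3` are hypothetical, and both assume they are handed a boto3 S3 resource such as `boto3.resource('s3')`.

```python
import pickle


def save_model_to_s3(s3_resource, model, bucket, key):
    """Pickle `model` in memory and upload the bytes to s3://bucket/key."""
    body = pickle.dumps(model)
    s3_resource.Object(bucket, key).put(Body=body)


def load_model_from_s3(s3_resource, bucket, key):
    """Download s3://bucket/key and unpickle it back into a Python object."""
    response = s3_resource.Object(bucket, key).get()
    # response["Body"] is a streaming object; read() returns the raw bytes
    return pickle.loads(response["Body"].read())


# Example usage (needs boto3 and valid AWS credentials):
#   import boto3
#   s3_resource = boto3.resource("s3")
#   save_model_to_s3(s3_resource, model, "your_bucket", "pickle_model.pkl")
#   restored = load_model_from_s3(s3_resource, "your_bucket", "pickle_model.pkl")
```

Keep in mind that unpickling can execute arbitrary code, so only load pickles from buckets you trust.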

0

To expand on the other answer: I've run into this problem several times myself, so I built an open-source library, modelstore, that automates this step. It also does other things, like versioning the model and storing it in S3 under structured paths.

The code to use it looks like this (there is a full example here):

from modelstore import ModelStore
from sklearn.linear_model import LinearRegression

# Train your model, as usual (X and y are your training data)
model = LinearRegression()
model.fit(X, y)

# Create a model store that points to your s3 bucket
bucket_name = "your-bucket-name"
modelstore = ModelStore.from_aws_s3(bucket_name)

# Upload your model
model_domain = "your-model-domain"
modelstore.sklearn.upload(model_domain, model=model)

This will dump your model to a file, create a tar archive from it, and then upload that archive to S3 for you. The function returns some metadata as a dictionary, including the version ID for your model.
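
To make the dump, archive, upload sequence concrete, here is a rough sketch of those three steps using the standard library and a boto3 S3 client. It is not modelstore's actual implementation: the function names `build_model_archive` and `upload_archive` and the file name `model.pkl` are assumptions made purely for illustration.

```python
import os
import pickle
import tarfile
import tempfile


def build_model_archive(model, archive_path):
    """Pickle `model` to a temp file and wrap it in a gzipped tar archive."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        model_path = os.path.join(tmp_dir, "model.pkl")
        with open(model_path, "wb") as f:
            pickle.dump(model, f)
        with tarfile.open(archive_path, "w:gz") as tar:
            # arcname keeps the temp directory out of the archive paths
            tar.add(model_path, arcname="model.pkl")
    return archive_path


def upload_archive(s3_client, archive_path, bucket, key):
    """Upload the archive with a boto3 S3 client, e.g. boto3.client("s3")."""
    s3_client.upload_file(archive_path, bucket, key)
```

Passing the client in as an argument keeps the archive step usable without AWS access, which makes it easy to check locally before uploading anything.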

neal