Model as a service up and running in AWS
How to deploy machine learning model from scratch

Code is available here.

The goal of this project is to be some kind of tutorial to get up a cluster in AWS to run a predictive model as a REST service. Besides that, the stacking of this tutorial allows have multiple version of the model which is great to retrain and do A/B test. In the end of Step by step you’ll have a endpoint of cluster of machines in AWS to run the model and access the results. The request’s inputs and model’s outputs will be also available in a postgres database.

The entire stack is configured to run according to the amazon’s free tier rules, so if still have free tier, you won’t be charged.

The scripts will use mainly:

The web framework in this tutorial is Flask and the model running is a simple model using iris dataset, but could be other as long it respects the contracts with the database and endpoints. That means, changing the web framework and model or even the programming language will not change the way that the service is deployed.

Overview of AWS stack

stack

Each model version has a stack that contains:

For each model version is created 2 tables in the database, one with the inputs of the requests and another with the outputs identified by {inputs_version, outputs_version}. The entire stack with be on your default VPC that is created for every zones when you open an account. Suppose that you want to deploy a model but before you need to do an A/B test comparing the old version with the new one, so you’ll need to keep up 2 stacks and this infra allows scale up each model independently.

Main Requirements

The requeriments described here is to setup a cluster in AWS, but for others minor tasks the requeriments will be given in each section.

1. AWS account

Create a free account in AWS under the free tier.

2. Local Virtual Machine

If don’t hold a Linux please download and install VirtualBox

This is required to build, use and manage docker, which is a way to package software to run anywhere.

3. Docker

For MacOS:

$ brew install docker
$ brew install docker-machine

Check if they are installed successfully:

$ docker --version
$ docker-machine --version

If you don’t hold a Linux, you also have to create a default machine to use docker, so run:

$ docker-machine create
$ docker-machine ls

4. Python 3.6

Install python to train your model.

$ brew install python3

5. AWS command line

In order to install AWS command line, you have to install first python and then you can install awscli through pip. Python’s version is not important in this case but I would recommend use python3.6.

For MacOS:

$ pip3 install awscli

6. jq

jq is a useful tool to manipulate json and as awscli will always return the results in json format it will helpful to parse them. For MacOS:

$ brew install jq

Step by step

1. Setup of environment variables

In order to run this project you will need to setup some environment variables as AWS keys, user and password . The script below should be filled and saved in /secrets/env-variables.sh.

#!/usr/bin/env bash

export AWS_ACCESS_KEY=XXXXXXXXXXXXXXXXXX
export AWS_SECRET_KEY=XXXXXXXXXXXXXXXXXX
export AWS_DEFAULT_REGION=XXXXXXXXXXXXXXXXXX
export DBMASTERUSER=XXXXXXXXXXXXXXXXXX
export DBPASSWORD=XXXXXXXXXXXXXXXXXX
export ARTIFACTS_AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY
export ARTIFACTS_AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY
export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY
export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY

export ECS_STACK_PREFIX=modelapp
export DOCKER_REPOSITORY_NAME=modelapp
export KEY_VALUE_PAIR_NAME=modelapp-key
export DB_NAME=modelapp

You must load the the variables above before start running the project:

$ . ./secrets/env-variables.sh

If you don’t have AWS key you can check how to get it here

2. Bootstrap

To run the Flask application, we will need to:

The script below will make sure that everything has been created and if is not, will create.

$ bash bootstrap.sh -r $DOCKER_REPOSITORY_NAME \
                    -k $KEY_VALUE_PAIR_NAME \
                    -d $DB_NAME \
                    -u $DBMASTERUSER \
                    -p $DBPASSWORD

The database creation might take some minutes, so just jump to the next step if the database creation is completed.

You can check the stack’s status logging here and then go to services > cloudformation and wait until the Status of the stack db-modelapp become CREATE_COMPLETE.

The requirements to run it are {1, 4, 5}.

3. Deploying your application

Make sure that your VM is running or run it to start the VM:

$ docker-machine start default

The script below will:

$ bash train-and-deploy-it.sh -r $DOCKER_REPOSITORY_NAME \
                              -k $KEY_VALUE_PAIR_NAME \
                              -d $DB_NAME \
                              -u $DBMASTERUSER \
                              -p $DBPASSWORD

If you made some change in the code of your application, for instance, you included one variable more then you should fix the tests, bump the VERSION and then deploy-it.sh again.

The dependencies required to run it are {1, 2, 3, 4, 5}.

4. Getting Endpoint (DNS)

Get the URL or endpoint of your application.

$ aws cloudformation describe-stacks --stack-name modelapp-`cat VERSION` | jq '.Stacks[].Outputs[].OutputValue'

How to access your deployed model

Sending requests to the application

Install the library requests:

$ pip3 install requests==2.18.4

and then run you can run the script to make requests:


$ python3 resources/populate_db.py http://my-endpoint/

If you are running locally (non-dockerized or dockerized), the endpoint should be localhost:8080/run/. If you are running in AWS, get the endpoint in the step 4 in Step by step.

Running the application adhoc

Sometimes is painful spin a stack just to run some tests or retrain your model. There are two ways that you can test your application, the first one is activate the virtualenv and then running the application and the other is building the docker image and running the container with the application.

The first option:

$ bash modelapp/bootstrap-python-env.sh
$ . modelapp-python-env/bin/activate 
$ python3 modelapp/src/modelapp/manage.py runserver --host 0.0.0.0  --port=8080

The second option:

$ docker build -t mytest .
$ docker run -it -p 8080:8080 mytest

to check if they are actually running curl localhost:8080 should return 200.

Trainning and running the application adhoc

If you made some changes in the model and would like to train and run to test:

$ bash adhoc-train-and-run.sh

Running tests

$ bash modelapp/bootstrap-python-env.sh
$ . modelapp-python-env/bin/activate 
$ py.test

Reverting a deployed version

In order to revert the deploy of some buggy code it will need install one more dependency to delete the tables created (psql). For MacOS:

$ brew install postgresql

The script below will:

*****
Written by Liamarcia Bifano on 24 October 2017