Model Deployment - Creating & Serving APIs
You’ve successfully built a data processing pipeline, trained a model, and rigorously evaluated its performance. These are critical milestones. However, a machine learning model only delivers true value when it’s deployed - made accessible to end-users or other systems to make predictions on new, unseen data. Moving a model from a development environment (like a Jupyter notebook or local scripts) into a robust, scalable, and reliable production setting presents a distinct set of engineering challenges.
This tutorial bridges that gap. We will take the Bank Marketing prediction model, along with its associated preprocessing pipeline, and deploy it as a fully functional REST API. You’ll learn how to package your application using Docker for consistency and portability. Finally, we’ll walk through deploying this containerized API to Amazon Web Services (AWS) using Amazon Elastic Container Service (ECS), making your model accessible over the internet.
The focus is on establishing a practical, end-to-end deployment workflow, covering the essential steps from API creation through cloud deployment and basic testing. This is where your AI/ML engineering skills come together to deliver real-world impact.
Tutorial Goals
- Build a REST API for your model using FastAPI (a minimal sketch follows this list)
- Containerize the API application and its dependencies using Docker
- Set up an S3 bucket for DVC remote storage and push artifacts
- Deploy the containerized API to AWS ECS using EC2 instances
- Test the deployed API endpoint
- Learn to tear down AWS resources to manage costs
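As a preview of the first goal, here is a minimal sketch of the kind of FastAPI app we'll build. Everything in it is illustrative rather than the tutorial's actual code: the input schema (`ClientFeatures`), the artifact path (`artifacts/model.joblib`), and the assumption that the saved artifact is a scikit-learn pipeline accepting a one-row DataFrame are all placeholders.

```python
# main.py -- minimal serving sketch (illustrative; not the tutorial's final code)
import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Bank Marketing Prediction API")

# Hypothetical input schema; the real model expects the full Bank Marketing feature set.
class ClientFeatures(BaseModel):
    age: int
    job: str
    balance: float

# Load the trained preprocessing + model pipeline once at startup (path is an assumption).
model = joblib.load("artifacts/model.joblib")

@app.get("/health")
def health():
    # Lightweight liveness check, useful later for ECS/load-balancer health probes.
    return {"status": "ok"}

@app.post("/predict")
def predict(features: ClientFeatures):
    # Turn the validated request body into the one-row frame the pipeline expects.
    row = pd.DataFrame([features.model_dump()])  # Pydantic v2; use .dict() on v1
    prediction = model.predict(row)[0]
    return {"prediction": int(prediction)}
```

Run it locally with `uvicorn main:app --reload` and POST JSON to `http://localhost:8000/predict` to see the request/response shape; the sections that follow build the real app, wrap it in a Docker image, and run it on ECS.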