AWS EKS ML Model Deployment

Production-Grade Machine Learning Model Deployment on AWS EKS

This project deploys and serves a machine-learning inference API on AWS Elastic Kubernetes Service (EKS), a managed Kubernetes platform, with production-ready networking, scaling, and access control. It covers the complete workflow from local development to production deployment.

Project Summary

Comprehensive Project Overview

Project Category

MLOps - DevOps - Cloud (AWS EKS)

Industry/Domain

Cloud Computing & Artificial Intelligence Infrastructure

Domain Focus

Production Kubernetes (EKS)-Based Machine Learning Model Deployment & Serving

Key Technologies & Concepts

Core Technologies Used

AWS EKS & Kubernetes Keywords

AWS EKS (Managed Kubernetes)
Kubernetes (Deployment, Service, Pod)
Minikube (Local Kubernetes)
Kubectl CLI (Cluster Interaction)
Containerized ML Inference
Docker Image (Docker Hub)
Kubernetes YAML Configuration
AWS IAM (Authentication & RBAC)
Service Types (LoadBalancer, ClusterIP)
Namespace Management
Resource Limits (CPU, Memory)
ReplicaSet & Pod Lifecycle
Infrastructure as Code (Declarative Manifests)

Problem & Objective

What problem did this project solve?

Problems Solved

Deploying machine-learning models in production requires reliable orchestration, secure access, scalable infrastructure, and managed control planes
Moving from local/development Kubernetes deployments to a cloud-managed, production-ready Kubernetes platform
Ensuring stable ML model serving, secure cluster access via IAM, and cloud-native networking using AWS EKS

Primary Objectives

Deploy and serve a machine-learning inference model in a production-grade, managed Kubernetes environment (AWS EKS)
Validate secure cluster access, scalable workload management, and cloud-native networking
Maintain consistency with Kubernetes best practices used in development environments

Solution & Architecture

Architectural Overview

Solution Overview

The solution deploys a containerized machine-learning inference application on AWS EKS, using Kubernetes Deployments for workload management and Services for controlled access. The EKS managed control plane handles cluster orchestration, while EC2 worker nodes run the application Pods.

Secure access is enforced through AWS IAM-integrated authentication, and scalability, reliability, and rolling updates are managed natively by Kubernetes, resulting in a production-ready ML model serving architecture.
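
As a rough illustration of the IAM-integrated access described above, the sketch below prints the IAM identity the AWS CLI is using, refreshes the kubeconfig entry, and asks the API server whether that identity may create Deployments. The cluster name and region are the ones used in the commands later in this document; the helper name verify_cluster_access and the choice of the default namespace are assumptions made for illustration.

```shell
# Cluster name and region as used in the eksctl command in this document.
CLUSTER_NAME="mlops-cluster"
AWS_REGION="us-east-1"

# Hypothetical helper: confirm that the current IAM identity can reach the cluster.
verify_cluster_access() {
  # Show the IAM principal whose credentials will sign the requests.
  aws sts get-caller-identity --query Arn --output text || return 1

  # Write or refresh the kubeconfig entry for the cluster.
  aws eks update-kubeconfig --region "$AWS_REGION" --name "$CLUSTER_NAME" || return 1

  # Ask the API server whether this identity may create Deployments (prints yes or no).
  kubectl auth can-i create deployments --namespace default
}

# Example call (needs AWS credentials and network access to the cluster):
# verify_cluster_access
```

If the last command prints "no", the IAM identity authenticated successfully but has not been granted matching Kubernetes permissions.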

AWS EKS ML Deployment Architecture Diagram

1. Local Development
2. Docker Containerization
3. AWS EKS Cluster
4. Kubernetes Deployment
5. Load Balancer Service

Key Components

AWS EKS: Managed Kubernetes control plane
EC2 Worker Nodes: Managed Node Groups
Kubernetes Deployment: ML inference workload management
Kubernetes Service: ClusterIP / LoadBalancer for access
AWS IAM: Authentication, integrated with Kubernetes RBAC for authorization
Amazon VPC: Networking, subnets, security groups
Docker: Containerized ML inference image
Container Registry: Docker Hub / Amazon ECR

Skills & Technologies Used

Technical Proficiency Demonstrated

Primary Skills

AWS EKS (Managed Kubernetes Operations) - Intermediate
Kubernetes Deployment & Service Management - Intermediate
Production ML Model Serving on Kubernetes - Intermediate
Cloud-Native Networking & Load Balancing - Intermediate
IAM-Based Authentication & RBAC Integration - Intermediate
Infrastructure as Code (Kubernetes YAML Manifests) - Intermediate

Secondary Tools / Frameworks

Python (ML inference application)
Flask / FastAPI (Model serving API)
Docker Hub / Amazon ECR (Image storage & retrieval)
AWS CLI (EKS and IAM interaction)
Linux Shell (Operational commands & debugging)

Programming Languages

YAML (Infrastructure as Code configuration files for Deployments and Services)
Python (ML inference application)
GitHub CLI commands
kubectl CLI
eksctl CLI

Cloud & DevOps Tools

Amazon EKS
Amazon EC2
Amazon VPC
AWS IAM
AWS CLI
Kubectl
Docker

Challenges & Outcomes

Technical challenges faced and resolutions

Key Technical Challenges

Configuring kubectl access to a managed EKS control plane, including proper kubeconfig setup and IAM authentication
Understanding the separation between managed control plane and worker nodes in AWS EKS compared to local Kubernetes environments
Exposing the ML inference service securely using AWS-integrated Kubernetes Services without direct access to control plane nodes
Ensuring reliable deployment behavior and debugging Pods in a cloud-based Kubernetes environment with stricter networking and security controls

How They Were Resolved

kubectl access was fixed by generating the kubeconfig entry with aws eks update-kubeconfig, so that kubectl reaches the EKS API server with IAM-authenticated requests
The control plane was left entirely to AWS, and operational work focused on worker nodes and Kubernetes abstractions such as Deployments and Services
The service was exposed through a Kubernetes Service of type LoadBalancer, which provisions an AWS-managed load balancer for controlled external access
Deployment and runtime issues were diagnosed with kubectl logs, kubectl describe, and kubectl rollout, which also allowed quick recovery through rollbacks
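
The diagnosis and rollback steps in the list above can be sketched as two small shell helpers. The Deployment name and the app label come from the manifests in this document; the helper names diagnose_app and rollback_app, the 50-line log tail, and the 60-second timeout are assumptions chosen for illustration.

```shell
# Deployment name and label value as defined in the manifests in this document.
APP="recruitment-rank-app"

# Hypothetical helper: collect first-pass diagnostics for the workload.
diagnose_app() {
  # Pod status, restart counts, and node placement.
  kubectl get pods -l "app=$APP" -o wide
  # Pod-level events such as image pull errors or scheduling failures.
  kubectl describe pods -l "app=$APP"
  # Recent application output from one Pod of the Deployment.
  kubectl logs "deployment/$APP" --tail=50
  # Whether the latest rollout has completed.
  kubectl rollout status "deployment/$APP" --timeout=60s
}

# Hypothetical helper: return to the previous revision after a bad release.
rollback_app() {
  kubectl rollout history "deployment/$APP"
  kubectl rollout undo "deployment/$APP"
}

# Example calls (require a kubeconfig that points at the cluster):
# diagnose_app
# rollback_app
```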

Scalability & Reliability Considerations

The application runs as a Kubernetes Deployment, so it scales horizontally by adjusting the replica count. AWS EKS provides a highly available, managed control plane, and the Deployment's ReplicaSet self-heals by replacing failed Pods automatically. Rolling updates allow model version upgrades with little or no downtime, provided enough replicas stay available while new Pods start, and an AWS load balancer provisioned through the Service gives external clients a stable entry point to the inference service.
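
A minimal sketch of the scaling and rolling-update operations mentioned above, using standard kubectl subcommands. The Deployment and container names are taken from the Deployment manifest in this document; the helper names scale_app and update_image, the 120-second timeout, and the v2 image tag in the example are hypothetical.

```shell
APP="recruitment-rank-app"     # Deployment name from the manifest in this document
CONTAINER="placement-app"      # container name from the same manifest

# Hypothetical helper: change the replica count and wait until the Pods are ready.
scale_app() {
  kubectl scale "deployment/$APP" --replicas="$1" || return 1
  kubectl rollout status "deployment/$APP" --timeout=120s
}

# Hypothetical helper: roll out a new image; Kubernetes replaces Pods gradually.
update_image() {
  kubectl set image "deployment/$APP" "$CONTAINER=$1" || return 1
  kubectl rollout status "deployment/$APP" --timeout=120s
}

# Example calls (the v2 tag is hypothetical and may not exist in the registry):
# scale_app 3
# update_image 03sarah/recruitment-rank-app:v2
```

Waiting on kubectl rollout status after each change makes a stalled rollout visible immediately instead of leaving it to be noticed later.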

Kubernetes Architecture & YAML Mapping

Architecture to YAML construct mapping

Architecture Block	Kubernetes YAML Construct
Client (Browser / Postman)	External consumer (outside cluster)
API Entry Point	Service
Service Type	spec.type: LoadBalancer
Service Port	spec.ports[].port: 80
Target Container Port	spec.ports[].targetPort: 9696
Traffic Routing	spec.selector
Stable Virtual IP	Service abstraction
Workload Controller	Deployment
Pod Lifecycle Management	Deployment
Pod Template	spec.template
Pod Labels	spec.template.metadata.labels
Selector Matching	spec.selector.matchLabels
Container Definition	spec.template.spec.containers
Container Image	spec.template.spec.containers[].image
Resource Limits	spec.template.spec.containers[].resources.limits
Application Port	spec.template.spec.containers[].ports[].containerPort
Self-Healing	Deployment (ReplicaSet)
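
The selector rows in the table are the ones most likely to break silently: if the Service selector and the Pod labels differ, the Service has no backends. The sketch below is one way to check the wiring on a running cluster; the helper name check_selector_wiring is hypothetical, and the resource names are those used in the manifests in this document.

```shell
APP="recruitment-rank-app"   # Service, Deployment, and label value share this name

# Hypothetical helper: compare Pod labels with the Service selector.
check_selector_wiring() {
  # Labels on the Pods created from spec.template.metadata.labels.
  kubectl get pods -l "app=$APP" --show-labels
  # The selector the Service uses to pick its backend Pods.
  kubectl get service "$APP" -o jsonpath='{.spec.selector}'
  echo
  # Pod IP and port pairs appear here only when the selector matches ready Pods.
  kubectl get endpoints "$APP"
}

# Example call (requires a kubeconfig that points at the cluster):
# check_selector_wiring
```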

Code Examples & Configuration

Key YAML configurations and commands

LoadBalancer Service YAML

apiVersion: v1
kind: Service
metadata:
  name: recruitment-rank-app
spec:
  type: LoadBalancer
  selector:
    app: recruitment-rank-app
  ports:
    - protocol: "TCP"
      port: 80
      targetPort: 9696

Deployment YAML

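apiVersion: apps/v1
kind: Deployment
metadata:
  name: recruitment-rank-app
spec:
  # replicas is not set, so Kubernetes runs the default of 1 Pod
  selector:
    matchLabels:
      app: recruitment-rank-app   # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: recruitment-rank-app # also matched by the Service selector
    spec:
      containers:
        - name: placement-app
          image: 03sarah/recruitment-rank-app:v1
          resources:
            limits:               # hard caps; requests default to these values
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 9696 # matches the Service targetPort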
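
Replica count and update behaviour can also be declared in the manifest itself rather than set imperatively. The sketch below writes a variant of the Deployment to a separate file with an explicit replica count and rolling-update strategy. The file name app-deployment-scaled.yaml and the values 3, maxSurge: 1, and maxUnavailable: 0 are illustrative assumptions, not the author's configuration.

```shell
# Hypothetical file name; the original manifest is left untouched.
MANIFEST="app-deployment-scaled.yaml"

cat > "$MANIFEST" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: recruitment-rank-app
spec:
  replicas: 3                 # assumed value; three Pods instead of the default one
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1             # start at most one extra Pod during an update
      maxUnavailable: 0       # keep all desired Pods serving while it starts
  selector:
    matchLabels:
      app: recruitment-rank-app
  template:
    metadata:
      labels:
        app: recruitment-rank-app
    spec:
      containers:
        - name: placement-app
          image: 03sarah/recruitment-rank-app:v1
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 9696
EOF

echo "wrote $MANIFEST"
# Validate and apply once kubectl points at the cluster:
# kubectl apply --dry-run=client -f "$MANIFEST"
# kubectl apply -f "$MANIFEST"
```

Keeping these values in the manifest means the replica count and update strategy are version-controlled alongside the rest of the configuration.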

Key Commands Used

# Create EKS cluster
eksctl create cluster --name mlops-cluster --version 1.31 --region us-east-1 \
  --zones=us-east-1a,us-east-1b,us-east-1c,us-east-1d \
  --nodegroup-name linux-nodes --node-type t2.medium --nodes 2

# Update kubeconfig
aws eks update-kubeconfig --region us-east-1 --name mlops-cluster

# Apply configurations
kubectl apply -f app-deployment.yaml
kubectl apply -f loadbalancer.yaml

# Check resources
kubectl get all
kubectl get nodes
kubectl describe pod <pod-name>

# Delete cluster
eksctl delete cluster --name <cluster-name>
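
After the Service is applied, AWS needs a short time to provision the load balancer. A minimal sketch for finding the external address and sending a first request follows; the helper names service_hostname and check_endpoint are hypothetical, and the root path is an assumption because the API routes are not listed in this document.

```shell
SERVICE="recruitment-rank-app"   # Service name from the manifest in this document

# Hypothetical helper: print the DNS name that AWS assigned to the load balancer.
service_hostname() {
  kubectl get service "$SERVICE" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Hypothetical helper: send one request to the Service port (80) and show the response.
check_endpoint() {
  host="$(service_hostname)"
  if [ -z "$host" ]; then
    echo "load balancer hostname not assigned yet" >&2
    return 1
  fi
  # The root path is an assumption; replace it with a real route of the API.
  curl --silent --show-error --include --max-time 10 "http://$host/"
}

# Example call (requires a kubeconfig that points at the cluster):
# check_endpoint
```

The request goes to port 80 on the load balancer, which the Service forwards to targetPort 9696 on the Pods.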

Assets & References

Code, diagrams, study material

GitHub Repository

Source code repository containing deployment scripts, configurations, and documentation.