☸️ How to deploy a cost-efficient AWS/EKS Kubernetes cluster using Terraform in 2023

  • Introduction
  • Variables
  • Providers and locals
  • The enclosing VPC network
  • The actual Kubernetes cluster
  • Docker registries
  • S3 application bucket
  • Outputs
  • Conclusion

Cover image generated locally by DiffusionBee with ToonYou model. Prompt was purple excavator, smoke, best quality, masterpiece, large clouds

Introduction

Terraform is an infrastructure-as-code tool that lets you build, change, and version cloud and on-prem resources safely and efficiently.

An AWS spot instance is an instance that uses spare EC2 capacity available for less than the On-Demand price. Because Spot Instances enable you to request unused EC2 instances at steep discounts, you can lower your Amazon EC2 costs significantly. And Kubernetes is a perfect candidate for running on such unstable virtual machines.

Surprisingly, complete Terraform examples using multiple kinds of spot instances have been hard to find on the internet since the EKS module version 18 parameter rework. This blog post fills that gap.

We detail here how to deploy, using Terraform, an EKS cluster with the following characteristics:

  • A VPC network with private and public subnets using a gateway
  • A Kubernetes cluster using mixed type spot instances
  • Docker registries

For the file blocks below to work, you need to know the basics of Terraform and have the AWS CLI configured with a profile named after the cluster. If you prefer to stick with the default AWS profile, remove the --profile parts of the code below and everything will work fine.

Variables

Here are the variables used by most resources described below. All are simple values except var.aws_auth_users.

variable "region" {
  description = "Cluster region"
  default = "eu-west-x"
}

variable "cluster_name" {
  description = "Name of the EKS cluster"
  default     = "my-project"
}

variable "kubernetes_version" {
  description = "Cluster Kubernetes version"
  default     = "1.24"
}

# Being in this list is required to see Kubernetes resources in AWS console
variable "aws_auth_users" {
  description = "Developers with access to the dev K8S cluster and the container registries"
  default = [
    {
      userarn  = "arn:aws:iam::xxx:user/user.name1"
      username = "user.name1"
      groups   = ["system:masters"]
    },
    {
      userarn  = "arn:aws:iam::xxx:user/user.name2"
      username = "user.name2"
      groups   = ["system:masters"]
    }
  ]
}
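
The defaults above are placeholders. If you prefer not to hardcode them, the same values can be supplied in a terraform.tfvars file; here is a minimal sketch, where the account ID and user name are placeholder assumptions to adapt:

# terraform.tfvars (values below are placeholders, adapt to your account)
region             = "eu-west-x" # replace with a real region such as eu-west-1
cluster_name       = "my-project"
kubernetes_version = "1.24"

aws_auth_users = [
  {
    userarn  = "arn:aws:iam::<AWS_ACCOUNT_ID>:user/user.name1"
    username = "user.name1"
    groups   = ["system:masters"]
  }
]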

Providers and locals

Let's set providers and common tags.

provider "aws" {
  region = var.region
  # prerequisite locally : aws configure --profile <cluster-name>
  profile = var.cluster_name
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["--profile", var.cluster_name, "eks", "get-token", "--cluster-name", var.cluster_name]
  }
}

locals {
  tags = {
    Environment   = "NON-PROD"
    creation-date = "01/02/2023" # a variable would update the value on each tf apply
  }
}

data "aws_caller_identity" "current" {}

The enclosing VPC network

Here is a sample Terraform block for the VPC network where the Kubernetes cluster will be created.

A few notes :

  • a secured network is composed of private and public subnets, one of each in every availability zone of the chosen region
  • be careful to have enough IPs available in the ranges for your needs; here each subnet can fit 8,192 IPs (an optional cidrsubnet() sketch for computing these ranges is given after the module block below)

module "vpc" {

  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"
  name    = var.cluster_name
  cidr    = "10.0.0.0/16" # Last IP : 10.0.255.255
  azs     = ["${var.region}a", "${var.region}b", "${var.region}c"]
  # use https://www.ipaddressguide.com/cidr
  # /19 : 8,192 IPs
  private_subnets      = ["10.0.0.0/19", "10.0.32.0/19", "10.0.64.0/19"]    # No hole in IP ranges
  public_subnets       = ["10.0.96.0/19", "10.0.128.0/19", "10.0.160.0/19"] # No hole in IP ranges
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  public_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                    = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"           = "1"
  }

  tags = local.tags

}
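
If you prefer to compute the subnet ranges rather than hardcode them, Terraform's built-in cidrsubnet() function can derive the same /19 blocks from the VPC CIDR; here is a minimal, optional sketch:

locals {
  vpc_cidr = "10.0.0.0/16"

  # 3 extra bits split the /16 into /19 blocks of 8,192 IPs each
  private_subnet_cidrs = [for i in range(0, 3) : cidrsubnet(local.vpc_cidr, 3, i)]
  # -> ["10.0.0.0/19", "10.0.32.0/19", "10.0.64.0/19"]
  public_subnet_cidrs = [for i in range(3, 6) : cidrsubnet(local.vpc_cidr, 3, i)]
  # -> ["10.0.96.0/19", "10.0.128.0/19", "10.0.160.0/19"]
}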

The actual Kubernetes cluster

Now for the main dish: the Kubernetes cluster.

A few notes :

  • The cluster add-ons used are mandatory for the cluster to work properly
  • The coredns add-on may fail on the first apply; just apply again (its timeout is customized to avoid waiting too long)
  • AWS users with access rights to the cluster are defined in var.aws_auth_users, as shown in the Variables section above
  • Security groups are simplified and may need to be adjusted to your security requirements:
    • no access from the internet (except through an ingress controller)
    • full access from node to node
    • full access from nodes to the internet
  • The EC2 instances used for workers are t3 and t3a spot instances; mixing instance types ensures no starvation of nodes when AWS runs out of one type
  • The commented-out fulltime-az-a group would also create on-demand instances if uncommented and adapted to your needs
  • Nodes are created in only one availability zone. In a production environment, use at least two availability zones by creating a spot-az-b group similar to spot-az-a (a sketch is given after the module block below). Cross-zone network traffic is not free, and in this example the potential downtime of a development environment is acceptable

module "eks" {

  source = "terraform-aws-modules/eks/aws"

  cluster_name                    = var.cluster_name
  cluster_version                 = var.kubernetes_version
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true

  cluster_addons = {
    coredns = {
      most_recent = true

      timeouts = {
        create = "2m" # default 20m. Times out on first launch while being effectively created
      }
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }
    aws-ebs-csi-driver = {
      most_recent = true
    }
  }

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # Self managed node groups will not automatically create the aws-auth configmap, so we need to create and manage it here
  create_aws_auth_configmap = true
  manage_aws_auth_configmap = true

  aws_auth_users = var.aws_auth_users

  enable_irsa = true

  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
    egress_all = { # by default, only https urls can be reached from inside the cluster
      description      = "Node all egress"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      type             = "egress"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
  }

  self_managed_node_group_defaults = {

    # enable discovery of autoscaling groups by cluster-autoscaler
    autoscaling_group_tags = {
      "k8s.io/cluster-autoscaler/enabled" : true,
      "k8s.io/cluster-autoscaler/${var.cluster_name}" : "owned",
    }

    # from https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2207#issuecomment-1220679414
    # to avoid "waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator"
    iam_role_additional_policies = {
      AmazonEBSCSIDriverPolicy = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
    }

  }

  # possible values : https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/node_groups.tf
  self_managed_node_groups = {

    default_node_group = {
      create = false
    }

    # fulltime-az-a = {
    #   name                 = "fulltime-az-a"
    #   subnets              = [module.vpc.private_subnets[0]]
    #   instance_type        = "t3.medium"
    #   desired_size         = 1
    #   bootstrap_extra_args = "--kubelet-extra-args '--node-labels=node.kubernetes.io/lifecycle=normal'"
    # }

    spot-az-a = {
      name       = "spot-az-a"
      subnet_ids = [module.vpc.private_subnets[0]] # only one subnet to simplify PV usage
      # availability_zones = ["${var.region}a"] # conflict with previous option. TODO try subnet_ids=null at creation (because at modification it fails)

      desired_size         = 2
      min_size             = 1
      max_size             = 10
      bootstrap_extra_args = "--kubelet-extra-args '--node-labels=node.kubernetes.io/lifecycle=spot'"

      use_mixed_instances_policy = true
      mixed_instances_policy = {
        instances_distribution = {
          on_demand_base_capacity                  = 0
          on_demand_percentage_above_base_capacity = 0
          spot_allocation_strategy                 = "lowest-price" # "capacity-optimized" described here : https://aws.amazon.com/blogs/compute/introducing-the-capacity-optimized-allocation-strategy-for-amazon-ec2-spot-instances/
        }

        override = [
          {
            instance_type     = "t3.xlarge"
            weighted_capacity = "1"
          },
          {
            instance_type     = "t3a.xlarge"
            weighted_capacity = "1"
          },
        ]
      }

    }

  }

  tags = local.tags
}
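
To spread nodes over two availability zones in production, a second group mirroring spot-az-a can be added inside self_managed_node_groups; here is a minimal sketch (the spot-az-b name and its values simply mirror the spot-az-a group above, adjust sizes to your needs):

# To be added inside self_managed_node_groups, next to spot-az-a
spot-az-b = {
  name       = "spot-az-b"
  subnet_ids = [module.vpc.private_subnets[1]] # second private subnet, in availability zone b

  desired_size         = 2
  min_size             = 1
  max_size             = 10
  bootstrap_extra_args = "--kubelet-extra-args '--node-labels=node.kubernetes.io/lifecycle=spot'"

  use_mixed_instances_policy = true
  mixed_instances_policy = {
    instances_distribution = {
      on_demand_base_capacity                  = 0
      on_demand_percentage_above_base_capacity = 0
      spot_allocation_strategy                 = "lowest-price"
    }

    override = [
      {
        instance_type     = "t3.xlarge"
        weighted_capacity = "1"
      },
      {
        instance_type     = "t3a.xlarge"
        weighted_capacity = "1"
      },
    ]
  }
}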

Docker registries

Most of us deploy Kubernetes clusters for custom applications, so here are the AWS Docker registry (ECR) blocks.

On AWS, Kubernetes and the ECR Docker registries integrate seamlessly; no extra configuration is needed.

resource "aws_ecr_repository" "module-a" {
  name = "my-app/module-a"
}

resource "aws_ecr_repository" "module-b" {
  name = "my-app/module-b"
}

resource "aws_ecr_repository" "module-c" {
  name = "my-app/module-c"
}
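
The three repository resources above are identical except for their names; a functionally equivalent version using for_each could look like this (note that the Terraform resource addresses change, so repositories already applied with the explicit blocks would need a terraform state mv):

# Equivalent to the three explicit resources above
resource "aws_ecr_repository" "modules" {
  for_each = toset(["module-a", "module-b", "module-c"])

  name = "my-app/${each.key}"
}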

S3 application bucket

More often than not, a project needs an S3 bucket to store files, so here is the code for a secured S3 bucket, with access from the backend Kubernetes service account.

resource "aws_s3_bucket" "bucket" {
  bucket = "${var.cluster_name}-bucket"

  tags = local.tags
}

resource "aws_s3_bucket_acl" "bucket_acl" {
  bucket = aws_s3_bucket.bucket.id
  acl    = "private"
}

data "aws_iam_policy_document" "role_policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    condition {
      test     = "StringLike"
      variable = "${replace(module.eks.cluster_oidc_issuer_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:*:backend"] # system:serviceaccount:<K8S_NAMESPACE>:<K8S_SERVICE_ACCOUNT>
    }

    principals {
      identifiers = ["arn:aws:iam::${var.aws_account_id}:oidc-provider/${replace(module.eks.cluster_oidc_issuer_url, "https://", "")}"]
      type        = "Federated"
    }
  }
}

data "aws_iam_policy_document" "s3_policy" {
  statement {
    actions = [
      "s3:ListAllMyBuckets",
    ]

    resources = [
      "*",
    ]
  }
  statement {
    actions = [
      "s3:*",
    ]

    resources = [
      aws_s3_bucket.bucket.arn,
      "${aws_s3_bucket.bucket.arn}/*"
    ]
  }
}

resource "aws_iam_role" "role" {
  assume_role_policy = data.aws_iam_policy_document.role_policy.json
  name               = "${var.cluster_name}-backend-role"
}

resource "aws_iam_policy" "policy" {
  name   = "${var.cluster_name}-backend-policy"
  path   = "/"
  policy = data.aws_iam_policy_document.s3_policy.json
}

resource "aws_iam_role_policy_attachment" "attach" {
  policy_arn = aws_iam_policy.policy.arn
  role       = aws_iam_role.role.name
}

You also need to:

  • Create or use a Kubernetes service account (backend in this example) for the pods that use the bucket
    • on your deployment, set serviceAccountName: backend
  • Annotate the Kubernetes service account with:
    • eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/<CLUSTER_NAME>-backend-role
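
Since the kubernetes provider is already configured above, the service account and its annotation can also be managed with Terraform; here is a minimal sketch, assuming the backend service account lives in the default namespace:

resource "kubernetes_service_account" "backend" {
  metadata {
    name      = "backend"
    namespace = "default" # assumption: use the namespace where your backend pods run

    annotations = {
      # IRSA: links the service account to the IAM role created above
      "eks.amazonaws.com/role-arn" = aws_iam_role.role.arn
    }
  }
}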

Outputs

No specific output is needed, not even the kubeconfig, since the AWS CLI can configure access to the cluster. But for convenience, the AWS CLI command is printed as an output.

output "update_local_context_command" {
  description = "Command to update local kube context"
  value       = "aws --profile ${var.cluster_name} eks update-kubeconfig --name=${var.cluster_name} --alias=${var.cluster_name} --region=${var.region}"
}

Conclusion

Putting all these pieces of Terraform code together, you should be able to deploy a cluster with one command in under 20 minutes.

If you think some code should be improved, please advise in the comments 🤓


This content originally appeared on DEV Community and was authored by Benoît COUETIL

