Cleaning Up Unused ENIs On AWS


This content originally appeared on DEV Community 👩‍💻👨‍💻 and was authored by Marc Naameh

Running out of IP addresses in your subnets is a real issue that many teams face. Most of the time, those IPs are reserved but unused!

This post addresses the problem of unused Elastic Network Interfaces (ENIs) and what to do to free up our IP address pool for other services.

The solution consists of a Lambda function that is triggered daily and a CloudWatch alarm that alerts us to any errors generated by the Lambda.

First I will go through the solution and how to create it using the AWS Console, then I will build the same solution using Infrastructure as Code (Terraform).

Creating the Lambda

Before going through the code, let's set up some of the configuration parameters:

1- Timeout: this is the most important one. Make sure it is more than 1 minute (the right value depends on your workloads).

2- Memory: 200 MB should be enough.

3- VPC: there is no need to put the Lambda inside a VPC.

Lambda code using Python:
First, import the AWS SDK for Python (Boto3) and create an EC2 client:

PS: To find out more about the AWS SDK for Python, check out the Boto3 documentation.

import boto3

client = boto3.client('ec2')

Next step: get the subnets that have a specific tag. In this example the tag is:

key = type
value = private

There are multiple approaches for this. The way that I will be doing it is:
1- Describe all the resources based on tags
2- Add filters on resource type and tag (key and value)

# Get the tags of all subnets that are tagged type=private.
tags = client.describe_tags(
    Filters=[
        {
            'Name': 'resource-type',
            'Values': ['subnet']
        },
        {
            'Name': 'tag:type',
            'Values': ['private']
        }
    ]
)

A formatted output of this method:

{
   "Tags":[
      {
         "Key":"type",
         "ResourceId":"subnet-062715a13f1fffa54",
         "ResourceType":"subnet",
         "Value":"private"
      },
      {
         "Key":"type",
         "ResourceId":"subnet-0ee66ce86ffe0c073",
         "ResourceType":"subnet",
         "Value":"private"
      }
   ],
   "ResponseMetadata":{}
}

Next, get the subnet IDs from the result dictionary by reading the ResourceId of each entry. Example:

# Collect the subnet IDs from the describe_tags response.
list_subnets = [tag['ResourceId'] for tag in tags['Tags']]
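
describe_tags returns its results in pages, so in an account with many tagged subnets a single call may not return everything. A minimal sketch of a paginated variant follows. The function name matches the one the handler calls later in this post, but passing the client in as a parameter and the tag_key/tag_value defaults are assumptions made for illustration:

```python
def get_tagged_subnets(client, tag_key="type", tag_value="private"):
    """Return the IDs of all subnets carrying the given tag, across all pages."""
    # The EC2 client exposes a paginator for describe_tags.
    paginator = client.get_paginator("describe_tags")
    pages = paginator.paginate(
        Filters=[
            {"Name": "resource-type", "Values": ["subnet"]},
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
        ]
    )
    # Flatten every page into one list of subnet IDs.
    return [tag["ResourceId"] for page in pages for tag in page["Tags"]]
```

With a real client this would be called as get_tagged_subnets(boto3.client('ec2')).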

Now that we have the subnet IDs, the next step is to retrieve all the unused network interfaces in those subnets and delete them.

To narrow the results down to only the ones needed, two filters are added:
1- A filter to get ENIs from specific subnets
2- A filter to get ENIs that are unused (status available)

NOTE: in the Values of the subnet-id filter, put the subnet ID that you retrieved before.

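def delete_available_eni(subnetid):
    # Find the ENIs in this subnet that are not attached to anything.
    eni = client.describe_network_interfaces(
        Filters=[
            {
                'Name': 'subnet-id',
                'Values': [subnetid]
            },
            {
                'Name': 'status',
                'Values': ['available']
            },
        ]
    )
    # The low-level client has no NetworkInterface() object (that belongs to
    # boto3.resource('ec2')), so delete each ENI by ID instead.
    for interface in eni['NetworkInterfaces']:
        client.delete_network_interface(
            NetworkInterfaceId=interface['NetworkInterfaceId']
        )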
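
Before deleting anything, it can be useful to list what would be removed. The sketch below is a hypothetical helper (find_available_enis is not part of the original code) that uses the same two filters, passes all subnet IDs in a single subnet-id filter, and follows pagination. It assumes the client is passed in and that the list of subnets is small enough for one filter:

```python
def find_available_enis(client, subnet_ids):
    """Return the IDs of unattached ENIs in the given subnets without deleting them."""
    # Guard: with no subnets to look at, report nothing rather than query EC2.
    if not subnet_ids:
        return []
    paginator = client.get_paginator("describe_network_interfaces")
    pages = paginator.paginate(
        Filters=[
            {"Name": "subnet-id", "Values": list(subnet_ids)},
            {"Name": "status", "Values": ["available"]},
        ]
    )
    return [
        interface["NetworkInterfaceId"]
        for page in pages
        for interface in page["NetworkInterfaces"]
    ]
```

Logging this list for a few days before enabling deletion is one way to check that the tag filter matches only the subnets you expect.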

For the handler, we call the two previous steps, wrapped in the functions get_tagged_subnets() and delete_available_eni(), for the Lambda to work properly.


# Delete available network interfaces in specific subnets.
def lambda_handler(event, context):
    list_subnet = get_tagged_subnets()
    for subnet_id in list_subnet:
        delete_available_eni(subnet_id)

    return {
        "statusCode": 200,
    }
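
One detail worth keeping in mind for the error alarm used in this post: the Lambda Errors metric counts invocations that end in an exception. If a single failed deletion aborts the loop, the remaining ENIs are skipped, and if errors are swallowed, the alarm never fires. A minimal sketch of one way to handle both, using two hypothetical helpers (delete_enis and report_or_raise are not part of the original code, and catching the broad Exception is a simplification):

```python
def delete_enis(client, eni_ids):
    """Try to delete every ENI; return (deleted, failed) lists of IDs."""
    deleted, failed = [], []
    for eni_id in eni_ids:
        try:
            client.delete_network_interface(NetworkInterfaceId=eni_id)
            deleted.append(eni_id)
        except Exception as error:  # assumption: real code would catch botocore's ClientError
            print(f"Could not delete {eni_id}: {error}")
            failed.append(eni_id)
    return deleted, failed


def report_or_raise(deleted, failed):
    """Raise if anything failed, so the invocation is counted as a Lambda error."""
    if failed:
        raise RuntimeError(f"Failed to delete {len(failed)} ENI(s): {failed}")
    return {"statusCode": 200, "deleted": deleted}
```

A handler could end with return report_or_raise(*delete_enis(client, eni_ids)), so every ENI is attempted and the invocation still fails visibly when something went wrong.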

Role of the Lambda

The Lambda needs specific permissions to function properly.

Following the principle of least privilege, the IAM policy is:

{
    "Statement": [
        {
            "Action": [
                "ec2:DescribeTags",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterface"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "1"
        }
    ],
    "Version": "2012-10-17"
}

Lambda Trigger

Invoking the Lambda with Amazon EventBridge is divided into two steps:
1- Creating a rule that is triggered on a schedule (once a day in our case)
2- Assigning the Lambda as a target for the rule
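
For readers who prefer scripting over the console, the two steps above can be sketched with Boto3 as follows. The function name schedule_daily, the target Id and the statement ID are assumptions made for illustration; the rule name and schedule are taken from the Terraform section of this post. Outside the console, EventBridge also needs explicit permission to invoke the function, which is the add_permission call in the middle:

```python
def schedule_daily(events_client, lambda_client, function_arn,
                   rule_name="Clean-Eni-Lambda-Rule"):
    """Create a daily EventBridge rule and attach the Lambda as its target."""
    # Step 1: a rule that fires once a day.
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )["RuleArn"]

    # Allow EventBridge to invoke the function for this rule only.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Step 2: register the Lambda as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "clean-eni-lambda", "Arn": function_arn}],
    )
    return rule_arn
```

It would be called with boto3.client('events') and boto3.client('lambda'). Note that add_permission fails if the same statement ID already exists, so this sketch is meant to be run once.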

SNS for any errors

If the Lambda generates any errors, receiving an email so that you can troubleshoot them is a must.

Three AWS resources are needed:

  1. SNS Topic
  2. SNS Subscription
  3. CloudWatch Alarm

Creating the SNS topic is very straightforward.
Select the Standard type, name it, and leave the rest as default.

For the SNS subscription it is even easier!
Select the topic you wish to subscribe to, choose the Email protocol, and add your email address.

For the CloudWatch alarm, the screenshots below show how to set it up:

[Screenshots: Alarm1, Alarm2, Alarm3]
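
As a scripted alternative to the console steps, here is a minimal Boto3 sketch that creates the topic, the email subscription and the alarm. The function name create_error_alerting and the description text are assumptions; the topic name, alarm name, metric, period and threshold mirror the values used in the Terraform section of this post:

```python
def create_error_alerting(sns_client, cloudwatch_client, function_name, email):
    """Create an SNS topic, an email subscription, and a Lambda error alarm."""
    # 1. SNS topic (create_topic returns the existing ARN if the topic already exists).
    topic_arn = sns_client.create_topic(Name="alarm-error")["TopicArn"]

    # 2. Email subscription; the recipient must confirm it from their inbox.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)

    # 3. Alarm when the function reports at least one error within a minute.
    cloudwatch_client.put_metric_alarm(
        AlarmName="Lambda-clean-eni-error",
        AlarmDescription="The clean_eni Lambda reported an error",
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        Statistic="Sum",
        Period=60,
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        AlarmActions=[topic_arn],
    )
    return topic_arn
```

It would be called with boto3.client('sns'), boto3.client('cloudwatch'), the function name and your email address.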

Infrastructure as code

To benefit from consistency and speed, and to reduce human error, let's deploy our infrastructure using Terraform:

Lambda Function:

PS: Your Python code, named main.py, should be in a src directory inside your Terraform project directory.

module "lambda_clean_eni" {
  source = "terraform-aws-modules/lambda/aws"

  function_name = "clean_eni"
  description   = "Delete unused (available) ENIs in subnets that contain EKS clusters"
  handler       = "main.lambda_handler"
  runtime       = "python3.9"
  publish       = true
  role_name     = "Lambda-Clean-ENI"

  memory_size = 200
  timeout     = 600

  attach_cloudwatch_logs_policy = true

  attach_policy_jsons    = true
  number_of_policy_jsons = 1
  policy_jsons = [
    data.aws_iam_policy_document.clean_eni.json,
  ]

  source_path = "${path.module}/src"
  hash_extra  = filesha256("${path.module}/src/main.py")


  allowed_triggers = {
    DailyRule = {
      principal  = "events.amazonaws.com"
      source_arn = aws_cloudwatch_event_rule.clean_eni.arn
    }
  }

  attach_network_policy = true
}

Lambda IAM Role Policy:

data "aws_iam_policy_document" "clean_eni" {
  statement {
    sid = "1"
    actions = [
      "ec2:DeleteNetworkInterface",
      "ec2:DescribeNetworkInterfaces",
      "ec2:DescribeTags",
    ]
    effect    = "Allow"
    resources = ["*"]
  }
}

EventBridge:

resource "aws_cloudwatch_event_rule" "clean_eni" {
  name                = "Clean-Eni-Lambda-Rule"
  description         = "Fires once every day"
  schedule_expression = "rate(1 day)"
}

resource "aws_cloudwatch_event_target" "clean_eni" {
  rule = aws_cloudwatch_event_rule.clean_eni.name
  arn  = module.lambda_clean_eni.lambda_function_arn
}

CloudWatch Alarm:

module "alarm_lambda_clean_eni" {
  source  = "terraform-aws-modules/cloudwatch/aws//modules/metric-alarm"

  create_metric_alarm = true

  alarm_name                = "Lambda-clean-eni-error"
  alarm_description         = "Lambda error rate is too high"
  comparison_operator       = "GreaterThanOrEqualToThreshold"
  insufficient_data_actions = []
  evaluation_periods        = 1
  threshold                 = 1
  alarm_actions             = [aws_sns_topic.alarm-error.arn]

  metric_query = [{
    id          = "1"
    return_data = true
    label       = "Error Count"
    metric = [{
      namespace   = "AWS/Lambda"
      metric_name = "Errors"
      period      = 60
      stat        = "Sum"
      unit        = "Count"
      dimensions = {
        FunctionName = module.lambda_clean_eni.lambda_function_name
      }
    }]
  }]
}

SNS Topic and Subscription:

resource "aws_sns_topic" "alarm-error" {
  name = "alarm-error"
}

resource "aws_sns_topic_subscription" "alarm-error-sub" {
  topic_arn = aws_sns_topic.alarm-error.arn
  protocol  = "email"
  endpoint  = "your@email.com"
}

Summary: With this solution, we can mitigate the problem of running out of IP addresses in our subnets.

Side Note: You can customize your filters however you like, so feel free to explore how to make them fit your environment!


Marc Naameh | Sciencx (2023-01-17T19:51:58+00:00) Cleaning Up Unused ENIs On AWS. Retrieved from https://www.scien.cx/2023/01/17/cleaning-up-unused-enis-on-aws/
