Today I spent some time working with the AWS CLI, focusing on CloudFormation and how to deal with configuration drift.

Here are some notes on the topic.

My setup

I run docker in Windows 10 and the docker client inside WSL. That means the docker engine runs on Windows 10 but all the commands are executed in WSL.

This setup is not mandatory, but I prefer working in bash to working in the Windows terminal.

WSL needs a fix to work with volumes and Docker.

Start by installing docker on Windows 10 and exposing the daemon.

Install docker inside WSL and set it up to run as a non-root user:

sudo apt install docker.io
sudo apt install docker-compose
sudo usermod -aG docker $USER
echo "export DOCKER_HOST=tcp://localhost:2375" >> ~/.bashrc && source ~/.bashrc
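
A side note on the last command: appending with >> adds another copy of the line every time it is re-run. A minimal sketch of an idempotent variant follows; append_once is a hypothetical helper name of mine, not something provided by docker or bash.

```shell
# append_once FILE LINE
# Hypothetical helper: append LINE to FILE only if an identical line is not already there.
append_once() {
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (commented out so that pasting the block changes nothing):
# append_once ~/.bashrc 'export DOCKER_HOST=tcp://localhost:2375'
```

The same helper would also work for the alias lines added to ~/.bashrc later in this post.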

Once docker is installed, we can pull the AWS CLI docker image by running

docker run --rm -it amazon/aws-cli --version

Create a folder to share files with the container. Note that the folder is on the Windows file system.

mkdir /c/DATI/aws

Option 1: AWS credentials in configuration file

Configure the AWS CLI with your credentials and region. The aws command used here is the alias defined further down, so set that up first:

$ aws configure
AWS Access Key ID [None]: MYACCESSKEY
AWS Secret Access Key [None]: MYSECRETKEY
Default region name [None]: eu-west-1
Default output format [None]: json

The credentials will be saved in

~/.aws/credentials 

Set up an alias to run the CLI

echo 'alias aws="docker run --rm -ti -v ~/.aws:/root/.aws -v /c/DATI/aws:/aws amazon/aws-cli"' >> ~/.bashrc
source ~/.bashrc

Test that it works by running

aws --version

Option 2: AWS credentials in env vars

If env vars are your preferred method, create a file named aws.env in the shared folder with the following content (edit it with your credentials)

AWS_ACCESS_KEY_ID=MYACCESSKEY
AWS_SECRET_ACCESS_KEY=MYSECRETKEY
AWS_DEFAULT_REGION=eu-west-1
AWS_DEFAULT_OUTPUT=json

and then update the alias to use env vars

echo 'alias aws="docker run --rm -ti --env-file=/c/DATI/aws/aws.env -v /c/DATI/aws:/aws amazon/aws-cli"' >> ~/.bashrc
source ~/.bashrc

Test that it works by running

aws --version

AWS CloudFormation

CloudFormation is an Infrastructure as Code (IaC) tool to manage AWS resources. That means the cloud infrastructure is defined in a text file (the code), which is the source used to create all the elements.

Create a template

Let’s start by creating a VPC in the eu-west-1 region with three subnets, each in a different Availability Zone.

We need a template file with the definition of the resources to be created:

AWSTemplateFormatVersion: "2010-09-09"
Description: VPC in Ireland
Resources:
    TestVPC:
        Type: AWS::EC2::VPC
        Properties:
            CidrBlock: "10.99.0.0/16"
            InstanceTenancy: default
            Tags:
                - Key: Name
                  Value: TestVPC
                - Key: Environment
                  Value: Testing
    Subnet1:
        DeletionPolicy: Retain
        Type: AWS::EC2::Subnet
        Properties:
            AvailabilityZone: eu-west-1a
            CidrBlock: "10.99.1.0/24"
            Tags:
                - Key: Name
                  Value: Subnet1a
            VpcId: !Ref TestVPC
    Subnet2:
        Type: AWS::EC2::Subnet
        Properties:
            AvailabilityZone: eu-west-1b
            CidrBlock: "10.99.2.0/24"
            Tags:
                - Key: Name
                  Value: Subnet1b
            VpcId: !Ref TestVPC
    Subnet3:
        Type: AWS::EC2::Subnet
        Properties:
            AvailabilityZone: eu-west-1c
            CidrBlock: "10.99.3.0/24"
            Tags:
                - Key: Name
                  Value: Subnet1c
            VpcId: !Ref TestVPC

Save it in the shared folder, in my case the file is

/c/DATI/aws/createVPC.yaml

Validate the template

aws cloudformation validate-template --template-body file://createVPC.yaml

The template I used for this post is short, with only a few elements. AWS CloudFormation can be used to create most of the resources available in the Console.

Run the template

Now that we have a valid template, we can run CloudFormation to create the VPC and subnets from it:

aws cloudformation create-stack --stack-name CreateMyVPC --template-body file://createVPC.yaml

Expected output

{
    "StackId": "arn:aws:cloudformation:eu-west-1:752155438581:stack/CreateMyVPC/656fa2d0-7506-11ea-9de1-020123456789"
}
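
The validate and create steps can also be chained, so that a stack is only created from a template that passes validation. What follows is a minimal sketch, not a definitive script: create_if_valid is a hypothetical function name, and it assumes the aws command is reachable from where the function is defined. Keep in mind that validate-template only checks the template syntax, so a valid template can still fail at creation time.

```shell
# create_if_valid STACK_NAME TEMPLATE_FILE
# Hypothetical helper: create the stack only when the template passes validation.
create_if_valid() {
    # Discard the validation output: only the exit status matters here.
    if aws cloudformation validate-template --template-body "file://$2" > /dev/null; then
        aws cloudformation create-stack --stack-name "$1" --template-body "file://$2"
    else
        echo "Template $2 failed validation, stack $1 not created" >&2
        return 1
    fi
}

# Usage:
# create_if_valid CreateMyVPC createVPC.yaml
```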

Verify the stack

aws cloudformation describe-stacks --stack-name CreateMyVPC

Expected output

{
    "Stacks": [
        {
            "StackId": "arn:aws:cloudformation:eu-west-1:752155438581:stack/CreateMyVPC/8ffef150-7681-11ea-a3d5-02c18823f600",
            "StackName": "CreateMyVPC",
            "Description": "VPC in Ireland",
            "CreationTime": "2020-04-04T14:35:42.739000+00:00",
            "RollbackConfiguration": {},
            "StackStatus": "CREATE_COMPLETE",
            "DisableRollback": false,
            "NotificationARNs": [],
            "Tags": [],
            "EnableTerminationProtection": false,
            "DriftInformation": {
                "StackDriftStatus": "NOT_CHECKED"
            }
        }
    ]
}
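
A note on timing: create-stack returns as soon as the request is accepted, so describe-stacks may report CREATE_IN_PROGRESS for a while before it shows CREATE_COMPLETE. The sketch below waits for the creation to finish and then prints the final status. wait_and_report is a hypothetical name; the tr call is there on the assumption that the CLI runs through the docker alias with a TTY (-t), which adds carriage returns to the output.

```shell
# wait_and_report STACK_NAME
# Hypothetical helper: block until stack creation finishes, then print the final status.
wait_and_report() {
    rc=0
    # "wait stack-create-complete" polls the stack and returns non-zero
    # if the creation fails or takes too long.
    aws cloudformation wait stack-create-complete --stack-name "$1" || rc=$?
    # --query extracts a single field, --output text prints it without JSON quotes.
    aws cloudformation describe-stacks --stack-name "$1" \
        --query "Stacks[0].StackStatus" --output text | tr -d '\r'
    return "$rc"
}

# Usage:
# wait_and_report CreateMyVPC
```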

We can filter the command output with --query to get only some values

aws cloudformation describe-stacks --stack-name CreateMyVPC --query "Stacks[0].StackStatus"

Output

"CREATE_COMPLETE"

Now take a look in the AWS console: the VPC is there

and the subnets too

Well done!

Drift detection

Elvis used to sing "Are you sorry we drifted apart?"

Sometimes the infrastructure drifts apart from the original design.

Configuration drift occurs when a standardized group of IT resources, built from a standard template, diverges in configuration over time.

If an IaC process is in place, that means somebody messed with the infrastructure, a Hero probably.

AWS has a tool to check whether any element of the running infrastructure has drifted from the template.

Let’s check for drift by starting a new drift detection

aws cloudformation detect-stack-drift --stack-name CreateMyVPC

Output

{
    "StackDriftDetectionId": "74e1a130-750d-11ea-94ec-062924358418"
}

Copy the detection ID: the next command uses it to check the result of the detection

aws cloudformation describe-stack-drift-detection-status --stack-drift-detection-id 74e1a130-750d-11ea-94ec-062924358418

Output

{
    "StackId": "arn:aws:cloudformation:eu-west-1:752155438581:stack/CreateMyVPC/656fa2d0-7506-11ea-9de1-02f0c9b42810",
    "StackDriftDetectionId": "74e1a130-750d-11ea-94ec-062924358418",
    "StackDriftStatus": "IN_SYNC",
    "DetectionStatus": "DETECTION_COMPLETE",
    "DriftedStackResourceCount": 0,
    "Timestamp": "2020-04-02T18:12:04.419000+00:00"
}

Notice that the StackDriftStatus is IN_SYNC, as expected.
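
Drift detection is asynchronous: detect-stack-drift only starts the check and hands back an ID, and the status can stay at DETECTION_IN_PROGRESS for a while. Instead of copying the ID by hand, the two commands can be wrapped in one function. This is a rough sketch under a few assumptions: check_drift and POLL_SECONDS are hypothetical names of mine, and the tr call assumes the CLI runs through the docker alias with a TTY (-t), which adds carriage returns to the output.

```shell
# check_drift STACK_NAME
# Hypothetical helper: start a drift detection, wait for it to finish,
# then print the stack drift status (for example IN_SYNC or DRIFTED).
check_drift() {
    # Capture the detection ID instead of copying it by hand.
    id=$(aws cloudformation detect-stack-drift --stack-name "$1" \
        --query StackDriftDetectionId --output text | tr -d '\r')
    [ -n "$id" ] || { echo "Could not start drift detection for $1" >&2; return 1; }

    # Poll until the detection is no longer in progress.
    while :; do
        status=$(aws cloudformation describe-stack-drift-detection-status \
            --stack-drift-detection-id "$id" \
            --query DetectionStatus --output text | tr -d '\r')
        [ "$status" = "DETECTION_IN_PROGRESS" ] || break
        sleep "${POLL_SECONDS:-5}"
    done

    if [ "$status" != "DETECTION_COMPLETE" ]; then
        echo "Drift detection ended with status: $status" >&2
        return 1
    fi

    aws cloudformation describe-stack-drift-detection-status \
        --stack-drift-detection-id "$id" \
        --query StackDriftStatus --output text | tr -d '\r'
}

# Usage:
# check_drift CreateMyVPC
```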

Use query again for a cleaner output

aws cloudformation describe-stack-drift-detection-status --stack-drift-detection-id 74e1a130-750d-11ea-94ec-062924358418 --query StackDriftStatus

Output

"IN_SYNC"

Now put on your Hero cap, go to the AWS console and delete a subnet to create a drift

Run the detection again

aws cloudformation detect-stack-drift --stack-name CreateMyVPC

Analyze the new drift

aws cloudformation describe-stack-drift-detection-status --stack-drift-detection-id dd493ad0-750d-11ea-a53e-0a6a2f25f2be

Output

{
    "StackId": "arn:aws:cloudformation:eu-west-1:752155438581:stack/CreateMyVPC/656fa2d0-7506-11ea-9de1-02f0c9b42810",
    "StackDriftDetectionId": "dd493ad0-750d-11ea-a53e-0a6a2f25f2be",
    "StackDriftStatus": "DRIFTED",
    "DetectionStatus": "DETECTION_COMPLETE",
    "DriftedStackResourceCount": 1,
    "Timestamp": "2020-04-02T18:14:59.581000+00:00"
}

A new StackDriftDetectionId is created. Notice that the status is now DRIFTED, meaning someone irresponsible changed the infrastructure without going through the IaC workflow and the CloudFormation template.

But what happened exactly? Let’s see the details

aws cloudformation describe-stack-resource-drifts --stack-name CreateMyVPC

In the output we’ll find every resource IN_SYNC except one

{
    "StackId": "arn:aws:cloudformation:eu-west-1:752155438581:stack/CreateMyVPC/    -7506-11ea-9de1-02f0c9b42810",
    "LogicalResourceId": "Subnet3",
    "PhysicalResourceId": "subnet-06cc5d5d4e4008655",
    "ResourceType": "AWS::EC2::Subnet",
    "StackResourceDriftStatus": "DELETED",
    "Timestamp": "2020-04-02T18:15:00.796000+00:00"
},

We found the smoking gun: Subnet3 was deleted.
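
On a larger stack, scrolling through the IN_SYNC entries gets tedious. The same command accepts a status filter, so only the drifted resources are returned. A small sketch, where list_drifted is a hypothetical function name and the tr call again assumes the docker alias with a TTY:

```shell
# list_drifted STACK_NAME
# Hypothetical helper: print one "LogicalResourceId  Status" line
# for each resource that was modified or deleted outside the template.
list_drifted() {
    aws cloudformation describe-stack-resource-drifts --stack-name "$1" \
        --stack-resource-drift-status-filters MODIFIED DELETED \
        --query "StackResourceDrifts[].[LogicalResourceId,StackResourceDriftStatus]" \
        --output text | tr -d '\r'
}

# Usage:
# list_drifted CreateMyVPC
```

For the stack in this post, that should boil the output down to the Subnet3 entry.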

AWS CloudTrail logs can tell us who made the change, so we can gently remind them to be more careful in the future, with some help from our LART.
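
The API call that deleted the subnet is recorded by CloudTrail as a DeleteSubnet event, and its event history can be searched from the CLI. Here is a minimal sketch of that lookup. who_deleted_subnets is a hypothetical name, the credentials in use are assumed to have permission to read CloudTrail events, and the tr call assumes the docker alias with a TTY.

```shell
# who_deleted_subnets
# Hypothetical helper: list the DeleteSubnet calls found in the CloudTrail
# event history of the current region, one "time  user" line per event.
who_deleted_subnets() {
    aws cloudtrail lookup-events \
        --lookup-attributes AttributeKey=EventName,AttributeValue=DeleteSubnet \
        --query "Events[].[EventTime,Username]" \
        --output text | tr -d '\r'
}

# Usage:
# who_deleted_subnets
```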

Wrap up

IaC tools exist and are getting better and better at their job.

Migrating existing infrastructures and shifting the culture of a team to IaC may be the really hard task, much more so than the technology.

The migration of some workloads to the cloud may be the best chance to start greenfield and try an IaC approach.

For network engineers like me, the AWS template plays the role of a source of truth (SoT) like NetBox, and CloudFormation plays the role of Ansible or Nornir. Take a look at Cisco NX-OS Netbox Sync for example.

In any case, IaC has many advantages, and it makes sense to apply some, if not all, of its workflows to most IT infrastructure.

The lesson the public cloud is teaching is that a different and better way to create and manage networks and IT infrastructures in general exists and it is a mistake to ignore it.