EC2 Custom vs Out-Of-The-Box Metrics

Standard metric time for EC2 cloudwatch metrics is 5 minutes
Detailed monitoring is 1 minute
High resolution monitoring allows faster resolution (can be as low as 1 second)

EC2 CloudWatch metrics out-of-the-box

CPU
Network
Disk
Status Check

Note: RAM is a custom metric

Setting up a Custom EC2 Metric

Install the cloudwatch agent on your machine (using user data startup script)
You use a user data script to configure your EC2
User data scripts must provider their interpreter (Use a shebang (#!/bin/bash)
Scripts are executed as root
On Ubuntu you can run this as a crontab (put a shell script into a cron folder)

The Four EC2 Pricing Models

On Demand
Reserved
- Has an Upfront price (fixed cost)
- 3 year term (check this?)
- Useful for repeating resources (day/week/year)
Spot
- Very low prices
Dedicated
- No virtualisation with other tenants
- Good for certain licenses (Oracle)

EBS, The 4 Volume Types

Solid State Drives (They’re expensive, but much quicker)
GP2
- Is a general purpose volume (generic use cases)
- Up to 10,000 iops
- Performance is linked to the size of the volume
- Allows IOPS bursting
IO1
- Highest performance volume
- Mission critical stuff
- Provisioned IOPS
- Used for high throughput databases etc
- Hard Disk Drives (They’re cheaper, but performance is massively reduced)
ST1
- Can’t be boot, quite slow still
- Big data (non-production, but still performant)
SC1
- Basically for very low cost (but slow)
- Used for large and infrequent access
- Has a lower storage cost

EBS Volume Pre-Warming

Pre-warming is an optimisation to prevent cold access of EBS blocks
There is no need to warm EBS any more unless you’re restoring from S3, where you may want to pre-warm as each accessed block will have a “first touch penalty”.
- You can warm EBS with a bash command that touches all files internally
- Use the dd or fio command and read all files to /dev/null as a pre-warming process.

EBS Cloudwatch

Read/Writeops (I/O operations per second)
- Amount of IOPS per second
Queue length
- You really don’t want anything queuing

The Four EBS Volume Statuses

OK — Volume is running fine
Warning — Degraded performance, pushing to queue?
Impaired — Not working, maybe I/O is disabled?
Insufficient Data — Not enough data!

Applying changes: When you modify block storage, you might need to adjust your OS to see the higher volume size. You can also apply changes whilst the EBS is running.

ELBs

ELB have listeners (for connections using defined protocol and ports)
Register health checks on the ELB to ensure that service health (before routing)
You can load balance across regular instances and serverless
By default load balancers distribute across specified AZ’s

Types of ELB

Application Load Balancer
- Supports path based routing
- Can forward requests with modified headers (operates at application level)
Network Load Balancer
- Transport layer routing (TCP/SSL)
- Very fast, millions of RPS
- Forwards requests without modifying headers (it can’t)
- For very low latency needs
Classic Load Balancer (to be deprecated)
- Supports sticky sessions (using app generated cookies)
Ephemeral ports?

Monitoring ELB

Cloud Watch (for 4/5xx’s, healthy hosts count)
Access Logs (are disabled by default)
- They dump to S3 so can be hard to analyse (require 3rd party tool)
Request Tracing
- They’re good because they can access data after an ELB scales down
- Adds a header to requests before sending on (so they can be correlated)

Cloudwatch Dashboards

Dashboards aren’t regional specific
- Can add lines, stacks, numbers and charts
View data usage, memory
Remember to save your dashboard

Creating a Billing Alarm

You can setup a threshold when you want to be alerted
Create people who are to be alerted

CloudWatch Vs CloudTrail vs Config

Cloudwatch
- For logs and monitoring of applications (performance monitoring)
CloudTrail
- Allows you to see what changes have been made to the AWS API
Config
- Records the state of AWS environments

Elasticache monitoring metrics

CPU utilisation
Swap usage
- You don’t want to be using the swap file
- Usually equal to the size of your general elasticache
Evictions
- When an old piece of memory is swapped with a newer
Concurrent connections
- Find out if you are not releasing connections properly (assumed same as RDS)

AWS Organisations

Purpose of AWS orgs
AWS Organisations allow you to manage multiple AWS accounts
You can apply policies to your organisation that allow you to control access to AWS accounts
You can automate account access / creation
Can group accounts (departments, business functions etc)
Used to consolidate billing across different accounts
Can restrict what services people use

Tags & Resource Groups

Tags are key / value pairs (same as resources)
Resource groups are groups of tagged resources
AWS systems manager

AWS Config?

Is a management tool that records configuration changes
Is regional (needs to be turned on a per region basis)
Can schedule a lambda to be triggered on config change
Pipes results into S3 for analysis later
Can trigger periodically or on change
You can have up to 40 managed rules
Config needs read access to resources it’s tracking
Needs publish access to SNS to trigger notifications

Questions

What is AWS Config?
Read: https://aws.amazon.com/config/faq/
How to setup a billing alarm?
Understand resource groups better
Understand AWS config better

️️✅ 2 - Monitoring