AWS (Amazon Web Services)

Cloud platform that runs half the internet. 200+ services (you'll use maybe 10). 33+ regions globally. Industry standard for cloud infrastructure. Pricing is a mystery, bills are scary, but it works.

2026 Update

AWS continues to dominate cloud infrastructure with over 200 services. Focus on core services (EC2, S3, Lambda, RDS) and learn IAM inside-out. Cost optimization is more critical than ever.

Quick Hits

Essential Services

# EC2 - Virtual machines (you'll use this)
aws ec2 run-instances \
  --image-id ami-xxx \
  --instance-type t3.micro \
  --key-name my-key # (1)!

# S3 - Object storage (everyone uses this)
aws s3 cp file.txt s3://bucket-name/
aws s3 sync ./local s3://bucket/path --delete # (2)!

# Lambda - Serverless functions (scales like crazy)
# AWS CLI v2 expects base64 payloads unless told otherwise
aws lambda invoke \
  --function-name my-function \
  --cli-binary-format raw-in-base64-out \
  --payload '{"key":"value"}' \
  output.txt # (3)!

# RDS - Managed databases (don't run your own DB)
aws rds describe-db-instances
aws rds create-db-snapshot \
  --db-instance-identifier prod-db \
  --db-snapshot-identifier prod-db-pre-change # (4)!

# IAM - Identity management (painful but critical)
aws iam create-user --user-name dev-user
aws sts get-caller-identity  # "Who the fuck am I?"
aws iam list-attached-user-policies --user-name dev-user # (5)!

# CloudWatch - Logs and monitoring (set billing alarms!)
aws logs tail /aws/lambda/my-function --follow
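aws cloudwatch put-metric-alarm \
  --region us-east-1 --namespace AWS/Billing \
  --alarm-name BillingAlarm \
  --alarm-description 'Alert when bill exceeds $100' \
  --metric-name EstimatedCharges --dimensions Name=Currency,Value=USD \
  --statistic Maximum --period 21600 --evaluation-periods 1 \
  --comparison-operator GreaterThanThreshold \
  --threshold 100 # (6)!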

# ECS/EKS - Container orchestration
aws ecs list-clusters
aws eks list-clusters
aws eks update-kubeconfig --name my-cluster # (7)!

(1) SSH key for instance access - create with aws ec2 create-key-pair
(2) --delete removes files from S3 that don't exist locally - dangerous!
(3) Lambda invoke is synchronous by default - use --invocation-type Event for async
(4) Always snapshot before major changes - saved my ass multiple times
(5) Check attached policies when debugging "Access Denied" errors
(6) Set this up DAY ONE - learn from others' $10k+ billing surprises (billing metrics live in us-east-1 and need billing alerts enabled)
(7) Updates your kubeconfig for kubectl access

Real talk:

Start with EC2, S3, RDS - that's 80% of use cases
IAM is hell, but you MUST learn it - security nightmare otherwise
Enable MFA on root account RIGHT NOW (seriously, stop reading and do it)
us-east-1 is among the cheapest regions, but its outages are the ones that make headlines (Murphy's law applies)
Use --profile for multiple accounts (you'll have dev/staging/prod)
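
The billing alarm from the CLI snippet above can also be set up from Python. This is a minimal sketch, not a definitive setup: `billing_alarm_params` and its defaults are hypothetical names, and it only builds the keyword arguments so you can inspect them without touching AWS. It assumes billing alerts are enabled in your account's billing preferences and that the alarm is created in us-east-1.

```python
def billing_alarm_params(threshold_usd, alarm_name="BillingAlarm", sns_topic_arn=None):
    """Build keyword arguments for CloudWatch put_metric_alarm (billing metric).

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    params = {
        "AlarmName": alarm_name,
        "AlarmDescription": f"Alert when estimated charges exceed ${threshold_usd}",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # 6 hours; billing data only updates a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Without an action the alarm changes state but notifies nobody
        params["AlarmActions"] = [sns_topic_arn]
    return params


if __name__ == "__main__":
    params = billing_alarm_params(100)
    print(params["AlarmDescription"])
    # To create the alarm for real (needs boto3 and credentials):
    #   import boto3
    #   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```

Pass an SNS topic ARN if you want an email or chat notification; the alarm alone just flips state in the console.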

Common Patterns

import json

import boto3
from boto3.exceptions import S3UploadFailedError
from botocore.exceptions import ClientError

# S3 upload with proper error handling
def upload_to_s3(file_path, bucket, key):
    """Upload file to S3 with private ACL."""
    s3 = boto3.client('s3')
    try:
        s3.upload_file(
            file_path,
            bucket,
            key,
            ExtraArgs={'ACL': 'private'}  # Don't leak shit # (1)!
        )
        return True
    except (ClientError, S3UploadFailedError) as e:
        # upload_file wraps most S3 errors in S3UploadFailedError
        print(f"Upload failed: {e}")
        return False

# Lambda handler pattern (use this)
def lambda_handler(event, context):
    """Standard Lambda handler with proper error handling."""
    try:
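        # Parse event (API Gateway, SQS, etc.)
        # 'body' is None (not missing) when the request has no payload
        body = json.loads(event.get('body') or '{}')

        # Do work (process_data is your own business logic, not defined here)
        result = process_data(body)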

        # Return proper response
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'  # Adjust for prod # (2)!
            },
            'body': json.dumps(result)
        }
    except Exception as e:
        print(f"Error: {e}")  # Goes to CloudWatch # (3)!
        return {'statusCode': 500, 'body': 'Internal error'}

# DynamoDB pattern (NoSQL done right)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Users')

# Query (efficient) - use this
response = table.query(
    KeyConditionExpression='user_id = :uid',
    ExpressionAttributeValues={':uid': '12345'}
) # (4)!

# Scan (expensive) - avoid in prod
response = table.scan(Limit=100)  # Will cost $$$$ at scale # (5)!

(1) Always set ACL to private unless you specifically need public access
(2) Lock down CORS in production - * is for development only
(3) Lambda logs go to CloudWatch automatically - use structured logging for production
(4) Queries read only the items under one partition key (table or index) - fast and cheap
(5) Scans read the entire table - slow and expensive, only for admin tasks
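
One thing the query pattern above glosses over: a single Query call returns at most 1 MB of data, and DynamoDB signals more pages with `LastEvaluatedKey`. Here is a minimal sketch of following that pagination. `query_all` and `FakeTable` are hypothetical names; `FakeTable` is only a stand-in so the loop can run locally, and in real use you would pass `boto3.resource('dynamodb').Table('Users')`.

```python
def query_all(table, key_name, key_value):
    """Collect every item for one partition key, following pagination."""
    items = []
    kwargs = {
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_name},
        "ExpressionAttributeValues": {":v": key_value},
    }
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume where the previous page stopped
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table that returns canned pages (local runs only)."""

    def __init__(self, pages):
        self._pages = list(pages)
        self.calls = []

    def query(self, **kwargs):
        self.calls.append(kwargs)
        return self._pages[len(self.calls) - 1]


if __name__ == "__main__":
    fake = FakeTable([
        {"Items": [{"user_id": "12345", "order": 1}],
         "LastEvaluatedKey": {"user_id": "12345", "order": 1}},
        {"Items": [{"user_id": "12345", "order": 2}]},
    ])
    print(len(query_all(fake, "user_id", "12345")), "items")
```

For very large partitions you may prefer to yield pages instead of building one list in memory.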

# CloudFormation/SAM pattern (infrastructure as code)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.12  # Use latest runtime # (1)!
      Handler: app.lambda_handler
      Timeout: 30  # Seconds - adjust based on workload
      MemorySize: 512  # MB - more memory = faster CPU # (2)!
      Environment:
        Variables:
          TABLE_NAME: !Ref MyTable
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref MyTable  # Least privilege # (3)!

  MyTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST  # No capacity planning # (4)!
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      StreamSpecification:  # For DynamoDB Streams
        StreamViewType: NEW_AND_OLD_IMAGES # (5)!

(1) python3.12 has been available on Lambda since December 2023 - check the supported runtimes list and use the newest one your dependencies allow
(2) Lambda charges by GB-seconds - 512MB is sweet spot for most workloads
(3) Only grant permissions this function actually needs
(4) Pay per request - no provisioned capacity, scales automatically
(5) Streams enable event-driven architectures - trigger Lambda on changes
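
To make the GB-seconds note concrete, here is a rough cost sketch. The default prices are assumptions (x86 list prices in us-east-1 at the time of writing, free tier ignored), and `lambda_monthly_cost` is a hypothetical helper, so check the current pricing page before trusting the dollar figures.

```python
def lambda_monthly_cost(memory_mb, avg_duration_ms, invocations,
                        price_per_gb_second=0.0000166667,   # assumed list price
                        price_per_million_requests=0.20):   # assumed list price
    """Estimate Lambda cost from memory size, average duration, and call count."""
    # GB-seconds = memory in GB x duration in seconds x number of invocations
    gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000) * invocations
    compute_usd = gb_seconds * price_per_gb_second
    request_usd = invocations / 1_000_000 * price_per_million_requests
    return {
        "gb_seconds": gb_seconds,
        "compute_usd": compute_usd,
        "request_usd": request_usd,
        "total_usd": compute_usd + request_usd,
    }


if __name__ == "__main__":
    # Same workload at two memory sizes; more memory often shortens duration,
    # so the bigger setting is not automatically the more expensive one.
    for memory_mb, duration_ms in ((256, 400), (512, 180)):
        cost = lambda_monthly_cost(memory_mb, duration_ms, invocations=5_000_000)
        print(memory_mb, "MB:", round(cost["total_usd"], 2), "USD/month")
```

The durations in the usage example are made up; measure your own function before picking a memory size.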

Why this works:

Boto3 is official AWS SDK - well maintained, good docs
Error handling prevents silent failures
CloudFormation/SAM enables version control for infrastructure
DynamoDB queries scale better than scans (use indexes!)
Lambda handler pattern is battle-tested across millions of functions
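
The handler pattern is easy to exercise locally before deploying. A minimal sketch, assuming an API Gateway proxy-style event: `process_data` here is a hypothetical stand-in for your business logic, and the 400 branch for malformed JSON is an addition to the pattern above, not part of it.

```python
import json


def process_data(body):
    """Hypothetical business logic: echo the payload back."""
    return {"received": body, "ok": True}


def lambda_handler(event, context):
    """Handler that tolerates an empty body and rejects malformed JSON."""
    headers = {"Content-Type": "application/json"}
    try:
        # API Gateway sends body=None for requests without a payload
        body = json.loads(event.get("body") or "{}")
        result = process_data(body)
        return {"statusCode": 200, "headers": headers, "body": json.dumps(result)}
    except json.JSONDecodeError:
        return {"statusCode": 400, "headers": headers,
                "body": json.dumps({"error": "Invalid JSON"})}
    except Exception as exc:
        print(f"Error: {exc}")  # ends up in CloudWatch Logs when deployed
        return {"statusCode": 500, "headers": headers,
                "body": json.dumps({"error": "Internal error"})}


if __name__ == "__main__":
    for event in ({"body": '{"key": "value"}'}, {"body": None}, {"body": "not json"}):
        print(lambda_handler(event, None)["statusCode"])
```

Calling the handler with plain dicts like this also works in unit tests, no AWS account needed.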

Pro Tips & Gotchas

Cost Optimization (your CFO will thank you)

Reserved Instances: up to 72% savings for predictable workloads (1- or 3-year commitment)
Spot Instances: up to 90% savings for batch jobs (can be reclaimed with a 2-minute warning)
S3 Intelligent-Tiering: Automatic cost optimization based on access patterns
CloudWatch billing alarms: Set up IMMEDIATELY - prevent $10k+ surprises
Delete unused resources: Snapshots, AMIs, elastic IPs add up fast
Use AWS Cost Explorer: Analyze spending patterns, identify waste
Tag everything: Enable cost allocation by project/team/environment
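
"Tag everything" only works if something checks for missing tags. A small sketch of that check, assuming the `[{"Key": ..., "Value": ...}]` tag shape that EC2 and several other services return; `REQUIRED_TAGS` and `missing_tags` are hypothetical names, and the case-insensitive match is a design choice (AWS tag keys themselves are case-sensitive).

```python
REQUIRED_TAGS = ("project", "team", "environment")  # assumption: your tagging policy


def missing_tags(tag_list, required=REQUIRED_TAGS):
    """Return the required tag keys that a resource does not carry."""
    present = {tag["Key"].lower() for tag in tag_list or []}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    instance_tags = [
        {"Key": "Project", "Value": "checkout"},
        {"Key": "Environment", "Value": "prod"},
    ]
    print(missing_tags(instance_tags))
```

Feed it the `Tags` field from calls such as EC2 `describe_instances` and report anything that comes back non-empty.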

Security (don't get hacked)

Never hardcode credentials - use IAM roles, instance profiles, or Systems Manager
Enable CloudTrail + GuardDuty - detect breaches before bankruptcy
Systems Manager Parameter Store - standard tier is free (up to 10k parameters); use SecureString parameters for KMS encryption at rest
VPC Flow Logs - network debugging and security analysis
Least privilege IAM policies - start restrictive, open up as needed
MFA on root account - this is non-negotiable, do it now
AWS Security Hub - centralized security findings (2026 standard)
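
"Start restrictive" is easier to see in an actual policy document. A minimal sketch of a read-only policy scoped to one bucket prefix: `read_only_bucket_policy` is a hypothetical helper and the bucket name in the usage example is made up. It only builds the JSON; attaching it to a role or user is left to your IaC tool.

```python
import json


def read_only_bucket_policy(bucket, prefix=""):
    """Build an IAM policy allowing reads under one prefix of one bucket."""
    prefix = prefix.strip("/")
    key_pattern = f"{prefix}/*" if prefix else "*"
    list_statement = {
        "Sid": "ListBucket",
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": f"arn:aws:s3:::{bucket}",
    }
    if prefix:
        # Restrict listing to the same prefix, not the whole bucket
        list_statement["Condition"] = {"StringLike": {"s3:prefix": [key_pattern]}}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{key_pattern}",
            },
            list_statement,
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_bucket_policy("example-reports-bucket", "2026/"), indent=2))
```

No wildcards in `Action`, no `Resource: "*"`: widen either only when a real Access Denied tells you to.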

Performance

Keep traffic in the same region/AZ - cross-region traffic costs money and adds latency (often 100ms+)
CloudFront CDN - S3 alone is slow for users, CDN is sub-50ms globally
RDS read replicas - scale read-heavy workloads horizontally
ElastiCache - Redis/Memcached for sub-ms caching (game changer)
Lambda provisioned concurrency - eliminate cold starts for critical paths
Use AWS PrivateLink - avoid internet gateway for service-to-service

Gotchas (learn from others' pain)

Data transfer OUT - in is free, out costs $$$ (especially cross-region)
NAT Gateway costs - can exceed EC2 instance costs (use VPC endpoints instead)
CloudWatch Logs - verbose logging = expensive ingestion (about $0.50/GB in us-east-1, plus storage)
DynamoDB scans - will bankrupt you at scale, always use queries with indexes
Lambda cold starts - 1-2 seconds for large functions (use provisioned concurrency)
EBS snapshots - incremental but deletions are confusing (read the docs!)
S3 bucket policies - one wrong character = data leak (test with the IAM Policy Simulator)

Monitoring & Observability

CloudWatch - included but basic, good enough for small/medium workloads
X-Ray - distributed tracing for microservices debugging (2026 essential)
Datadog/New Relic - consider for serious production monitoring
Set up alarms for: billing, CPU >80%, disk >90%, error rates >1%
Use CloudWatch Logs Insights - query logs with a SQL-like, pipe-based syntax
CloudWatch RUM - real user monitoring for frontend performance (2026)
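
For a feel of the Logs Insights query syntax, here is a sketch that builds the arguments for the CloudWatch Logs `start_query` call. `insights_query_params` and `ERROR_QUERY` are hypothetical names, the log group in the usage example is the one from the CLI snippet earlier, and the function deliberately stops short of calling AWS.

```python
import time

# Most recent 20 log lines containing "ERROR"
ERROR_QUERY = (
    "fields @timestamp, @message "
    "| filter @message like /ERROR/ "
    "| sort @timestamp desc "
    "| limit 20"
)


def insights_query_params(log_group, minutes=60, query=ERROR_QUERY, now=None):
    """Build keyword arguments for logs.start_query over the last N minutes."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - minutes * 60,  # epoch seconds
        "endTime": end,
        "queryString": query,
    }


if __name__ == "__main__":
    params = insights_query_params("/aws/lambda/my-function", minutes=15)
    print(params["queryString"])
    # To run it for real (needs boto3 and credentials):
    #   import boto3
    #   logs = boto3.client("logs")
    #   query_id = logs.start_query(**params)["queryId"]
    #   results = logs.get_query_results(queryId=query_id)  # poll until status is Complete
```

Queries are billed by data scanned, so keep the time window as narrow as the investigation allows.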

When NOT to use AWS

Small personal projects - Vercel/Netlify/Railway way easier (and cheaper)
Vendor lock-in concerns - consider Kubernetes on any cloud
Tiny budget - the free tier is time-limited (12 months on legacy accounts, credit-based on newer ones), then bills start
No cloud experience - steep learning curve, invest time in fundamentals first
Compliance hell - some industries require on-prem (banking, healthcare)

Learning Paths

Free Resources

AWS Skill Builder - Official training, tons of free courses (start here)
AWS Free Tier - 12 months free for core services on legacy accounts; accounts created since mid-2025 get a credit-based free plan instead (stay within limits!)
freeCodeCamp AWS Course - 10+ hour deep dive, quality content
AWS Workshops - Hands-on labs, various topics
A Cloud Guru Free Tier - Quality video courses
AWS Getting Started Guides - Official tutorials
AWS re:Post - Official Q&A platform (replaced the AWS Forums in 2021)

Interactive Labs

AWS Sandbox Accounts - Official hands-on tutorials in real AWS
Qwiklabs AWS - Temporary accounts for safe experimentation
Instruqt AWS Labs - Browser-based scenarios
LocalStack - Run AWS locally for development (Pro version worth it)
AWS CloudShell - Browser-based shell with AWS CLI pre-installed

Certifications Worth It

Recommended Path

Solutions Architect Associate - Most valuable, industry standard
Developer Associate - If you code daily on AWS
Skip others unless employer pays or senior role requires

Cloud Practitioner - $100, easiest, good starting point if totally new
Solutions Architect Associate - $150, most popular, worth it for resume (this one matters)
Developer Associate - $150, worth it if you code on AWS daily
SysOps Administrator Associate - $150, operations-focused
Skip unless senior/employer pays: Professional certs ($300), Specialty certs ($300) - overkill for most

Reality check:

Solutions Architect Associate is the sweet spot (most job postings ask for this)
Study 2-3 months with hands-on practice, exams are scenario-based
Use Tutorials Dojo practice exams ($15, best investment)
Join r/AWSCertifications for study tips

Projects to Build

Beginner (learn the basics)

Static website - S3 + CloudFront + Route53 (learn storage + CDN)
Serverless URL shortener - Lambda + DynamoDB + API Gateway
EC2 web server - Deploy LAMP/NGINX stack manually

Intermediate (portfolio-worthy)

REST API - Lambda + API Gateway + DynamoDB + Cognito auth
File processing pipeline - S3 triggers Lambda, stores results in RDS
CI/CD pipeline - CodePipeline + CodeBuild + ECR + ECS
Serverless blog - Amplify + Lambda + DynamoDB + S3

Advanced (job-interview flex)

Multi-region application - Route 53 failover + RDS cross-region replicas
Event-driven microservices - SQS/SNS/EventBridge architecture
Cost optimization dashboard - Lambda + Cost Explorer API + QuickSight
Real-time analytics - Kinesis Data Streams + Lambda + Timestream

Architecture Patterns

Common AWS architecture patterns for real-world applications.

Architecture Diagram Resources

AWS Architecture Icons - Official icon set for creating AWS architecture diagrams (PPT, Draw.io, Visio formats). Essential for documentation and presentations.

Three-Tier Web Application

Classic pattern: presentation, application, data layers with high availability.

graph TB
    subgraph "Internet"
        Users[Users]
    end

    subgraph "AWS Cloud"
        subgraph "Availability Zone 1"
            ALB1[Application<br/>Load Balancer]
            EC2_1[EC2 Instance<br/>App Server]
            RDS_Primary[(RDS Primary<br/>PostgreSQL)]
        end

        subgraph "Availability Zone 2"
            EC2_2[EC2 Instance<br/>App Server]
            RDS_Standby[(RDS Standby<br/>Multi-AZ Failover)]
        end

        S3[S3 Bucket<br/>Static Assets]
        CloudFront[CloudFront CDN]
    end

    Users -->|HTTPS| CloudFront
    CloudFront -->|Static| S3
    CloudFront -->|Dynamic| ALB1
    ALB1 --> EC2_1
    ALB1 --> EC2_2
    EC2_1 -->|Read/Write| RDS_Primary
    EC2_2 -->|Read/Write| RDS_Primary
    RDS_Primary -.->|Replicate| RDS_Standby

    style Users fill:#e1f5ff
    style CloudFront fill:#ff9900
    style S3 fill:#569a31
    style ALB1 fill:#ff9900
    style EC2_1 fill:#ff9900
    style EC2_2 fill:#ff9900
    style RDS_Primary fill:#527fff
    style RDS_Standby fill:#527fff

Components:

CloudFront: Global CDN, caches static assets at edge locations
S3: Object storage for images, CSS, JavaScript
ALB: Distributes traffic across EC2 instances in multiple AZs
EC2: Application servers running in Auto Scaling group
RDS: Managed PostgreSQL with multi-AZ failover

Real talk: This pattern comfortably handles 10k-100k requests/day. Tune the Auto Scaling group for growth beyond that.

Serverless Microservices

Event-driven architecture with Lambda, API Gateway, and DynamoDB.

graph LR
    Client[Mobile/Web<br/>Client] -->|HTTPS| APIGW[API Gateway<br/>REST API]
    APIGW -->|Invoke| Auth[Lambda<br/>Auth Function]
    APIGW -->|Invoke| Users[Lambda<br/>Users Service]
    APIGW -->|Invoke| Orders[Lambda<br/>Orders Service]

    Auth -->|Read/Write| Cognito[Cognito<br/>User Pool]
    Users -->|Read/Write| UserDB[(DynamoDB<br/>Users Table)]
    Orders -->|Read/Write| OrderDB[(DynamoDB<br/>Orders Table)]

    Orders -->|Publish| SNS[SNS Topic<br/>Order Events]
    SNS -->|Subscribe| Email[Lambda<br/>Email Service]
    SNS -->|Subscribe| SQS[SQS Queue]
    SQS -->|Process| Worker[Lambda<br/>Worker Function]

    style Client fill:#e1f5ff
    style APIGW fill:#ff9900
    style Auth fill:#ff9900
    style Users fill:#ff9900
    style Orders fill:#ff9900
    style Email fill:#ff9900
    style Worker fill:#ff9900
    style Cognito fill:#dd344c
    style UserDB fill:#527fff
    style OrderDB fill:#527fff
    style SNS fill:#ff9900
    style SQS fill:#ff9900

Components:

API Gateway: RESTful API with authentication, rate limiting, caching
Lambda: Stateless functions, auto-scale, pay-per-invocation
DynamoDB: NoSQL database with single-digit millisecond latency
SNS/SQS: Async messaging for decoupled microservices
Cognito: User authentication and authorization

Real talk: Scales to millions of requests, costs pennies at low traffic. Cold starts are 100-500ms.
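
The SNS to SQS to Lambda leg of the diagram has one recurring surprise: the SQS message body is a JSON envelope from SNS, with the real payload as a string under `Message` (unless raw message delivery is enabled on the subscription). A minimal sketch of the worker, assuming `ReportBatchItemFailures` is turned on for the event source mapping; `handle_order_event` and the `order_id` field are hypothetical.

```python
import json


def handle_order_event(order):
    """Hypothetical business logic; raise to mark the message as failed."""
    if "order_id" not in order:
        raise ValueError("missing order_id")


def worker_handler(event, context):
    """Process an SQS batch and report only the failed messages for retry."""
    failures = []
    for record in event.get("Records", []):
        try:
            envelope = json.loads(record["body"])
            # SNS wraps the payload in "Message" unless raw delivery is on
            if isinstance(envelope, dict) and "Message" in envelope:
                payload = json.loads(envelope["Message"])
            else:
                payload = envelope
            handle_order_event(payload)
        except Exception as exc:
            print(f"Failed {record.get('messageId')}: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    # Messages not listed here are deleted from the queue
    return {"batchItemFailures": failures}


if __name__ == "__main__":
    good = {"messageId": "1",
            "body": json.dumps({"Message": json.dumps({"order_id": "A-1"})})}
    bad = {"messageId": "2", "body": json.dumps({"Message": "{}"})}
    print(worker_handler({"Records": [good, bad]}, None))
```

Without partial batch responses, one bad message makes the whole batch retry, which is how duplicate emails happen.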

Data Pipeline Architecture

ETL pattern for processing large datasets with S3, Glue, and Athena.

graph TB
    Sources[Data Sources<br/>Logs, APIs, Databases] -->|Stream| Kinesis[Kinesis Data<br/>Streams]
    Sources -->|Batch| S3_Raw[S3 Raw Bucket<br/>Landing Zone]

    Kinesis -->|Real-time| Firehose[Kinesis Data<br/>Firehose]
    Firehose --> S3_Raw

    S3_Raw -->|Trigger| Glue[AWS Glue<br/>ETL Jobs]
    Glue -->|Transform| S3_Processed[S3 Processed<br/>Parquet Format]

    S3_Processed -->|Catalog| GlueCatalog[Glue Data<br/>Catalog]
    GlueCatalog -->|Query| Athena[Athena<br/>SQL Queries]
    GlueCatalog -->|Visualize| QuickSight[QuickSight<br/>Dashboards]

    S3_Processed -->|Train| SageMaker[SageMaker<br/>ML Models]

    style Sources fill:#e1f5ff
    style Kinesis fill:#ff9900
    style Firehose fill:#ff9900
    style S3_Raw fill:#569a31
    style Glue fill:#ff9900
    style S3_Processed fill:#569a31
    style GlueCatalog fill:#ff9900
    style Athena fill:#ff9900
    style QuickSight fill:#ff9900
    style SageMaker fill:#ff9900

Components:

Kinesis: Real-time data streaming (alternative to Kafka)
S3: Data lake storage (raw and processed data)
Glue: Serverless ETL, converts JSON/CSV to optimized Parquet
Athena: Query S3 data with SQL, pay per query ($5/TB scanned)
QuickSight: BI dashboards, ML-powered insights

Real talk: Athena bills about $5 per terabyte scanned, so a well-partitioned query costs cents. Use Parquet format (roughly 10x cheaper queries than JSON).
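
The Parquet claim is just arithmetic on bytes scanned. A quick sketch, assuming the $5/TB price from the component list and treating a terabyte as 2^40 bytes; `athena_query_cost` is a hypothetical helper and the 10x reduction is the rule of thumb from above, not a measurement.

```python
TB = 1024 ** 4  # assumption: binary terabyte


def athena_query_cost(bytes_scanned, price_per_tb=5.0):
    """Estimate the cost of one Athena query from the bytes it scans."""
    return bytes_scanned / TB * price_per_tb


if __name__ == "__main__":
    json_scan = 2 * TB           # full scan of raw JSON
    parquet_scan = json_scan / 10  # columnar + compressed, rule-of-thumb 10x less
    print("JSON:   ", round(athena_query_cost(json_scan), 2), "USD")
    print("Parquet:", round(athena_query_cost(parquet_scan), 2), "USD")
```

Partitioning by date cuts the scanned bytes further, since Athena can skip whole prefixes.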

Well-Architected Framework

Five of the six AWS Well-Architected pillars for building reliable, secure, efficient systems (the sixth, Sustainability, isn't covered here).

Security

Design Principles:

Identity and Access Management - Use IAM roles, never embed credentials
Detective Controls - Enable CloudTrail, GuardDuty, Config
Infrastructure Protection - VPC isolation, security groups, NACLs
Data Protection - Encrypt at rest (KMS) and in transit (TLS)
Incident Response - Automated remediation with Lambda

Security Checklist

[ ] Root account MFA enabled
[ ] IAM users have MFA
[ ] S3 buckets are private (no public access)
[ ] RDS encryption enabled
[ ] CloudTrail logging to S3
[ ] GuardDuty threat detection active
[ ] Security groups follow least privilege
[ ] Secrets stored in Secrets Manager
[ ] VPC Flow Logs enabled
[ ] AWS Config rules for compliance
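
Checklist items like "S3 buckets are private" can be verified in code rather than by clicking around. A minimal sketch for that one item: it checks the four S3 Block Public Access flags on a configuration dict. `public_access_fully_blocked` is a hypothetical helper, and fetching the config from AWS is only shown in a comment.

```python
PAB_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def public_access_fully_blocked(config):
    """True only if all four Block Public Access flags are explicitly on."""
    return all(config.get(flag) is True for flag in PAB_FLAGS)


if __name__ == "__main__":
    sample = {flag: True for flag in PAB_FLAGS}
    sample["BlockPublicPolicy"] = False
    print(public_access_fully_blocked(sample))
    # To check a real bucket (needs boto3 and credentials):
    #   import boto3
    #   s3 = boto3.client("s3")
    #   config = s3.get_public_access_block(Bucket="my-bucket")["PublicAccessBlockConfiguration"]
    #   print(public_access_fully_blocked(config))
```

A missing flag counts as a failure on purpose; "not configured" should never pass a security check.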

Reliability

Design Principles:

Multi-AZ Deployment - RDS, ALB, EC2 across 2+ availability zones
Auto Scaling - Respond to demand changes automatically
Backup and Recovery - Automated snapshots, cross-region replication
Change Management - Infrastructure as code (CloudFormation/Terraform)
Failure Isolation - Bulkheads prevent cascading failures

Reliability Targets

Availability	Downtime/Year	Architecture
99.0% (2 nines)	3.65 days	Single AZ
99.9% (3 nines)	8.76 hours	Multi-AZ
99.95%	4.38 hours	Multi-AZ + Auto Scaling
99.99% (4 nines)	52.56 minutes	Multi-region
99.999% (5 nines)	5.26 minutes	Multi-region + Failover
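
The downtime column follows directly from the availability percentage. A small sketch of the arithmetic, assuming a 365-day year (leap years ignored); `downtime_minutes_per_year` is a hypothetical helper.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, leap years ignored


def downtime_minutes_per_year(availability_percent):
    """Allowed downtime per year for a given availability target."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.95, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}%: {minutes:.2f} minutes ({minutes / 60:.2f} hours)")
```

Each extra nine cuts the budget by 10x, which is why the architecture column gets expensive so quickly.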

Performance Efficiency

Design Principles:

Selection - Choose right compute (EC2 vs Lambda vs Fargate)
Review - Continuously evaluate new services
Monitoring - CloudWatch metrics, X-Ray tracing
Trade-offs - Consistency vs latency, normalization vs denormalization

Service Selection Guide

graph TD
    Start{Compute Need?} -->|Containers| Container{Orchestration?}
    Start -->|VMs| VM{Persistent?}
    Start -->|Functions| Lambda[Lambda<br/>Event-driven]

    Container -->|Yes| EKS[EKS<br/>Kubernetes]
    Container -->|No| ECS[ECS/Fargate<br/>Simpler]

    VM -->|Yes| EC2[EC2<br/>Full Control]
    VM -->|No| Batch[AWS Batch<br/>Job Scheduling]

    style Start fill:#e1f5ff
    style Lambda fill:#ff9900
    style EKS fill:#ff9900
    style ECS fill:#ff9900
    style EC2 fill:#ff9900
    style Batch fill:#ff9900
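
The flowchart above is small enough to write down as a function, which can be handy in architecture review tooling. A sketch that mirrors the diagram branch for branch; `choose_compute` and its parameter names are hypothetical, and real decisions obviously weigh more than two questions.

```python
def choose_compute(workload, needs_orchestration=False, persistent=True):
    """Mirror the service selection flowchart: workload type, then one follow-up."""
    if workload == "functions":
        return "Lambda"
    if workload == "containers":
        return "EKS" if needs_orchestration else "ECS/Fargate"
    if workload == "vms":
        return "EC2" if persistent else "AWS Batch"
    raise ValueError(f"unknown workload type: {workload!r}")


if __name__ == "__main__":
    print(choose_compute("containers"))
    print(choose_compute("vms", persistent=False))
```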

Cost Optimization

Design Principles:

Right Sizing - Match instance size to workload (don't over-provision)
Elasticity - Auto-scale down during off-peak hours
Pricing Models - Reserved Instances (up to 72% off), Spot (up to 90% off)
Managed Services - RDS often costs less overall than self-managed EC2 databases once ops time is counted
Cost Allocation - Tag everything for chargeback/showback

Cost Saving Strategies

Strategy	Savings (up to)	Best For
Reserved Instances (1yr)	40%	Predictable workloads
Reserved Instances (3yr)	72%	Long-term commitments
Spot Instances	90%	Fault-tolerant, flexible
Savings Plans	72%	Flexible compute usage
S3 Intelligent-Tiering	70%	Infrequently accessed data
Lambda vs EC2	80%	Low-traffic APIs
Graviton Instances	40%	ARM-compatible workloads

Operational Excellence

Design Principles:

Operations as Code - Infrastructure as code, runbooks as code
Frequent, Small Changes - Reduce blast radius of failures
Refine Operations - Learn from failures, improve processes
Anticipate Failure - Chaos engineering, game days
Learn from Failures - Post-mortems without blame

Operational Metrics

MTTR - Mean Time To Recovery (target: <1 hour)
Change Failure Rate - Failed changes / total changes (target: <15%)
Deployment Frequency - Daily for high-performing teams
Lead Time - Code commit to production (target: <1 day)
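
Two of these metrics are simple to compute once deployments and incidents are recorded somewhere. A minimal sketch, assuming records shaped as plain dicts; the field names (`failed`, `started`, `resolved`) and both function names are hypothetical.

```python
from datetime import datetime, timedelta


def change_failure_rate(deployments):
    """Failed changes divided by total changes (0.0 when there are none)."""
    if not deployments:
        return 0.0
    failed = sum(1 for deployment in deployments if deployment["failed"])
    return failed / len(deployments)


def mean_time_to_recovery(incidents):
    """Average of (resolved - started) across incidents."""
    if not incidents:
        return timedelta(0)
    total = sum((i["resolved"] - i["started"] for i in incidents), timedelta(0))
    return total / len(incidents)


if __name__ == "__main__":
    deployments = [{"failed": False}, {"failed": False}, {"failed": True}, {"failed": False}]
    incidents = [
        {"started": datetime(2026, 1, 5, 10, 0), "resolved": datetime(2026, 1, 5, 10, 30)},
        {"started": datetime(2026, 1, 20, 14, 0), "resolved": datetime(2026, 1, 20, 15, 30)},
    ]
    print(f"Change failure rate: {change_failure_rate(deployments):.0%}")
    print(f"MTTR: {mean_time_to_recovery(incidents)}")
```

Compare the results against the targets above: under 15% for change failure rate, under an hour for MTTR.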

Community Pulse

Who to Follow

Twitter/X:

@awscloud - Official updates, new service launches
@QuinnyPig - Corey Quinn, AWS cost optimization, hilarious roasts
@ben11kehoe - Serverless expert, AWS Community Builder
@nathankpeck - AWS Principal Dev Advocate, ECS/containers expert
@jeremy_daly - Serverless champion, great technical insights
@neiltheblue - Solutions Architect, hands-on tutorials
@jeffbarr - AWS Chief Evangelist (Jeff Barr)

YouTube/Streamers:

AWS Online Tech Talks - Deep dives, re:Invent sessions
FooBar Serverless - Serverless tutorials
Be A Better Dev - Practical AWS projects
TechWorld with Nana - DevOps + AWS tutorials
AWS Events - Conference talks, workshops

Active Communities

r/aws - 250k+ members, active daily, mix of beginner + advanced (best community)
AWS Community Discord - Official, helpful, core team present
Dev.to #aws - Quality tutorials, case studies, community posts
AWS Community Builders - Official program, great networking
AWS re:Post - Official Q&A (replaces old forums)
ServerlessLand Community - Serverless-focused, active Slack/Discord

Podcasts & Newsletters

Podcasts:

AWS Podcast - Official, weekly, new features + customer stories
Screaming in the Cloud - Corey Quinn, hilarious, critical of AWS (in a good way)
AWS TechChat - Technical deep dives
AWS Morning Brief - Short daily AWS news

Newsletters:

Last Week in AWS - Weekly, irreverent, critical analysis (subscribe now)
Off-by-none - Serverless newsletter, Jeremy Daly, quality content
AWS Week in Review - Official blog, weekly updates
AWS Open Source News - Open source projects on AWS

Events & Conferences

AWS re:Invent - Las Vegas, late November, 50k+ attendees, $2k+ (worth it once in career)
AWS Summit - Free, regional (20+ cities), good for networking
AWS Community Day - Free, community-organized, worldwide, quality talks
ServerlessDays - Free/cheap, serverless-focused, technical talks

Worth Checking

Official Docs

AWS Documentation
AWS CLI Reference
AWS Architecture Center
Well-Architected Framework

Hands-on Practice

AWS Free Tier
AWS Workshops
Qwiklabs AWS Track
LocalStack

Code Examples

Awesome AWS
AWS Samples (GitHub)
Serverless Examples
CDK Patterns
AWS CDK Examples

Deep Dives

AWS Well-Architected
Last Week in AWS Blog
AWS Heroes Blogs
AWS This Week Newsletter
Corey Quinn's Blog

Tools & CLIs

AWS CLI v2
AWS CDK (Infrastructure as Code)
Serverless Framework
AWS SAM
AWS Toolkit for VSCode
Steampipe (SQL for AWS APIs)

News & Updates

AWS What's New
AWS Blog
r/aws Subreddit
Hacker News AWS
AWS Status Dashboard

Last Updated: 2026-01-31 | Vibe Check: Mainstream - AWS is the default cloud. Not the coolest kid anymore (Vercel/Railway have better DX), but runs most production workloads. If you're doing cloud professionally, you're learning AWS.

Tags: aws, cloud