Course Outline
Foundations of Cloud Operations on AWS
- Operational roles and responsibilities in the cloud
- AWS account structure, organizations, and multi-account strategy
- Core operational services: CloudWatch, CloudTrail, AWS Config
Infrastructure as Code and Provisioning
- Principles of IaC and immutable infrastructure
- Provisioning with Terraform and AWS CloudFormation
- Managing state, modules, and environment promotion
CI/CD and Deployment Strategies
- Designing CI/CD pipelines for cloud-native apps
- Blue/green, canary, and rolling deployments
- Automating rollback, health checks, and release validation
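As a small illustration of the canary and automated-rollback topics above, the sketch below simulates a health-gated rollout in plain Python. The function name `run_canary`, the traffic steps, and the `is_healthy` callback are hypothetical and exist only for this example; a real pipeline would shift traffic and read health signals through its deployment tooling.

```python
from typing import Callable, Sequence, Tuple


def run_canary(
    steps: Sequence[int],
    is_healthy: Callable[[int], bool],
) -> Tuple[str, int]:
    """Walk through traffic percentages; roll back on the first failed check.

    Returns a (status, final_traffic_percent) pair.
    """
    # Steps must be strictly increasing and finish at full traffic.
    if not steps or list(steps) != sorted(set(steps)) or steps[-1] != 100:
        raise ValueError("steps must be strictly increasing and end at 100")

    for percent in steps:
        # A real pipeline would update traffic weights here (assumption),
        # then wait for metrics before evaluating health.
        if not is_healthy(percent):
            return ("rolled_back", 0)
    return ("promoted", 100)


if __name__ == "__main__":
    # Hypothetical health check that starts failing at 50% traffic.
    print(run_canary([10, 50, 100], lambda percent: percent < 50))
```

Keeping the health check as a callback lets the same loop be exercised with a stub during the course and with a real metrics query later.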
Monitoring, Observability, and Alerting
- Metrics, logs, and traces: shipping, storing, and analyzing telemetry
- Using CloudWatch, X-Ray, and third-party observability tools
- Defining SLOs/SLIs, alerting policies, and on-call practices
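To make the SLO/SLI topic above concrete, here is a minimal sketch of error-budget arithmetic for a request-based availability SLO. The function name and the 99.9% target are assumptions chosen for illustration, not values prescribed by the course.

```python
def error_budget_remaining(
    slo_target: float,
    total_requests: int,
    failed_requests: int,
) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means no budget used; 0.0 means fully spent; negative means the
    SLO has been breached for this window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # Failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Assumed example: 99.9% target, one million requests, 250 failures.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"Error budget remaining: {remaining:.0%}")
```

An alerting policy could, for example, page when the remaining budget drops quickly rather than on every individual failure.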
Security Operations and Identity Management
- IAM best practices, least privilege, and cross-account access
- Secrets management, KMS, and secure parameter stores
- Operational security: patching strategies, vulnerability scanning, and audit trails
Resilience, Backup, and Disaster Recovery
- Designing for fault tolerance and high availability
- Backup strategies, snapshot automation, and restore procedures
- Disaster recovery planning and runbook creation
Cost Optimization and Governance
- Cost visibility: billing, tagging, and cost allocation strategies
- Rightsizing, Reserved Instances and Savings Plans, and budgeting controls
- Governance: policies, guardrails, and automation for compliance
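The tagging and guardrail topics above can be sketched as a simple compliance check. The required tag keys and the `missing_tags` helper are hypothetical; an actual policy would define its own keys and would typically be enforced through the organization's governance tooling.

```python
from typing import Dict, Iterable, List

# Hypothetical tagging policy used only for this example.
REQUIRED_TAGS = ("owner", "environment", "cost-center")


def missing_tags(
    resource_tags: Dict[str, str],
    required: Iterable[str] = REQUIRED_TAGS,
) -> List[str]:
    """Return the required tag keys that are absent or have an empty value.

    Keys are compared case-insensitively; this is a choice made for the
    sketch, not a rule of any particular platform.
    """
    present = {
        key.lower() for key, value in resource_tags.items() if value.strip()
    }
    return [tag for tag in required if tag.lower() not in present]


if __name__ == "__main__":
    tags = {"Owner": "platform-team", "environment": ""}
    print(missing_tags(tags))
```

A check like this could run in a pipeline stage so that untagged resources are flagged before they reach cost reports.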
Containers, Serverless, and Runtime Operations
- Operational considerations for ECS, EKS, and Lambda
- Service discovery, autoscaling, and resource limits
- Logging, tracing, and debugging containerized workloads
Incident Response, Playbooks, and Chaos Engineering
- Runbook-driven incident response and postmortem practices
- Automating remediation and self-healing patterns
- Introduction to chaos experiments for validating resilience
Hands-on Workshop: Operate a Sample Workload
- Deploy a sample application using IaC and a CI/CD pipeline
- Implement monitoring, alerts, and an automated remediation script
- Simulate incidents and practice runbook-based response
Summary and Next Steps
Requirements
- A basic understanding of cloud concepts and networking
- Familiarity with Linux command line and scripting
- Experience with source control (Git) and basic CI/CD concepts
Audience
- Cloud operations engineers
- SREs and platform engineers
- DevOps engineers and technical team leads