In today’s post, I’m going to cover blue / green deployment on AWS and share some notes about building CI / CD pipelines on AWS. Since this is my summary of the AWS whitepapers, the post isn’t necessarily written entirely in English.
Blue / Green Deployment
Blue / Green: Release apps by shifting traffic between two identical environments running different versions of the app
- Near zero downtime
- Easy rollback (revert traffic back to the still-operating blue)
- Isolation between blue & green application
🧡 Goal of B / G: Achieve Immutable Infrastructure (no changes are made to the application after deployment)
B / G with AWS services
1) Route 53
- DNS, classic approach
- Direct traffic by updating DNS records, set shorter TTL
2) ELB
- Health check against EC2 resources, increase fault tolerance
3) Auto Scaling
- Enable B/G: Attach different versions of the launch configuration to AS group
- Add ELB: balance traffic across EC2 instances running in AS groups
- Standby state / termination policies: quick rollback
4) Elastic Beanstalk (EB)
- EB supports AS & ELB for B/G
- Run multiple versions by swapping environment URLs
5) OpsWorks
- Based on Chef
- Simplifies cloning entire stack
6) CloudFormation (CF)
- Describe the AWS resources you need in JSON / YAML templates
- Provision B/G, switch traffic via Route 53 / ELB
- Infrastructure as code: Version control & CI
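To make the "switch traffic via Route 53" bullet concrete, here is a minimal sketch of a CloudFormation template built as a Python dict and dumped to JSON. The function name `record_template`, the parameter name `LiveEndpoint`, the resource name `AppRecord` and the example.com names are my own placeholders, not something from the whitepaper. The idea is that a stack update with a different `LiveEndpoint` value repoints the record from blue to green.

```python
import json


def record_template(zone_name, record_name):
    """Return a minimal CloudFormation template with one DNS record.

    The record points at whatever the (hypothetical) LiveEndpoint
    parameter holds, so switching blue -> green is a parameter change.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "LiveEndpoint": {
                "Type": "String",
                "Description": "DNS name of the environment that receives traffic",
            }
        },
        "Resources": {
            "AppRecord": {
                "Type": "AWS::Route53::RecordSet",
                "Properties": {
                    "HostedZoneName": zone_name,
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": "60",  # short TTL so clients pick up the switch quickly
                    "ResourceRecords": [{"Ref": "LiveEndpoint"}],
                },
            }
        },
    }


template_json = json.dumps(record_template("example.com.", "app.example.com."), indent=2)
print(template_json)
```

Because the template is plain text, it can live in version control next to the application code, which is the point of the last bullet.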
7) CloudWatch (CW)
- Collect & track metrics
- Collect & monitor logs
- Set alarms
- System-wide visibility into resource utilization, application performance & operational health
B / G Technique
1) Update DNS routing with Route 53
- DNS routing through record updates (Aliases)
- Express endpoint into the environment as a DNS name / IP
- Can do a weighted distribution (gradual shift with Route 53), define traffic percentage (canary analysis)
- Rollback by updating DNS record, to shift traffic back to blue (TTL, how long clients cache query results)
- Applies to:
  - Public / Elastic IP, or expose IP / DNS endpoint
  - EC2 Instances / ECS clusters behind ELB, or in AS groups with ELB as frontend
  - EB web tiers
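The weighted distribution above can be sketched as a function that builds the `ChangeBatch` payload for two weighted records. This is only a sketch: the function name `weighted_shift_batch`, the `blue` / `green` set identifiers and the example.com endpoints are my assumptions, and I use plain CNAME records for simplicity (alias records to an ELB look a bit different and carry no TTL).

```python
def weighted_shift_batch(record_name, blue_dns, green_dns, green_percent, ttl=60):
    """Build a Route 53 ChangeBatch sending green_percent of traffic to green."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")

    def change(set_id, target, weight):
        # One weighted record per environment; Route 53 splits traffic
        # in proportion to weight / sum of weights.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,  # short TTL keeps rollback fast
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"blue/green shift: {green_percent}% to green",
        "Changes": [
            change("blue", blue_dns, 100 - green_percent),
            change("green", green_dns, green_percent),
        ],
    }


# Canary step: 10% of traffic to green, 90% stays on blue.
batch = weighted_shift_batch(
    "app.example.com.", "blue-elb.example.com", "green-elb.example.com", 10
)
for c in batch["Changes"]:
    rrs = c["ResourceRecordSet"]
    print(rrs["SetIdentifier"], rrs["Weight"])
```

With boto3 the payload would go to `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`. Rollback is the same call with `green_percent=0`.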
2) Swap AS groups behind ELB
ELB: health check (New instances auto added to the LB pool, if they pass health check)
AS: replace unhealthy instances
Health check occurs at configurable intervals
Deploy: Attach green group to LB, put blue in Standby state
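A rough sketch of the deploy step above. The function name `swap_groups` is hypothetical, and the `autoscaling` argument is assumed to be a boto3 Auto Scaling client (`boto3.client("autoscaling")`) passed in by the caller. It uses the Classic Load Balancer call; with target groups you would use `attach_load_balancer_target_groups` instead. It also skips waiting for green to pass health checks and assumes the blue group's minimum size allows its desired capacity to drop.

```python
def swap_groups(autoscaling, elb_name, blue_group, green_group):
    """Attach green to the load balancer, then put blue's instances in Standby.

    Returns the blue instance IDs that were moved to Standby.
    """
    # 1. Attach the green group; its instances join the LB pool
    #    once they pass the health check.
    autoscaling.attach_load_balancers(
        AutoScalingGroupName=green_group, LoadBalancerNames=[elb_name]
    )

    # 2. Find the blue instances that are currently in service.
    groups = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[blue_group]
    )
    instance_ids = [
        inst["InstanceId"]
        for group in groups["AutoScalingGroups"]
        for inst in group["Instances"]
        if inst["LifecycleState"] == "InService"
    ]

    # 3. Put them in Standby: they stop serving traffic but keep running,
    #    so rollback only needs to bring them back into service.
    if instance_ids:
        autoscaling.enter_standby(
            InstanceIds=instance_ids,
            AutoScalingGroupName=blue_group,
            ShouldDecrementDesiredCapacity=True,
        )
    return instance_ids
```

Taking the client as a parameter keeps the function easy to try against a stub before pointing it at a real account.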
3) Update AS group Launch configurations
A launch configuration: AMI ID (Amazon Machine Image), instance type, key pair, security groups, etc.
An AS group is associated with only one launch config at a time, and a launch config can’t be modified after you create it
Change launch config: Replace existing config with a new one
Default termination policy: Remove instances with oldest launch config
Deploy
- Update AS group with new launch config
- Scale AS group to 2× its original size
- Shrink AS group back to original size (instances with old configs are removed)
Instances with standby state: quick rollback
- Update AS group with old launch config
- Do the steps above in reverse
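The deploy steps above are easier to see as a small simulation. This is a toy model, not AWS code: `Instance`, `scale_out` and `scale_in` are names I made up, and the termination policy is simplified to "oldest launch config first" (the real default policy looks at more than that).

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    launch_config: str


def scale_out(group, launch_config, target_size):
    """Launch instances with the group's current launch config up to target_size."""
    new = [
        Instance(f"{launch_config}-{n}", launch_config)
        for n in range(target_size - len(group))
    ]
    return group + new


def scale_in(group, target_size, config_order):
    """Shrink to target_size, terminating instances with the oldest config first.

    config_order maps a launch config name to its creation order
    (higher number = newer config).
    """
    newest_first = sorted(
        group, key=lambda inst: config_order[inst.launch_config], reverse=True
    )
    return newest_first[:target_size]


# Blue fleet: three instances on launch config v1.
blue = [Instance(f"v1-{n}", "v1") for n in range(3)]
config_order = {"v1": 1, "v2": 2}

doubled = scale_out(blue, "v2", 2 * len(blue))        # 3 x v1 + 3 x v2
final = scale_in(doubled, len(blue), config_order)    # v1 instances are removed
print([inst.launch_config for inst in final])  # ['v2', 'v2', 'v2']
```

Rollback is the same two steps with the old launch config attached to the group again.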
4) Swap Environment of EB application
- In-place update on existing instances (downtime during update)
- Immutable deployment using new instance sets
- Swap Environment URLs from Actions
- EB performs a DNS switch
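The URL swap can also be done from code instead of the console's Actions menu. A minimal sketch, assuming `eb` is a boto3 Elastic Beanstalk client (`boto3.client("elasticbeanstalk")`); the function name `swap_environment_urls` and the returned dict are my own additions.

```python
def swap_environment_urls(eb, blue_env, green_env):
    """Swap the CNAMEs of two Elastic Beanstalk environments.

    After the swap, the URL that pointed at blue_env points at green_env.
    Calling the function again with the same arguments swaps them back,
    which is the rollback.
    """
    eb.swap_environment_cnames(
        SourceEnvironmentName=blue_env,
        DestinationEnvironmentName=green_env,
    )
    return {"now_live": green_env, "rollback_to": blue_env}
```

Since this is a DNS switch, clients may keep hitting the old environment until their cached record expires.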
5) Clone a Stack in OpsWorks & Update DNS
Stacks: logical grouping of AWS resources, one or more layers
Deploy: Update DNS records to point to green (stack’s LB)
Other
Best Practice for Data Sync & Schema Change
- Decoupling schema change from code change
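- Additive changes (e.g. adding a column): schema is changed first, before the code deployment
- Deletive changes (e.g. removing a column or table): schema is changed last, after the code deployment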
- Need to consider state (DB contains much state, but comparatively little logic & structure)
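Here is a small illustration of why an additive change can go first, using an in-memory SQLite database from the standard library. The `users` table and the `blue_insert` / `green_insert` functions are invented for the example; the point is only that the old (blue) code keeps working after the schema gains a nullable column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def blue_insert(conn, name):
    """Old (blue) code: only knows about the name column."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def green_insert(conn, name, email):
    """New (green) code: also writes the new email column."""
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


# Additive schema change, applied BEFORE green is deployed.
# The column is nullable, so blue's inserts stay valid.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

blue_insert(conn, "alice")                     # blue still works on the new schema
green_insert(conn, "bob", "bob@example.com")   # green uses the new column

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)  # [('alice', None), ('bob', 'bob@example.com')]

# A deletive change (dropping a column blue still reads) would go the other
# way round: only after blue is retired and no running code depends on it.
```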
When NOT to use B / G
- Introduce additional points of failure
- Schema change is too complex, problem with data sync
CI / CD on AWS
1. CodeStar
Rapidly orchestrate an end-to-end software release workflow (pipeline)
2. Tests
Unit tests should make up the bulk of testing strategy (70%)
Staging Phase (Full environments are created to mirror real production environment)
- Integration test (interface between components)
- Component test (message passing between components)
- System test (end-to-end)
- Performance test (load / stress / spike tests)
- Compliance test
- User Acceptance Test (UAT, e2e business flow)
Production: Canary test
3. Build the Pipeline
- CI / CD stages: Source, build, staging, production
- buildspec.yml
4. Deployment methods
Except for deploy in place, the other four methods all have near-zero downtime
1) Deploy in place
- All at once
- Downtime during updates
- Deploy: existing instances
- Rollback: Redeploy
2) Rolling
- Single batch out of service
- Deploy: existing instances
- Rollback: Redeploy
- Variation: Canary release
3) Rolling with additional batch
- Beanstalk ONLY
- Deploy: new & existing instances
- Rollback: Redeploy
4) Immutable
- Deploy: new instances
- Rollback: Redeploy
5) Blue / Green
- Deploy: new instances
- Rollback: Switch back to old environment
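To compare the rolling variants above, here is a toy calculation of how much capacity keeps serving during a deployment. It is plain Python with made-up names (`rolling_batches`, `min_capacity_during_deploy`) and assumes the simplest model: one batch is out of service at a time, and "additional batch" means an extra batch of new instances is launched first.

```python
def rolling_batches(instances, batch_size):
    """Split instances into the batches a rolling deployment updates one at a time."""
    return [instances[i:i + batch_size] for i in range(0, len(instances), batch_size)]


def min_capacity_during_deploy(fleet_size, batch_size, additional_batch=False):
    """Lowest number of instances still serving traffic during the deployment."""
    if additional_batch:
        # An extra batch is launched up front, so the fleet never drops
        # below its original size.
        return fleet_size
    # Plain rolling: one batch is out of service while it is updated.
    return fleet_size - min(batch_size, fleet_size)


fleet = ["i-1", "i-2", "i-3", "i-4", "i-5"]
print(rolling_batches(fleet, 2))                       # [['i-1', 'i-2'], ['i-3', 'i-4'], ['i-5']]
print(min_capacity_during_deploy(len(fleet), 2))       # 3
print(min_capacity_during_deploy(len(fleet), 2, True)) # 5
```

Setting the batch size equal to the fleet size gives the all-at-once case: zero instances serving, which is the downtime noted under deploy in place.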
Best Practices
- Infrastructure as code, pipeline as code
- No long-running feature branches
- Build unit tests toward 100% code coverage; unit tests should make up about 70% of overall testing
- Role-based security