AWS - 05 Serverless on AWS

In this post, we’re going to talk about running serverless applications on AWS. Below are some of the points from AWS whitepapers.

Optimize Economics with Serverless

Event-driven computing → FaaS → Serverless FaaS (Lambda)

Function as the unit of deployment & execution.

Serverless

Functional & microservice approach, where business logic is triggered only when required
Async, event-based
Lambda: Receive events / client invocation, then instantiate & run the code
Scale automatically, built-in fault tolerance

Serverless use cases

Web / mobile backends
Media & log processing (compute-heavy workloads)
Automation (functions can be attached to alarms & monitors)
Near real-time streaming data processing
- Big data: parallelize with a serverless approach
- Low latency: moving serverless event handling to the internet edge (Lambda@Edge)

SAM

Open specification / blueprint. Application modeling framework
Orchestration & state management (serverless is stateless)
Manage all the steps in the SDLC
Deploy SAM apps with CF (SAM is specification, CF is implementation)
Edge locations: Key to low-latency serverless computing
Role-based & access-based permissions, and API-based authentication & access control

Lambda

Lambda@Edge is available in all edge locations
Auto retries for async & ordered events
DLQ: capture events that weren’t processed successfully
End user: Cognito

Serverless components

Developer tools
- CF for deployment
- X-Ray for diagnostics (cross-service request tracing & performance analysis)
- CW & CW Logs for monitoring
Orchestration
- Step Functions (create long-running workflows, state machines)
- CW Events (respond to events)
Streaming data:
- Kinesis Streams: near real-time analytics engine
- Kinesis Firehose: With Lambda
Compute: Lambda & Lambda@Edge (cloud logic layer)
API Proxy: API Gateway (HTTP endpoints)
Database: DynamoDB
Storage: S3 (changes to an object can automatically trigger a Lambda function)

Construct serverless application

S3: static content
Lambda & API Gateway: Dynamic API requests
DynamoDB: store session & user state
Cognito: end-user registration, authentication (user pool), and access control to resources (identity pool)
SAM: describe elements of the app
CodeStar: CI / CD pipeline
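
To make the "Lambda & API Gateway" piece concrete, here is a minimal sketch of a handler behind an API Gateway Lambda proxy integration. It assumes the proxy event shape (query parameters under `queryStringParameters`) and the proxy response shape (`statusCode`, `headers`, string `body`); the `name` parameter and the greeting are hypothetical.

```python
import json


def handler(event, context):
    """Minimal handler for an API Gateway Lambda proxy integration."""
    # With proxy integration, query parameters arrive under
    # "queryStringParameters" (None when the request has none).
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter

    # API Gateway expects a status code, headers, and a *string* body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Session or user state would go to DynamoDB rather than being kept in the function, since the function itself is stateless.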

Data processing (Lambda itself is stateless)

Lambda & Kinesis
Lambda & S3: trigger computation in response to object creation / update events
Step Functions: stateful long-running workflows

Serverless & Lambda

Lambda

FaaS: Build reactive, event-driven system
Multiple, simultaneous events: Run more copies of the function in parallel
Lambda executes in a container (sandbox) that isolates it from other functions
Lambda also provides a RESTful API, which can directly invoke a Lambda function

Lambda function

Code
Configuration
Event sources (detect events & invoke function, e.g. API Gateway, SNS)

Run the code package

Download from S3 bucket
Install in the Lambda runtime environment (based on Amazon Linux AMI)
Invoke as needed

🧡 Handler

The specific method in your code where execution starts. Specify the handler when you create the Lambda function
Handler can call other methods & functions within the files & classes you’ve uploaded
Event object
Context object: allows function code to interact with the Lambda execution environment
- AWS RequestId
- Remaining time
- Logging: Stream log statements to CW logs
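
A small sketch of a Python handler that uses the event and context objects described above. `aws_request_id` and `get_remaining_time_in_millis()` are the attribute names in the Lambda Python runtime; `FakeContext` is a hypothetical stand-in so the handler can be run locally.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    # Anything written through the logger ends up in CW Logs.
    logger.info(
        "request %s, %d ms left",
        context.aws_request_id,
        context.get_remaining_time_in_millis(),
    )
    return {"request_id": context.aws_request_id, "keys": sorted(event)}


class FakeContext:
    """Hypothetical stand-in for the Lambda context, for local runs only."""

    aws_request_id = "test-request-id"

    def get_remaining_time_in_millis(self):
        return 3000
```

Locally, `handler({"b": 1, "a": 2}, FakeContext())` exercises the same code path Lambda would call.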

Lambda: Statelessness & Reuse

Warm container: already active, invoked before (faster code execution)
Cold start: create & invoke for the first time (slower)
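
One way to see container reuse is to keep a counter at module scope. This is only a sketch: module-level code runs once per container, so the counter survives across warm invocations of the same container but is not shared between containers, and Lambda gives no guarantee about how long a container lives.

```python
# Module scope: executed once, when the container is created (cold start).
_invocations = 0


def handler(event, context):
    global _invocations
    _invocations += 1
    return {
        # True only for the first invocation served by this container.
        "cold_start": _invocations == 1,
        "invocations_in_container": _invocations,
    }
```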

Event Sources

Invocation patterns

Push Model (Lambda is passive: the event source invokes the function)
Pull Model (Lambda is active: it reads from the event source)
- Polls data source, batching new records together in a single function invocation

Lambda functions can be invoked synchronously or asynchronously. The mode is selected with the InvocationType parameter, which has 3 possible values:

RequestResponse: Sync
Event: Async
DryRun: Test the request (parameters & permissions) without actually executing the function
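
A sketch of choosing the invocation type from code. With boto3 the real call would be `boto3.client("lambda").invoke(...)` with the `FunctionName`, `InvocationType` and `Payload` parameters; here the client is passed in, and `RecordingClient` is a hypothetical stand-in so the sketch runs without AWS. The `mode` names and the function name are made up for illustration.

```python
import json

# Map friendly mode names (hypothetical) to the three InvocationType values.
_INVOCATION_TYPES = {"sync": "RequestResponse", "async": "Event", "dry-run": "DryRun"}


def invoke(client, function_name, payload, mode="sync"):
    """Invoke a function through a boto3-style Lambda client."""
    return client.invoke(
        FunctionName=function_name,
        InvocationType=_INVOCATION_TYPES[mode],
        Payload=json.dumps(payload).encode("utf-8"),
    )


class RecordingClient:
    """Stand-in for a Lambda client: records requests, returns a status code."""

    _STATUS = {"RequestResponse": 200, "Event": 202, "DryRun": 204}

    def __init__(self):
        self.requests = []

    def invoke(self, **kwargs):
        self.requests.append(kwargs)
        return {"StatusCode": self._STATUS[kwargs["InvocationType"]]}
```

An async call returns as soon as the event is queued, so the caller only learns that the request was accepted, not the function's result.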

Push model event source (Trigger Lambda)

S3 (Async)
API Gateway (Async / sync)
- Sync: API as Lambda proxy
- Async: API as AWS service proxy (return immediately with empty response)
SNS (Async): Automated response to CW alarms
CF (Sync)
CW Events (Async): AWS services publish resource state changes to CW Events (for event-driven ops automation)
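
As an example of a push-model source, here is a sketch of a handler for S3 notifications. It assumes the standard S3 event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with the key URL-encoded); the bucket and key in the usage line are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    objects = extract_objects(event)
    for bucket, key in objects:
        print(f"new object: s3://{bucket}/{key}")
    return {"processed": len(objects)}
```

For instance, a record with bucket `uploads` and key `photos/my+cat.jpg` is reported as `s3://uploads/photos/my cat.jpg`.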

Pull model event source (Lambda polls them, all are sync)

DynamoDB (Sync)
- Workflows triggered as changes occur in a DynamoDB table
- Replicate DynamoDB table to another Region
Kinesis Streams (Sync): Real-time data processing
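
For the pull model, Lambda hands the function a batch of records in a single invocation. A sketch for Kinesis, assuming the standard event layout where each record carries base64-encoded data under `kinesis.data`; the JSON payload with an `amount` field is a hypothetical example.

```python
import base64
import json


def decode_records(event):
    """Decode the base64 payload of every Kinesis record in the batch."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event.get("Records", [])
    ]


def handler(event, context):
    items = decode_records(event)
    # "amount" is a made-up field used only to show per-batch aggregation.
    total = sum(item.get("amount", 0) for item in items)
    return {"batch_size": len(items), "total_amount": total}
```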

Lambda Config

Aliases (Versioning)

To version the Lambda functions: Aliases (Pointer to a specific Lambda version)

live / prod / active
blue / green
debug

Environment Variables (Config)

Use env var with Lambda: Separate code & config
Lambda enables user to dynamically pass data to function code
Key-value pairs, encrypted at rest
Encrypt with KMS before creating the function, store the ciphertext as the variable value
Use cases
- Log setting (INFO, DEBUG, etc)
- Dependency & database connection credentials
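
The log-setting use case could look like the sketch below. The variable name `LOG_LEVEL` is an assumption (any key works); unknown values fall back to INFO so a typo in the config does not break the function.

```python
import logging
import os

_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def log_level_from_env(env=None):
    """Read the log level from env vars (LOG_LEVEL is a hypothetical key)."""
    env = os.environ if env is None else env
    return _LEVELS.get(env.get("LOG_LEVEL", "INFO").upper(), logging.INFO)


# Configure once at module scope so warm invocations reuse the setting.
logger = logging.getLogger()
logger.setLevel(log_level_from_env())
```

Changing the variable in the function configuration then changes verbosity without touching the code package.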

IAM role

Policies can be associated with IAM roles
Assign IAM execution role to Lambda functions
Source code is decoupled from the security aspect, does not need any credential check / rotation

Function permissions

Pull model event sources ONLY
- Make sure actions are permitted
- AWS provides a set of IAM roles associated with each of the pull-based event sources

Outbound Network Connectivity

Default: the function runs in a VPC managed by Lambda, with no private connection to resources in your own VPC
VPC: Communicate via ENI (Elastic Network Interface), connect to private resources
- ENIs can be assigned security groups
- Route traffic based on the route tables of ENIs’ subnets
  If you choose VPC, you need to manage:
  - Subnets, ensure multi-AZ
  - Allocate IP addresses to each subnet
  - VPC network design
  - Cold start time increases if an invocation requires a new ENI to be created just in time

DLQ (Dead Letter Queue)

SNS topic / SQS queue
Destination for all failed invocation events
Use a DLQ if you need every Lambda invocation to complete eventually, even if execution is delayed

Timeout:

Time limit for a single invocation of a Lambda function (configurable up to 900 s; the limit was 300 s when the whitepaper was written)
Sometimes need to fail fast
Should not rely on background / async processes for critical activities
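
One way to fail fast instead of being cut off by the timeout is to check the remaining time between units of work. A sketch, assuming the context's `get_remaining_time_in_millis()` method; the 500 ms safety margin and the `CountdownContext` stand-in are hypothetical.

```python
SAFETY_MARGIN_MS = 500  # hypothetical margin; tune to your slowest unit of work


def process_items(items, context, work):
    """Process items until done or until the invocation is about to time out."""
    done = []
    for item in items:
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break  # stop cleanly and report what is left
        done.append(work(item))
    return {"done": done, "remaining": items[len(done):]}


class CountdownContext:
    """Stand-in for the Lambda context: loses 400 ms on every check."""

    def __init__(self, start_ms):
        self._ms = start_ms

    def get_remaining_time_in_millis(self):
        self._ms -= 400
        return self._ms
```

The unprocessed remainder can then be retried or re-queued rather than silently lost when the invocation is killed.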

Architecture Best Practices

Security

General

One IAM role per function (1 : 1 relations, decouple the IAM role)
Use temporary AWS credentials (SDK, manage retrieval & rotation)

For cross-account cases, grant the execution role access to the AssumeRole API within STS (Security Token Service)
Store user session data in DynamoDB / ElastiCache, to reduce latency
Secrets should only ever exist in memory, and never be logged / written to disk
VPC security: Lambda-specific subnets, NACL, route tables

Persisting secrets

Lambda env var with encryption helpers
- Pro: Directly to runtime (no latency)
- Con: Coupled to function version
EC2 Systems Manager Parameter Store (now AWS Systems Manager Parameter Store)
- Pro: Decoupled from function version
- Con: Add latency (for retrieval)
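
The retrieval latency of Parameter Store can be softened by caching the value in memory for the life of the container. A sketch, assuming a boto3-style SSM client (`get_parameter` with `Name` and `WithDecryption`); `FakeSsm`, the parameter name and the secret value are hypothetical stand-ins.

```python
# Module-level cache: survives across warm invocations, never written to disk.
_cache = {}


def get_secret(ssm_client, name):
    """Fetch a decrypted parameter once per container, then serve from memory."""
    if name not in _cache:
        response = ssm_client.get_parameter(Name=name, WithDecryption=True)
        _cache[name] = response["Parameter"]["Value"]
    return _cache[name]


class FakeSsm:
    """Stand-in for boto3.client("ssm") that counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def get_parameter(self, Name, WithDecryption):
        self.calls += 1
        return {"Parameter": {"Name": Name, "Value": "s3cr3t"}}
```

The trade-off is staleness: a rotated secret is only picked up by new containers unless the cache is given an expiry.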

API auth

API Gateway as Lambda’s event source: You are responsible for authenticating & authorizing your API clients
SigV4 authentication
Lambda Authorizer

Deployment access control

UpdateFunctionCode API call: code deployment
UpdateAlias API call: code release
Eliminate direct user access to the above APIs for any function (use automation)

Reliability

HA: Subnets have adequate IPs to support many concurrent functions
Fault tolerance: Multi-region coordinates failover across all tiers of app stack
Recovery: For async, use DLQ (store during outage, process after recovery)

Performance Efficiency

If the use case can be handled async, function latency matters much less to the caller

Use event InvocationType, or pull-based model
Allow application logic to proceed, while Lambda processes the event separately
Optimize Lambda execution time
- Resources allocation in the function configuration
- Language runtime
- The code you write (warm container reuse, minimize initial cost of cold start)
Choose optimal memory size (RAM impacts CPU time & network bandwidth)

Monitor memory usage in CW Logs

Use X-Ray to trace full lifecycle of application request, through each of its component parts.

Operational Excellence

General

Use Lambda env var to create log level var
Enable investigation with logging, use X-Ray to profile applications
Create Lambda aliases that represent operational activities such as integration testing, performance testing, debugging, etc

Metrics

Create alarm thresholds (high & low) for each Lambda function, on all provided metrics through CW
Create a custom metric by calling the required metrics API directly from the Lambda function as it executes
Capture metric with Lambda function code, and log it using provided logging mechanisms in Lambda
- Then create CW Log metric filter on the function stream, to extract the metric, and make it available in CW
- Create another Lambda as a subscription filter on the CW Log stream to push filtered log statements to another metrics solution
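
The "log it, then extract it" approach can be sketched as one JSON line per metric written to stdout, which Lambda forwards to CW Logs. The field names (`metric`, `value`, `unit`) and metric names are assumptions; a metric filter with a JSON pattern along the lines of `{ $.metric = "ItemsProcessed" }` could then pick the value out.

```python
import json
import time


def log_metric(name, value, unit="Count"):
    """Emit one metric as a single JSON log line and return that line."""
    line = json.dumps({"metric": name, "value": value, "unit": unit})
    print(line)  # stdout is captured into the function's log stream
    return line


def handler(event, context):
    started = time.perf_counter()
    result = {"items": len(event.get("items", []))}
    log_metric("ItemsProcessed", result["items"])
    elapsed_ms = round((time.perf_counter() - started) * 1000, 3)
    log_metric("HandlerDurationMs", elapsed_ms, "Milliseconds")
    return result
```

Compared with calling the metrics API inline, logging keeps the metric off the function's critical path.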

Deployment

Steps:
- Upload new function code
- Publish the new version
- Update the alias
Parallel version invocations
Deployment schedule (do not choose peak time)
Rollback
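
The three deployment steps and the rollback can be sketched against a boto3-style Lambda client (`update_function_code`, `publish_version`, `update_alias`). `RecordingClient` is a hypothetical stand-in so the sketch runs without AWS; a real pipeline may also need to wait for the code update to finish before publishing.

```python
def deploy(client, function_name, zip_bytes, alias):
    """Upload code, publish an immutable version, then move the alias to it."""
    client.update_function_code(FunctionName=function_name, ZipFile=zip_bytes)
    version = client.publish_version(FunctionName=function_name)["Version"]
    client.update_alias(FunctionName=function_name, Name=alias, FunctionVersion=version)
    return version


def rollback(client, function_name, alias, previous_version):
    """Rollback is just pointing the alias back at an earlier version."""
    client.update_alias(
        FunctionName=function_name, Name=alias, FunctionVersion=previous_version
    )


class RecordingClient:
    """Stand-in for a Lambda client: numbers versions and tracks aliases."""

    def __init__(self):
        self.version = 0
        self.aliases = {}

    def update_function_code(self, FunctionName, ZipFile):
        return {"FunctionName": FunctionName}

    def publish_version(self, FunctionName):
        self.version += 1
        return {"Version": str(self.version)}

    def update_alias(self, FunctionName, Name, FunctionVersion):
        self.aliases[Name] = FunctionVersion
        return {"Name": Name, "FunctionVersion": FunctionVersion}
```

Because callers only ever reference the alias, the release (and the rollback) is a single pointer change.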

Cost Optimization

Right sizing (the smallest memory setting is not always the cheapest: you might pay more due to longer execution time)
Distributed & async architecture (Each decoupled architecture component takes less compute time to conduct the work)
Many Lambda event sources fit well with distributed systems
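
A rough worked example of the right-sizing point. Compute cost is billed in GB-seconds (memory × duration), so a larger memory setting can be cheaper if it shortens execution enough. The price per GB-second and the two durations below are assumptions for illustration only; the sketch ignores the per-request charge and any free tier.

```python
# Assumed price per GB-second; check current Lambda pricing before relying on it.
PRICE_PER_GB_SECOND = 0.0000166667


def compute_cost(memory_mb, duration_ms, invocations):
    """Compute-only cost: GB-seconds consumed times the assumed unit price."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND


# Hypothetical measurements for the same function at two memory settings.
small = compute_cost(memory_mb=128, duration_ms=1000, invocations=1_000_000)
large = compute_cost(memory_mb=512, duration_ms=200, invocations=1_000_000)
print(f"128 MB: ${small:.2f}   512 MB: ${large:.2f}")
```

With these made-up numbers the 512 MB setting uses 0.1 GB-seconds per invocation against 0.125 for 128 MB, so it is both faster and cheaper.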

Development Best Practices

1. Infrastructure as code

CF requires large amount of JSON / yaml, so we use SAM (open specification abstraction layer on top of CF)
Use SAM & CF together

2. Local testing

SAM Local to test serverless functions & apps locally (use Docker)

3. Coding

Put business logic outside the Handler

Lambda starts execution at the handler function, which then passes the parameters (event & context) to another function that parses them into new vars / objects contextualized to your app.
Warm containers: Caching / Keep-alive / Reuse

Scope vars so that they & their contents can be reused on subsequent invocations.
Control dependencies
Fail fast
- Short timeout for external dependencies & Lambda overall timeout
Handling exceptions (for async)
- Some exceptions go to the DLQ for reprocessing
- Some are just logged
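
A sketch of "business logic outside the handler": the handler only translates the event into an app-level object and delegates. The `Order` type and the event fields (`orderId`, `quantity`, `unitPrice`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    quantity: int
    unit_price: float


def parse_order(event):
    """Turn the raw event into an object that makes sense to the app."""
    return Order(
        order_id=str(event["orderId"]),
        quantity=int(event["quantity"]),
        unit_price=float(event["unitPrice"]),
    )


def order_total(order):
    """Pure business logic: no Lambda types, easy to unit test."""
    if order.quantity <= 0:
        raise ValueError("quantity must be positive")  # fail fast on bad input
    return round(order.quantity * order.unit_price, 2)


def handler(event, context):
    # The handler stays thin: parse, delegate, shape the response.
    order = parse_order(event)
    return {"orderId": order.order_id, "total": order_total(order)}
```

Keeping `order_total` free of Lambda-specific types means it can be tested and reused without any event source.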

4. Code Management

Code repository organization (1 : 1)
- Make sure Lambda function is independently versioned & committed to
Release branches
- Correlate Lambda function deployment with incremental commits on a release branch

5. Testing

1) Unit Test

Scope all unit tests down to a single code path, within a single logical function
Focus mostly on the business logic outside the handler function
Unit test the ability to parse mock objects for the event sources
Local test automation with SAM Local
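
A sketch of unit testing the parsing of a mock event, using the standard `unittest` module. The function under test reads SNS messages from the standard SNS event layout (`Records[].Sns.Message`); the message text is made up.

```python
import unittest


def parse_sns_messages(event):
    """Return the message bodies carried by an SNS event."""
    return [record["Sns"]["Message"] for record in event.get("Records", [])]


class ParseSnsMessagesTest(unittest.TestCase):
    # Each test covers a single code path with a hand-built mock event.
    def test_single_record(self):
        event = {"Records": [{"Sns": {"Message": "disk full"}}]}
        self.assertEqual(parse_sns_messages(event), ["disk full"])

    def test_empty_event(self):
        self.assertEqual(parse_sns_messages({}), [])


def run_tests():
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseSnsMessagesTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests().wasSuccessful()` gives a pass/fail flag that a build step can check.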

2) Integration test

Integration test: test integration of the code to its dependencies in an env that mimics the live env
Create lower lifecycle version of the Lambda function

6. Continuous Delivery

CodeCommit: hosted private Git repos
CodePipeline: Declarative steps in the pipeline
CodeBuild: Build the code, run unit tests, and create code package
SAM: Integrate with CodeBuild, push code package to S3, and push new package to Lambda via CF
CodeStar: = CodeCommit + CodePipeline + CodeBuild. A CD toolchain that manages all aspects of the SDLC