Serverless Architecture Patterns: Build Scalable Apps Without Managing Servers
TL;DR: A practical guide to serverless architecture patterns. Covers AWS Lambda, Azure Functions, and Vercel Functions with event-driven, fan-out, and saga patterns. Includes strategies for cold starts, cost optimization, and production deployment.
Serverless computing removes cloud infrastructure management from your development workflow. You write functions, deploy them, and the platform handles scaling, patching, and availability. No servers to provision, no capacity planning, no 3 AM pages about disk space.
But serverless is not just "deploy a function." Production systems require architectural patterns that handle failures, coordinate workflows, and control costs. This guide covers the patterns that work — and the traps that catch teams who skip the architecture step.
Serverless Platforms in 2026
The serverless landscape has matured significantly. Each platform has distinct strengths — see our Vercel vs AWS comparison for a detailed breakdown.
AWS Lambda
The most feature-complete serverless platform. Supports the widest range of event sources, runtimes, and integrations.
import { APIGatewayProxyHandlerV2 } from "aws-lambda";
export const handler: APIGatewayProxyHandlerV2 = async (event) => {
const body = JSON.parse(event.body ?? "{}");
const result = await processOrder(body);
return {
statusCode: 200,
headers: { "Content-Type": "application/json" },
body: JSON.stringify(result),
};
};
Best for: complex event-driven systems, high-throughput data processing, and organizations already in the AWS ecosystem.
Azure Functions
Deep integration with the Microsoft ecosystem. Strong support for .NET, but TypeScript support has improved significantly.
import { app, HttpRequest, HttpResponseInit } from "@azure/functions";
app.http("processOrder", {
methods: ["POST"],
authLevel: "function",
handler: async (
request: HttpRequest
): Promise<HttpResponseInit> => {
const body = (await request.json()) as OrderRequest;
const result = await processOrder(body);
return { status: 200, jsonBody: result };
},
});
Best for: enterprises using Azure AD, Teams, or other Microsoft services.
Vercel Functions
The simplest serverless experience for web applications. Functions are just files in your Next.js project.
import { NextRequest, NextResponse } from "next/server";
export async function POST(request: NextRequest) {
const body = await request.json();
const result = await processOrder(body);
return NextResponse.json(result);
}
Best for: Next.js applications, JAMstack sites, and teams that want zero infrastructure configuration.
Platform Comparison
| Feature | AWS Lambda | Azure Functions | Vercel Functions |
|---------|-----------|-----------------|------------------|
| Max execution time | 15 min | 10 min (Consumption) | 30s (Hobby) / 5 min (Pro) |
| Memory | 128MB – 10GB | 1.5GB (Consumption) | 1GB – 3GB |
| Cold start (Node.js) | 200–800ms | 500–2000ms | 50–250ms |
| Event sources | 200+ AWS services | Azure services + custom | HTTP, Cron |
| Pricing model | Per-request + duration | Per-request + duration | Per-request (included in plan) |
| Local development | SAM, SST | Azure Functions Core Tools | next dev |
Pattern 1: Event-Driven Processing
The most natural serverless pattern. Functions react to events — an upload, a database change, a message on a queue — rather than waiting for HTTP requests.
Architecture
User uploads file → S3 event → Lambda: validate & process
→ SQS: thumbnail queue → Lambda: generate thumbnails
→ DynamoDB: store metadata
→ SNS: notify user
Implementation
import { S3Event } from "aws-lambda";
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";
// Initialized once per cold start, reused across warm invocations
const sqs = new SQSClient({});
export async function handler(event: S3Event) {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // S3 event keys are URL-encoded, with spaces encoded as "+"
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
    const metadata = await extractMetadata(bucket, key);
    await storeMetadata(metadata);
    await sqs.send(
      new SendMessageCommand({
        QueueUrl: process.env.THUMBNAIL_QUEUE_URL,
        MessageBody: JSON.stringify({ bucket, key, metadata }),
      })
    );
  }
}
When to use: file processing, data pipeline triggers, notification systems, audit logging.
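One subtlety in S3 event notifications that deserves its own unit test: object keys arrive URL-encoded, with spaces encoded as `+`. A naive `decodeURIComponent` leaves the `+` in place. A small helper (hypothetical name `decodeS3Key`) isolates the decoding:

```typescript
// S3 event notifications URL-encode object keys and encode spaces as "+".
// decodeURIComponent alone would leave the "+" characters intact.
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, " "));
}
```

For example, `decodeS3Key("photos/my+cat%281%29.png")` yields `"photos/my cat(1).png"` — the key you actually need to fetch the object.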
Pattern 2: Fan-Out / Fan-In
Parallelize work by distributing tasks across multiple function invocations, then aggregate results. With enough workers, a job that runs for 10 minutes sequentially can finish in under a minute.
Architecture
API request → Orchestrator Lambda
├── Worker Lambda 1 → Result
├── Worker Lambda 2 → Result
├── Worker Lambda 3 → Result
└── Worker Lambda N → Result
Aggregator Lambda → Combined Result → Response
Implementation with Step Functions
{
"StartAt": "FanOut",
"States": {
"FanOut": {
"Type": "Map",
"ItemsPath": "$.chunks",
"MaxConcurrency": 50,
"Iterator": {
"StartAt": "ProcessChunk",
"States": {
"ProcessChunk": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123:function:process-chunk",
"End": true
}
}
},
"Next": "Aggregate"
},
"Aggregate": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123:function:aggregate-results",
"End": true
}
}
}
When to use: batch data processing, parallel API calls, map-reduce workloads, large file transformations.
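The orchestrator's first job is splitting the input into bounded chunks; the Map state then fans out one worker invocation per chunk. A sketch of that chunking step (hypothetical helper; chunk size is a tuning knob alongside `MaxConcurrency`):

```typescript
// Split items into chunks of `size`, so each worker invocation
// receives a bounded, predictable unit of work.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("chunk size must be positive");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// The Step Functions execution input then becomes:
//   { chunks: chunk(allItems, 100) }
```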
Pattern 3: Saga Pattern for Distributed Transactions
When a business operation spans multiple services — charge the payment, reserve inventory, schedule delivery — you need a way to handle partial failures. The saga pattern coordinates these steps with compensating actions for rollback.
Orchestration-Based Saga
A central orchestrator (Step Functions, Durable Functions) manages the workflow.
import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";
const sfn = new SFNClient({});
export async function handler(event: OrderEvent) {
await sfn.send(
new StartExecutionCommand({
stateMachineArn: process.env.ORDER_SAGA_ARN,
input: JSON.stringify({
orderId: event.orderId,
customerId: event.customerId,
items: event.items,
total: event.total,
}),
})
);
}
The state machine definition handles the happy path and compensations:
Reserve Inventory → Charge Payment → Schedule Delivery → Confirm Order
↓ (fail) ↓ (fail) ↓ (fail)
(no-op) Release Inventory Refund Payment + Release Inventory
Choreography-Based Saga
Each service listens for events and publishes its own. No central coordinator.
export async function handlePaymentCharged(event: PaymentEvent) {
try {
await scheduleDelivery(event.orderId, event.items);
await publishEvent("delivery.scheduled", {
orderId: event.orderId,
estimatedDate: getEstimatedDate(),
});
} catch (error) {
await publishEvent("delivery.failed", {
orderId: event.orderId,
reason: (error as Error).message,
});
}
}
Orchestration vs. Choreography:
| Aspect | Orchestration | Choreography |
|--------|--------------|--------------|
| Complexity | Central, visible workflow | Distributed, harder to trace |
| Coupling | Services know about orchestrator | Services only know events |
| Debugging | Step Functions console | Distributed tracing required |
| Flexibility | Add steps easily | Add listeners easily |
Cold Starts: Understanding and Mitigating
Cold starts happen when the platform spins up a new function instance. The latency comes from provisioning the execution environment, loading your code, and initializing your runtime.
Cold Start Duration by Runtime
| Runtime | Typical Cold Start | With Dependencies |
|---------|-------------------|-------------------|
| Node.js | 150–400ms | 300–800ms |
| Python | 200–500ms | 400–1200ms |
| Go | 50–100ms | 80–200ms |
| Rust | 30–80ms | 50–150ms |
| Java | 800–3000ms | 2000–8000ms |
| .NET | 400–1500ms | 800–3000ms |
Mitigation Strategies
1. Keep bundles small. Tree-shake aggressively. Use esbuild or tsup to bundle your function into a single file with only the code it needs.
// esbuild.config.ts
import { build } from "esbuild";
await build({
entryPoints: ["src/handlers/*.ts"],
bundle: true,
platform: "node",
target: "node20",
outdir: "dist",
minify: true,
treeShaking: true,
external: ["@aws-sdk/*"],
});
2. Initialize outside the handler. Code outside the handler function runs once per cold start and is reused across warm invocations.
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";
// Initialized once, reused across invocations
const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);
export async function handler(event: APIGatewayProxyEventV2) {
// docClient is already initialized on warm starts
const result = await docClient.send(/* ... */);
return { statusCode: 200, body: JSON.stringify(result) };
}
3. Use provisioned concurrency for latency-sensitive endpoints.
# serverless.yml (Serverless Framework)
functions:
api:
handler: dist/api.handler
provisionedConcurrency: 5
events:
- httpApi:
path: /api/{proxy+}
method: ANY
4. Choose faster runtimes for latency-critical paths. Node.js is the best balance of cold start speed and developer experience for most teams.
Cost Optimization
Serverless pricing is per-invocation and per-millisecond of compute. Small optimizations compound at scale.
Pricing Breakdown (AWS Lambda)
| Component | Cost |
|-----------|------|
| Requests | $0.20 per 1M requests |
| Duration | $0.0000166667 per GB-second |
| Free tier | 1M requests + 400,000 GB-seconds/month |
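Those two line items combine into a simple monthly estimate. A sketch using the public us-east-1 x86 rates above (free tier ignored for clarity):

```typescript
// Estimate monthly AWS Lambda cost from requests, average billed
// duration, and configured memory. Free tier not subtracted.
function monthlyLambdaCost(
  requestsPerMonth: number,
  avgDurationMs: number,
  memoryMb: number
): number {
  const requestCost = (requestsPerMonth / 1_000_000) * 0.2;
  const gbSeconds =
    requestsPerMonth * (avgDurationMs / 1000) * (memoryMb / 1024);
  const durationCost = gbSeconds * 0.0000166667;
  return requestCost + durationCost;
}
```

For example, 10M requests/month averaging 120 ms at 512 MB works out to roughly $12/month — $2 for requests and about $10 for duration.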
Cost Optimization Strategies
| Strategy | Savings | Effort |
|----------|---------|--------|
| Right-size memory (power tuning) | 20–40% | Low |
| Reduce execution time | 15–30% | Medium |
| Use ARM64 (Graviton) | 20% | Low |
| Batch processing (SQS) | 30–50% | Medium |
| Cache responses (API Gateway) | 40–60% | Low |
| Reserved concurrency limits | Prevents runaway costs | Low |
Power Tuning
AWS Lambda Power Tuning runs your function at different memory settings and finds the optimal price-performance point.
# Deploy the power tuning step function
sam deploy --template-file powertuning.yaml
# Run it against your function
aws stepfunctions start-execution \
--state-machine-arn $POWER_TUNING_ARN \
--input '{
"lambdaARN": "arn:aws:lambda:us-east-1:123:function:my-func",
"powerValues": [128, 256, 512, 1024, 2048],
"num": 50,
"payload": "{}"
}'
Often, a function configured with 512MB runs faster and costs less than one at 128MB because it gets proportionally more CPU.
Observability in Serverless
Serverless functions are ephemeral. You cannot SSH in to debug. Observability must be built into every function from day one.
Structured Logging
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { Logger } from "@aws-lambda-powertools/logger";
const logger = new Logger({
serviceName: "order-service",
logLevel: "INFO",
});
export async function handler(event: APIGatewayProxyEventV2) {
logger.addContext({ requestId: event.requestContext.requestId });
logger.info("Processing order", {
orderId: event.pathParameters?.id,
method: event.requestContext.http.method,
});
try {
const result = await processOrder(event);
logger.info("Order processed successfully", { orderId: result.id });
return { statusCode: 200, body: JSON.stringify(result) };
} catch (error) {
logger.error("Order processing failed", { error });
return { statusCode: 500, body: JSON.stringify({ error: "Internal error" }) };
}
}
Distributed Tracing
Use AWS X-Ray, Datadog APM, or OpenTelemetry to trace requests across function invocations and services.
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { Tracer } from "@aws-lambda-powertools/tracer";
const tracer = new Tracer({ serviceName: "order-service" });
export async function handler(event: APIGatewayProxyEventV2) {
const segment = tracer.getSegment();
const subsegment = segment?.addNewSubsegment("processOrder");
try {
const result = await processOrder(event);
subsegment?.close();
return { statusCode: 200, body: JSON.stringify(result) };
} catch (error) {
subsegment?.addError(error as Error);
subsegment?.close();
throw error;
}
}
Cost Surprise Catalog: Known Ways Serverless Bills Explode
Every serverless project that has run for 12+ months has at least one bill spike story. The catalog below is the short list of patterns that produce most of them — plus the guardrail that prevents each one.
| Pattern | How it happens | Guardrail |
|---------|----------------|-----------|
| Recursive S3 → Lambda → S3 loop | Lambda processes an S3 upload and writes the result back to the same bucket prefix, re-triggering itself | Write outputs to a different bucket or different prefix; set a reserved concurrency cap |
| Infinite retry on a poison message | SQS message fails, Lambda retries forever, burning invocations | Configure max receive count on SQS queue; send poison messages to DLQ after 3-5 failures |
| Unbounded fan-out | Map state over a user-controlled array triggers 100K+ Lambdas | Cap MaxConcurrency in Step Functions; validate input array length before fanout |
| Memory-over-provisioned at scale | Engineer picked 3GB memory "to be safe" on a function that needed 512MB | Run AWS Lambda Power Tuning before production; re-check quarterly |
| Log-level INFO in production | Every invocation writes 20 KB of JSON logs to CloudWatch; 100M/mo = $$$ | Set log level to WARN in prod; sample verbose logs at 1-5% |
| No timeout on external API call | Downstream API hangs; Lambda runs for the full 15 minutes at max memory | Set per-call SDK timeouts (e.g., AWS SDK requestTimeout: 3000); use overall function timeout conservatively |
| Provisioned concurrency never reduced | Set Provisioned Concurrency=50 during launch; still 50 a year later at 5% utilization | Attach CloudWatch alarm on concurrency utilization; Application Auto Scaling adjusts automatically |
| API Gateway on every static asset | Serving a 500KB image through Lambda + API Gateway instead of CloudFront | Front static assets with CloudFront, S3 signed URLs, or a CDN — never API Gateway |
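Two of those guardrails — the poison-message DLQ and the concurrency cap — are a few lines of configuration. A Serverless Framework sketch (resource names and limits are illustrative placeholders):

```yaml
functions:
  queueConsumer:
    handler: dist/consumer.handler
    reservedConcurrency: 20        # hard cap: a runaway loop cannot exceed this
    events:
      - sqs:
          arn: !GetAtt WorkQueue.Arn
          batchSize: 10

resources:
  Resources:
    WorkQueue:
      Type: AWS::SQS::Queue
      Properties:
        RedrivePolicy:
          maxReceiveCount: 5       # after 5 failed receives, move to the DLQ
          deadLetterTargetArn: !GetAtt WorkDLQ.Arn
    WorkDLQ:
      Type: AWS::SQS::Queue
```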
Budget Alerts Are Mandatory, Not Optional
Every AWS account running Lambda should have:
- A billing alert at 50% of expected monthly spend (email + Slack)
- A billing alert at 100% of expected monthly spend (page on-call)
- Per-function reserved concurrency limits on any function that could run away (S3 triggers, queue consumers, webhook handlers)
- AWS Budgets configured with anomaly detection — catches the "normally $100/day, today $12K" spikes within hours instead of at end-of-month
These four guardrails would prevent roughly 80% of the $5K-$50K overnight-bill stories teams share after the fact.
When Not to Use Serverless
Serverless is not the right answer for every workload.
| Workload | Better Alternative | Why |
|----------|-------------------|-----|
| Long-running jobs (>15 min) | ECS Fargate, Cloud Run | Lambda has a 15-minute timeout |
| Consistent high-throughput | Containers, VMs | Cheaper at sustained load |
| Stateful applications | Containers with persistent storage | Functions are stateless |
| GPU workloads | GPU instances, SageMaker | No GPU support in Lambda |
| WebSocket servers | ECS, App Runner, Fly.io | Lambda is request/response |
Getting Started
Serverless architecture removes operational overhead but introduces new design challenges. The patterns in this guide — event-driven processing, fan-out, and sagas — are battle-tested approaches that scale reliably in production.
If you are building a new application or migrating to serverless, reach out to our team. We design and implement serverless architectures on AWS, Azure, and Vercel — optimized for cost, performance, and maintainability.
Write the code. Let the platform handle the rest.
Frequently Asked Questions
When does serverless become more expensive than a VM?
Serverless crosses over to more expensive than an EC2 reserved instance around 30-50% sustained CPU utilization. At steady, high-volume traffic — millions of requests per hour, around the clock — Lambda can cost 3-5x more than a right-sized Fargate or EC2 cluster. For bursty or low-volume workloads, serverless remains cheaper; for high, steady load, containers win.
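A back-of-envelope breakeven check makes the crossover concrete. All figures are illustrative: Lambda at the public us-east-1 rates, versus a small flat-rate container service assumed at $60/month:

```typescript
// Compare Lambda's variable cost to a flat-rate container alternative
// for the same workload. All prices illustrative; free tier ignored.
const CONTAINER_MONTHLY = 60; // e.g. a small Fargate service, assumed flat

function lambdaMonthly(
  requests: number,
  durationMs: number,
  memoryMb: number
): number {
  const requestCost = (requests / 1_000_000) * 0.2;
  const gbSeconds = requests * (durationMs / 1000) * (memoryMb / 1024);
  return requestCost + gbSeconds * 0.0000166667;
}

// 5M requests/month at 100 ms / 512 MB: ~$5 — Lambda wins easily.
// 100M requests/month, same profile: ~$103 — the container wins.
```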
What's a realistic cold start penalty in 2026?
AWS Lambda cold starts are typically 100-400ms for Node.js and Python, 500-2000ms for Java and .NET. Provisioned Concurrency removes the cold start at a cost of roughly $10-30/month per always-warm function. Cold starts matter for user-facing APIs but rarely for background jobs.
How do I handle a long-running job in a serverless world?
AWS Lambda caps at 15 minutes, Cloudflare Workers at 30 seconds CPU time. For longer work, split into Step Functions orchestrations, AWS Batch, or a Fargate task triggered by Lambda. Every team that tries to stretch Lambda to 30-minute jobs regrets it when timeout retries cause duplicate processing.
What's the #1 serverless cost surprise?
Recursive invocation bugs. A misconfigured S3 trigger that writes back to the same bucket can spawn millions of Lambdas overnight and produce a $5-50K surprise bill. Every serverless project needs budget alerts, concurrency limits per function, and a deliberate review of event-source loops before production.
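The guard itself is tiny — a prefix check before processing, so the function never reacts to its own output (hypothetical `OUTPUT_PREFIX`; writing to a separate bucket is even safer):

```typescript
// Guard against S3 -> Lambda -> S3 recursion: skip any object that
// already lives under this function's own output prefix.
const OUTPUT_PREFIX = "processed/"; // hypothetical output location

function shouldProcess(key: string): boolean {
  return !key.startsWith(OUTPUT_PREFIX);
}
```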