Architecture Briefing

System Design: Steam

A tactical architecture map for a Steam-scale gaming platform: global edge delivery, identity and ownership, commerce, content pipelines, multiplayer coordination, data stores, observability, and elastic capacity controls for a billion-user footprint.

1B+ accounts | Global edge regions | Hot path cache first | Elastic surge capacity
1B User Strategy

Do not route every action to one core. Serve static catalog, media, manifests, depot chunks, and community reads from regional edge caches while keeping accounts, payments, and entitlements strongly controlled.

Scale Up

Traffic manager shifts users to healthy regions, autoscalers add API/game-service workers, queues absorb spikes, CDN capacity expands by depot popularity, and hot data moves into cache.

Scale Down

Cooldown windows drain queues, remove idle workers, lower cache prewarm, shrink batch fleets, and keep only baseline regional capacity plus standby failover.

Blast-Radius Control

Shard accounts and inventories, isolate store traffic from multiplayer traffic, roll out with feature flags, and route around bad POPs or degraded data replicas.
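As a minimal sketch of the account-sharding idea, the snippet below maps an account ID to a shard with a stable hash, so one failing shard affects only a fraction of accounts. The shard count and function names are assumptions made for illustration, not details of any real platform.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count, chosen only for the example


def shard_for_account(account_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map an account ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() of
    strings is salted per process and would not be stable across hosts.
    """
    digest = hashlib.sha256(str(account_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def group_by_shard(account_ids, num_shards: int = NUM_SHARDS):
    """Group account IDs by the shard that owns them."""
    groups = {}
    for account_id in account_ids:
        shard = shard_for_account(account_id, num_shards)
        groups.setdefault(shard, []).append(account_id)
    return groups
```

Plain modulo hashing keeps the example short, but it moves most keys when the shard count changes; a directory table or consistent hashing is a common alternative when resharding matters.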

Operation Map

Steam-Scale Platform Battlefield

Clients | Edge | Control | Data | Live routes
01

Clients and Creators

Billions of sessions start here, but most reads stay cached.

Steam Client | Store, library, downloads, friends, cloud sync | Cache manifests locally
Game Client | Steamworks SDK, auth tickets, lobbies, inventory | Short-lived session tickets
Partner Tools | Build upload, release branches, store setup | Controlled publishing lanes
02

Global Traffic Edge

Absorbs reads, routes around regional failure, sheds abusive load.

Global Traffic Manager | Geo DNS, health routing, brownout policies | Region failover
API Gateway | Auth handoff, request shaping, routing | Per-service limits
Edge Cache | Store pages, media, reviews, public metadata | Cache-first reads
Rate Limit and WAF | Abuse filters, bot throttles, DDoS controls | Protect origin
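One common way to implement the per-service limits and bot throttles listed above is a token bucket. The sketch below is an illustrative assumption rather than a description of any specific gateway; the class name and parameters are hypothetical, and time is passed in explicitly so the behavior is easy to reason about.

```python
class TokenBucket:
    """Token-bucket limiter: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are allowed
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may proceed."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A gateway would typically keep one bucket per client key (for example per account or per IP) and reject or delay requests when `allow` returns False.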
03

Trust and Commerce

Authoritative path for identity, money, licenses, and policy.

Identity Service | Login, 2FA, sessions, auth tickets | Partition by account
Catalog Service | Apps, packages, prices, localization | Read replicas + cache
Commerce Core | Cart, wallet, payments, refunds, fraud checks | Strict consistency
Ownership Ledger | Licenses, entitlements, family/device rules | Audit trail
04

Content Supply Line

Turns developer builds into regional chunks and manifests.

Build Ingest | SteamPipe uploads from developers | Background validation
Depot Manifest Service | Apps, depots, chunks, branches, beta lanes | Immutable versions
SteamPipe CDN | Depot downloads, patches, regional capacity | Scale by hot depots
Prewarm Planner | Forecasts launch spikes and seeds edge capacity | Event-driven scale up
05

Community and UGC

High-read social surfaces isolated from payment hot paths.

Community API | Profiles, groups, reviews, discussions | Eventually consistent
Workshop and UGC | Items, metadata, moderation, subscriptions | Object-backed content
Moderation Pipeline | Reports, reputation, fraud and safety signals | Async review queues
06

Multiplayer Command

Session control scales separately from real-time relay traffic.

Friends and Presence | Chat, invites, rich presence, party joins | Fanout service
Lobbies and Matchmaking | Lobby state, filters, invites, session setup | Regional pools
Server Directory | Heartbeats, browser listings, capacity | TTL-based state
SDR Control | Relay tickets, route assignment, privacy | Route quality loop
SDR Relay POPs | Packet relay, DDoS shielding, path optimization | Scale by bandwidth
07

Data Backbone

Shard writes, replicate reads, and keep hot state in memory.

Account Shards | Accounts, purchases, licenses, item state | Shard by user/account
Read Replicas | Catalog, profile, inventory and entitlement reads | Scale read QPS
Object Storage | Builds, screenshots, Workshop objects | Multi-region durable
Event Bus and Queues | Purchases, telemetry, moderation, fanout jobs | Absorb spikes
08

Elastic War Room

Decides when to add, drain, fail over, or brown out capacity.

Capacity Controller | Autoscaling, CDN prewarm, queue depth policies | Scale up/down loop
Telemetry Lake | Traffic, errors, revenue, abuse, relay quality | Real-time signals
Feature Flags | Canaries, region rollouts, kill switches | Limit blast radius
Batch and Recompute | Recommendations, reports, fraud models, cleanup | Cheap off-peak work
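A canary rollout with a kill switch can be sketched as a deterministic percentage bucket per user. The flag name, function names, and the 0 to 99 bucket range below are assumptions for illustration only.

```python
import hashlib


def rollout_bucket(flag: str, user_id: int) -> int:
    """Return a stable bucket in the range 0..99 for this flag and user."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag: str, user_id: int, rollout_percent: int, killed: bool = False) -> bool:
    """Enable the flag for roughly `rollout_percent` percent of users.

    The kill switch wins over everything else, so a bad release can be
    turned off without redeploying.
    """
    if killed:
        return False
    return rollout_bucket(flag, user_id) < rollout_percent
```

Because the bucket depends only on the flag name and user ID, a user who is in the canary at 10 percent stays in it when the rollout widens to 50 percent.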
Purchase Route

The edge gateway verifies identity and then calls the catalog and commerce services. A successful purchase commits the license to the account shards, publishes purchase events, and then warms read replicas so the game is ready to launch.
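The commit step of the purchase route has to tolerate client retries without granting or charging twice. The sketch below uses an idempotency key and an outbox list as an in-memory stand-in for a transactional account shard; all names are hypothetical, and in a real system the license write and the outbox append would share one database transaction.

```python
class PurchaseLedger:
    """In-memory stand-in for one account shard with an event outbox."""

    def __init__(self):
        self.licenses = {}   # account_id -> set of owned app IDs
        self.processed = {}  # idempotency_key -> receipt already returned
        self.outbox = []     # purchase events waiting to be published

    def purchase(self, idempotency_key: str, account_id: int, app_id: int) -> dict:
        # A retried request returns the original receipt and changes nothing.
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]

        owned = self.licenses.setdefault(account_id, set())
        if app_id in owned:
            receipt = {"status": "already_owned", "account": account_id, "app": app_id}
        else:
            owned.add(app_id)
            self.outbox.append(
                {"type": "purchase.completed", "account": account_id, "app": app_id}
            )
            receipt = {"status": "granted", "account": account_id, "app": app_id}

        self.processed[idempotency_key] = receipt
        return receipt
```

A separate publisher would drain `outbox` to the event bus, which is what lets downstream consumers warm read replicas without slowing the purchase request itself.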

Download Route

Builds become immutable depots and manifests. Popular depots are forecast, prewarmed, and served from CDN/edge capacity instead of the origin.
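To illustrate why immutable, content-addressed chunks make patches cheap to serve from cache, the sketch below splits a build into fixed-size chunks, hashes each one, and computes which chunks a client still needs. The chunk size and function names are assumptions for the example and do not describe the actual depot format.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # hypothetical 1 MiB chunk size


def build_manifest(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return the ordered list of chunk hashes that describes a build."""
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        manifest.append(hashlib.sha256(chunk).hexdigest())
    return manifest


def chunks_to_download(old_manifest: list, new_manifest: list) -> list:
    """Return hashes in the new manifest that the client does not already hold."""
    have = set(old_manifest)
    needed = []
    for chunk_hash in new_manifest:
        if chunk_hash not in have:
            needed.append(chunk_hash)
            have.add(chunk_hash)  # avoid fetching the same chunk twice
    return needed
```

Since a chunk's name is its hash, a cached chunk never needs invalidation: a changed byte range produces a new hash and therefore a new cache key.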

Multiplayer Route

Friends, lobbies, server heartbeats, and SDR ticketing coordinate the match while relay POPs independently scale bandwidth and route quality.
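Server heartbeats are naturally modeled as TTL-based state: a server that stops reporting simply ages out of the listing. The following is a minimal sketch under that assumption; the class name, TTL value, and fields are hypothetical.

```python
class ServerDirectory:
    """Registry of game servers that expire when heartbeats stop."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._servers = {}  # server_id -> (last_heartbeat_time, info)

    def heartbeat(self, server_id: str, info: dict, now: float) -> None:
        """Record or refresh a server entry."""
        self._servers[server_id] = (now, info)

    def list_live(self, now: float) -> dict:
        """Drop expired entries and return the servers still considered live."""
        expired = [
            server_id
            for server_id, (seen_at, _) in self._servers.items()
            if now - seen_at > self.ttl
        ]
        for server_id in expired:
            del self._servers[server_id]
        return {server_id: info for server_id, (_, info) in self._servers.items()}
```

No explicit deregistration is needed, which keeps the directory correct even when a game server crashes without saying goodbye.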

How It Supports 1B Users

The architecture separates ultra-hot read paths from authoritative write paths. Store pages, media, depots, manifests, profiles, reviews, and public metadata are cached at the edge. Identity, wallet, ownership, inventory and payment writes stay behind stricter control planes with partitioned account shards and audit logs.
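The read side of that split can be sketched as a read-through cache with a TTL: the origin is consulted only on a miss or after expiry. The class and the `origin_calls` counter are illustrative assumptions, not part of any real edge product.

```python
class ReadThroughCache:
    """Serve reads from cache and fall back to the origin on miss or expiry."""

    def __init__(self, load_from_origin, ttl_seconds: float):
        self._load = load_from_origin
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (expires_at, value)
        self.origin_calls = 0  # counts how often the origin was hit

    def get(self, key: str, now: float):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, origin untouched
        value = self._load(key)
        self.origin_calls += 1
        self._entries[key] = (now + self._ttl, value)
        return value
```

Authoritative writes such as purchases would bypass this path entirely and go to the account shards.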

How It Scales Up

Traffic spikes trigger more gateway workers, more regional service pools, larger relay bandwidth, wider CDN cache fill, and additional queue consumers. Launch events can prewarm depots and temporarily reserve capacity before players arrive.

How It Scales Down

After queue depth, request rate, and relay utilization drop, the capacity controller drains workers, removes cache prewarm, shifts batch jobs to cheaper windows, and returns regions to baseline plus failover reserves.
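The scale-up and scale-down behavior can be sketched as one control loop: scale up immediately when queue depth demands it, and scale down only after load has stayed low for a full cooldown window. The policy values and class names below are hypothetical and use queue depth as the only signal to keep the example short.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_workers: int = 2               # baseline capacity kept at all times
    max_workers: int = 10              # hard ceiling
    target_queue_per_worker: int = 100
    cooldown_seconds: float = 300.0    # how long load must stay low before draining


class CapacityController:
    def __init__(self, policy: ScalingPolicy, workers: int, now: float = 0.0):
        self.policy = policy
        self.workers = workers
        self.last_busy_at = now  # last time current capacity was fully needed

    def step(self, queue_depth: int, now: float) -> int:
        """Return the worker count to run after observing the queue depth."""
        p = self.policy
        wanted = math.ceil(queue_depth / p.target_queue_per_worker)
        desired = max(p.min_workers, min(p.max_workers, wanted))
        if desired >= self.workers:
            # Scale up (or hold) right away and restart the cooldown clock.
            self.workers = desired
            self.last_busy_at = now
        elif now - self.last_busy_at >= p.cooldown_seconds:
            # Load has been low for the whole cooldown window: drain.
            self.workers = desired
        return self.workers
```

The asymmetry is deliberate: adding capacity late hurts users, while removing it late only costs money, so only the scale-down path waits.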

Vercel Strategy

Vercel turns Git commits into immutable deployments, then routes global traffic through an edge network that can serve cached static assets, invoke serverless compute, or stream dynamic responses.

Scale Up

Popular pages move to cache, serverless/Fluid Compute workers scale by request load, ISR and data caches reduce origin pressure, and deployments stay immutable so new versions do not mutate old traffic.

Scale Down

Idle compute instances drain, cache entries age out, preview deployments sit cold until requested, and traffic can shift back to stable production aliases without rebuilding.

Release Safety

Preview URLs, atomic aliases, instant rollback, rolling releases, protection checks, observability, and firewall controls keep experiments isolated from production users.

Cloud Operation Map

Vercel Platform Architecture

Developers/users | Edge network | Build/deploy | Data/cache | Live routes
V1

Developer Inputs

Source changes enter through Git, CLI, APIs, and project settings.

Git Provider Webhooks | GitHub, GitLab, Bitbucket pushes and pull requests | Commit-triggered deploys
Vercel CLI and API | Manual deploys, environment variables, project automation | Programmable control
Dashboard and Teams | Project config, domains, access, usage, rollbacks | Human control plane
V2

Build Control Plane

Framework detection and build isolation create immutable outputs.

Build Orchestrator | Queues jobs, selects builders, applies project settings | Elastic build fleet
Build Cache | Dependency cache, framework cache, incremental artifacts | Faster repeat builds
Build Output API | Static assets, routes, functions, ISR metadata | Platform contract
Deployment Artifact Store | Immutable assets and function bundles per deployment | Versioned release unit
V3

Release Layer

Deployments become URLs, aliases, previews, and guarded rollouts.

Deployment Registry | Preview, production, aliases, metadata, status | Immutable deployments
Domains and Routing Config | Custom domains, project aliases, route rules, redirects | Atomic traffic switch
Rollouts and Rollbacks | Instant rollback, rolling releases, checks, protection | Safe release valve
V4

Global Edge Network

First request landing zone for cache, routing, security, and compute.

Edge Router | PoP routing, TLS, domains, route matching, rewrites | Nearest healthy edge
Vercel CDN | Static assets, image outputs, ISR pages, cache keys | Cache-first delivery
Firewall and Protection | DDoS protection, WAF, bot controls, deployment protection | Block before origin
V5

Compute Plane

Dynamic requests split across edge code and regional serverless compute.

Middleware and Edge Functions | Lightweight routing, auth, personalization near users | Low-latency decisions
Functions and Fluid Compute | Server-side rendering, APIs, streaming, background work | Request-driven scale
Image Optimization | Transforms, responsive variants, cacheable outputs | Edge-cacheable media
V6

Data and Cache Plane

Platform cache and managed stores reduce function and origin work.

Data Cache and ISR | Revalidation, stale-while-revalidate, prerendered pages | Serve stale, refresh async
Edge Config | Global low-latency config and feature flags | Read at the edge
Managed Storage | Postgres, KV, Blob, external databases and APIs | Data close to compute
V7

Observability and Governance

Feedback loop for performance, reliability, cost, and security.

Logs, Traces, Analytics | Runtime logs, web analytics, speed insights, errors | Operational signal
Usage and Limits | Quotas, spend controls, concurrency, rate shaping | Cost guardrails
Access and Compliance | Teams, SSO, audit logs, environment scopes | Enterprise controls
V8

User Traffic

Requests resolve to cached assets, edge logic, or compute.

Global Users | Browsers, crawlers, APIs, mobile clients | Latency-sensitive
Preview Reviewers | PR previews, branch URLs, protected environments | Before production
External Origins | SaaS APIs, databases, object stores, legacy services | Fallback dependency

Vercel Request Path

A request enters the nearest edge and passes protection and route matching. From there it is served as a cached asset or page, handled by middleware, passed to Functions or Fluid Compute, or forwarded to a managed or external datastore. Cacheable results feed back into the CDN and data cache.

Vercel Deploy Path

Git or CLI events enqueue a build, reuse build cache, produce Build Output API artifacts, register an immutable deployment, create preview URLs, then atomically move aliases for production.
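The combination of immutable deployments and atomic aliases can be modeled in a few lines: promoting or rolling back only moves a pointer, and nothing is rebuilt. This is a conceptual sketch with hypothetical names and does not represent Vercel's API.

```python
from types import MappingProxyType


class DeploymentRegistry:
    """Immutable deployments plus aliases that point at one of them."""

    def __init__(self):
        self._deployments = {}  # deployment_id -> read-only mapping of path -> content
        self._aliases = {}      # alias -> history of deployment IDs, newest last

    def register(self, deployment_id: str, files: dict) -> None:
        if deployment_id in self._deployments:
            raise ValueError("deployments are immutable and cannot be overwritten")
        self._deployments[deployment_id] = MappingProxyType(dict(files))

    def promote(self, alias: str, deployment_id: str) -> None:
        """Point the alias at an existing deployment in a single step."""
        if deployment_id not in self._deployments:
            raise KeyError(deployment_id)
        self._aliases.setdefault(alias, []).append(deployment_id)

    def rollback(self, alias: str) -> str:
        """Move the alias back to the previously promoted deployment."""
        history = self._aliases[alias]
        if len(history) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        history.pop()
        return history[-1]

    def resolve(self, alias: str, path: str) -> str:
        """Serve a path from whichever deployment the alias currently targets."""
        return self._deployments[self._aliases[alias][-1]][path]
```

For example, after registering `dpl_a` and `dpl_b` and promoting both to a `prod` alias in turn, a call to `rollback("prod")` makes `resolve("prod", "/")` serve the `dpl_a` content again without touching either artifact.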

Vercel Scaling Path

Static assets and ISR pages scale through the CDN. Dynamic traffic scales through request-driven compute and regional capacity. When load falls, idle compute drains while cached outputs remain cheap to serve until eviction or revalidation.
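The stale-while-revalidate behavior behind that scaling path can be sketched as follows: a stale page is still returned immediately, and a refresh is queued to run outside the request. The class name and the explicit `run_background_refresh` step are assumptions for illustration; a real platform would run the refresh asynchronously.

```python
class SwrCache:
    """Serve cached pages immediately and refresh stale ones in the background."""

    def __init__(self, render, revalidate_seconds: float):
        self._render = render
        self._revalidate = revalidate_seconds
        self._entries = {}  # path -> (rendered_at, html)
        self.pending = []   # paths queued for a background re-render

    def get(self, path: str, now: float) -> str:
        entry = self._entries.get(path)
        if entry is None:
            # First request for this path: render once and cache the result.
            html = self._render(path)
            self._entries[path] = (now, html)
            return html
        rendered_at, html = entry
        if now - rendered_at >= self._revalidate and path not in self.pending:
            self.pending.append(path)  # schedule refresh, do not block the request
        return html  # possibly stale, but served without waiting

    def run_background_refresh(self, now: float) -> None:
        """Re-render every queued path; stands in for an async worker."""
        while self.pending:
            path = self.pending.pop(0)
            self._entries[path] = (now, self._render(path))
```

Only the first request for a path waits on rendering; every later request is answered from cache, which is why cached outputs stay cheap when load falls.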

Core Idea

Use Vercel as the global experience layer: store, web app, marketing, community reads, preview builds, dashboards, edge routing, cached pages, and lightweight APIs.

External Core

Keep authoritative systems outside Vercel: identity ledger, wallet/payment, entitlements, inventory, matchmaking, game relays, content depot storage, and event streaming.

1B User Shape

Most users hit Vercel edge cache and CDN. Only authenticated writes, purchases, license checks, matchmaking, and downloads cross into regional platform services.

Scale Rule

Scale web reads through Vercel, scale game/content systems through dedicated regional backends, and connect them with async events plus strict ownership APIs.
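The scale rule amounts to a routing decision per request. The sketch below classifies requests into the planes named above; the path prefixes and enum names are invented for the example and would differ in a real system.

```python
from enum import Enum


class Plane(Enum):
    VERCEL_EDGE = "vercel_edge"      # cached web reads
    REGIONAL_CORE = "regional_core"  # authoritative writes and license checks
    GAME_CDN = "game_cdn"            # large depot downloads
    REALTIME = "realtime"            # matchmaking, lobbies, relays


# Hypothetical path prefixes, chosen only for illustration.
DOWNLOAD_PREFIXES = ("/depot/",)
REALTIME_PREFIXES = ("/lobby/", "/relay/", "/matchmaking/")
AUTHORITATIVE_PREFIXES = ("/checkout/", "/license/", "/wallet/")


def classify(method: str, path: str) -> Plane:
    """Decide which plane should serve a request."""
    if path.startswith(DOWNLOAD_PREFIXES):
        return Plane.GAME_CDN
    if path.startswith(REALTIME_PREFIXES):
        return Plane.REALTIME
    is_write = method.upper() not in ("GET", "HEAD")
    if is_write or path.startswith(AUTHORITATIVE_PREFIXES):
        return Plane.REGIONAL_CORE
    return Plane.VERCEL_EDGE
```

Everything that falls through to `VERCEL_EDGE` is a cacheable read, which is the traffic shape the rule expects to dominate.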

Combined Target Architecture

Steam-Like Platform Built on Vercel

Users/devs | Vercel layer | Core services | Dedicated data | Critical routes
H1

Player and Developer Entry

Browsers, launchers, games, and creators enter through separate paths.

Web and Mobile Users | Store browsing, profiles, reviews, library web views | Vercel first
Game Launcher | Downloads, ownership checks, friends, cloud sync | Hybrid client
Game Client SDK | Auth ticket, lobby, inventory, relay access | Backend direct
Creators and Publishers | Game builds, branches, store assets, pricing | Vercel dashboard + ingest
H2

Vercel Experience Layer

Vercel owns global web delivery, previews, and cache-heavy surfaces.

Vercel Edge Router | Domains, TLS, routing, middleware, firewall, bot controls | Global web front door
Vercel CDN and ISR | Storefront, catalog pages, media, reviews, docs | Cache-first at 1B scale
Vercel Functions | BFF APIs, personalization, edge decisions, webhooks | Thin orchestration
Preview Deployments | Store experiments, admin tools, publisher workflows | Safe UI releases
H3

API Gateway and Trust Boundary

All authoritative operations pass through strict regional gateways.

Regional API Gateway | AuthN/Z, quotas, idempotency, request signing | Write-path front door
Identity and Session Service | Accounts, devices, 2FA, OAuth, launcher/game tickets | Shard by account
Policy and Risk Engine | Fraud, regional rules, rate limits, parental controls | Decision service
Web BFF Contract | Stable APIs consumed by Vercel Functions and clients | Keep Vercel thin
H4

Commerce and Ownership Core

Money and licenses need strong consistency, audit, and rollback safety.

Catalog and Pricing | Packages, SKUs, discounts, regions, taxes, metadata | Cache reads aggressively
Checkout and Wallet | Cart, payments, refunds, chargebacks, payment providers | Exactly-once semantics
Entitlement Ledger | Licenses, DLC, subscriptions, family sharing, revocation | Authoritative ownership
Account Shards | Users, licenses, balances, inventory pointers | Multi-region replicas
H5

Content and Download Plane

Vercel serves store media; dedicated depot CDN serves game bytes.

Publisher Build Ingest | Uploads, scanning, validation, branch promotion | Async pipeline
Depot and Manifest Service | Chunks, manifests, patches, deltas, release channels | Immutable content graph
Dedicated Game CDN | Large downloads, patches, launch prewarm, regional capacity | Not Vercel CDN
Content Object Storage | Depot chunks, screenshots, trailers, workshop files | Durable origin
H6

Multiplayer and Social

Real-time game systems bypass Vercel and scale on regional capacity.

Friends, Presence, Chat | Fanout, rich presence, invites, notification streams | Stateful regional pools
Lobby and Matchmaking | Lobby state, skill/rules matching, server assignment | Region-aware pools
Game Server Directory | Heartbeats, capacity, filters, server browser | TTL-backed registry
Game Relay Network | DDoS shield, packet relay, route quality, NAT traversal | Dedicated real-time edge
H7

Data, Events and Analytics

Queues isolate spikes and let slow work happen outside requests.

Event Bus and Queues | Purchases, entitlement changes, telemetry, moderation jobs | Shock absorber
Read Models and Search | Catalog search, recommendations, profiles, public inventory | Denormalized for reads
Telemetry Lake | Traffic, revenue, downloads, abuse, relay metrics | Operational analytics
Moderation and Safety | Reports, reputation, UGC review, fraud models | Async enforcement
H8

Scaling and Operations

Vercel scales the web edge; platform controllers scale game systems.

Capacity Controller | CDN prewarm, worker autoscale, relay bandwidth, queue consumers | Scale up/down
Feature Flags and Rollouts | Vercel previews, canaries, kill switches, region gates | Blast-radius control
Observability Command Center | SLIs, tracing, error budgets, business metrics, cost | Control loop input
Disaster Recovery | Region failover, read-only mode, queue replay, rollback | Survive failures

What Vercel Should Own

Vercel is the global web surface: storefront pages, discovery, marketing, docs, community reads, preview deployments, admin UI, edge middleware, thin BFF functions, cached catalog/read pages, and safe UI releases through aliases and rollbacks.

What Must Stay Outside Vercel

Do not run the authoritative Steam-like core on Vercel. Payments, wallet, entitlements, account shards, inventory, matchmaking, game relays, depot downloads, event streams, and large object storage need dedicated regional backends and operational control.

1B User Scaling Shape

Anonymous browsing and public reads scale through Vercel CDN and ISR. Authenticated writes go to regional core services. Large downloads use a dedicated game CDN. Multiplayer uses regional lobbies and relay POPs. Queues absorb spikes and let capacity scale down safely.
