Scaling SaaS Infrastructure
Meta description: Scale SaaS infrastructure: from monolith to microservices, with database sharding, caching strategies, and auto-scaling on Kubernetes.
Keywords: SaaS Scaling, Horizontal Scaling, Database Sharding, Kubernetes, Load Balancing, Caching, CDN, Auto-Scaling, High Availability
Introduction
Scaling SaaS infrastructure enables growth without sacrificing performance. From horizontal scaling through database sharding to Kubernetes auto-scaling, the right architecture determines how far a product can scale. This guide covers proven patterns for growing SaaS products.
Scaling Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ SAAS SCALING ARCHITECTURE │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ CDN LAYER │ │
│ │ CloudFlare / AWS CloudFront / Vercel Edge │ │
│ │ ├── Static assets (JS, CSS, images) │ │
│ │ ├── Edge caching for API responses │ │
│ │ └── DDoS protection │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LOAD BALANCER │ │
│ │ AWS ALB / Nginx / Traefik │ │
│ │ ├── SSL termination │ │
│ │ ├── Health checks │ │
│ │ ├── Rate limiting │ │
│ │ └── Geographic routing │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────┼────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ App Pod │ │ App Pod │ │ App Pod │ │
│ │ (API) │ │ (API) │ │ (API) │ │
│ └───────────┘ └───────────┘ └───────────┘ │
│ │ │ │ │
│ └────────────────┼────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ CACHE LAYER │ │
│ │ Redis Cluster / Memcached │ │
│ │ ├── Session storage │ │
│ │ ├── Query caching │ │
│ │ ├── Rate limit counters │ │
│ │ └── Real-time features (pub/sub) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ DATABASE LAYER │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Primary │──│ Replica │ │ │
│ │ │ (Write) │ │ (Read) │ │ │
│ │ └──────────────┘ └──────────────┘ │ │
│ │ │ │ │
│ │ ▼ (at scale) │ │
│ │ ┌──────────────────────────────────────────┐ │ │
│ │ │ SHARDED DATABASE │ │ │
│ │ │ Shard 1 │ Shard 2 │ Shard 3 │ Shard N │ │ │
│ │ │ (A-G) │ (H-N) │ (O-T) │ (U-Z) │ │ │
│ │ └──────────────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ Background Jobs: Redis Queue / BullMQ / SQS │
│ Search: Elasticsearch / Meilisearch / Algolia │
│ Storage: S3 / R2 / GCS │
│ │
└─────────────────────────────────────────────────────────────┘
Scaling Stages
┌─────────────────────────────────────────────────────────────┐
│ SAAS SCALING JOURNEY │
├─────────────────────────────────────────────────────────────┤
│ │
│ STAGE 1: Single Server (0-1K users) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ App + DB on one server │ │
│ │ ✓ Simple deployment │ │
│ │ ✗ Single point of failure │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ STAGE 2: Separate DB (1K-10K users) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ App Server ──► Managed Database (RDS, PlanetScale) │ │
│ │ ✓ Independent scaling │ │
│ │ ✓ Automatic backups │ │
│ │ + Add Redis for sessions/caching │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ STAGE 3: Horizontal Scaling (10K-100K users) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Load Balancer ──► Multiple App Instances │ │
│ │ ──► Read Replicas │ │
│ │ ✓ No single point of failure │ │
│ │ ✓ Zero-downtime deployments │ │
│ │ + CDN for static assets │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ STAGE 4: Microservices (100K+ users) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ API Gateway ──► Service Mesh │ │
│ │ ──► Kubernetes │ │
│ │ ──► Database per service │ │
│ │ ✓ Independent deployment & scaling │ │
│ │ ✓ Technology flexibility │ │
│ │ + Database sharding │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Database Read Replicas
// lib/db/replica.ts
import { PrismaClient, Prisma } from '@prisma/client';

// Primary for writes
export const primaryDb = new PrismaClient({
  datasources: {
    db: { url: process.env.DATABASE_URL }
  }
});

// Replica for reads
export const replicaDb = new PrismaClient({
  datasources: {
    db: { url: process.env.DATABASE_REPLICA_URL }
  }
});

// Smart routing
interface DbOptions {
  write?: boolean;
}

export function getDb(options: DbOptions = {}): PrismaClient {
  return options.write ? primaryDb : replicaDb;
}

// Usage examples
export async function getUsers() {
  // Read from replica
  return getDb().user.findMany();
}

export async function createUser(data: Prisma.UserCreateInput) {
  // Write to primary
  return getDb({ write: true }).user.create({ data });
}

// Read-after-write consistency helper
export async function createUserWithRead(data: Prisma.UserCreateInput) {
  const user = await primaryDb.user.create({ data });
  // For an immediate read after a write, stay on the primary:
  // replica lag is typically <100ms, but it is not zero
  return user;
}
Prisma with Read Replicas Extension
// lib/db/prisma-replicas.ts
import { PrismaClient } from '@prisma/client';
import { readReplicas } from '@prisma/extension-read-replicas';

export const db = new PrismaClient().$extends(
  readReplicas({
    url: process.env.DATABASE_REPLICA_URL!,
    // Optional: multiple replicas for load distribution
    // url: [
    //   process.env.DATABASE_REPLICA_URL_1!,
    //   process.env.DATABASE_REPLICA_URL_2!
    // ]
  })
);

// All reads automatically go to a replica;
// all writes automatically go to the primary.

// Force the primary for specific reads (read-after-write)
export async function getUserAfterUpdate(userId: string) {
  return db.$primary().user.findUnique({
    where: { id: userId }
  });
}
Caching Strategy
// lib/cache/redis.ts
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

interface CacheOptions {
  ttl?: number; // seconds
  tags?: string[];
}

export async function cache<T>(
  key: string,
  fetcher: () => Promise<T>,
  options: CacheOptions = {}
): Promise<T> {
  const { ttl = 3600 } = options;

  // Try cache first
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  // Fetch fresh data
  const data = await fetcher();

  // Store in cache
  await redis.setex(key, ttl, JSON.stringify(data));

  // Track tags for invalidation
  if (options.tags) {
    for (const tag of options.tags) {
      await redis.sadd(`tag:${tag}`, key);
    }
  }

  return data;
}

export async function invalidateByTag(tag: string): Promise<void> {
  const keys = await redis.smembers(`tag:${tag}`);
  if (keys.length > 0) {
    await redis.del(...keys);
    await redis.del(`tag:${tag}`);
  }
}
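One refinement worth noting: if many keys are written with the same fixed TTL (for example right after a deploy warms the cache), they all expire in the same instant and the database absorbs a burst of refetches. A small jitter helper spreads expiries out; `jitteredTtl` is a hypothetical name, not part of the module above.

```typescript
// Hypothetical helper: randomize a TTL by ±spread so keys cached together
// do not all expire together (a mild form of cache-stampede protection).
function jitteredTtl(baseSeconds: number, spread = 0.1): number {
  const delta = baseSeconds * spread;
  // Uniform in [base - delta, base + delta], rounded to whole seconds
  return Math.round(baseSeconds - delta + Math.random() * 2 * delta);
}
```

Used as `cache(key, fetcher, { ttl: jitteredTtl(3600) })`, a one-hour TTL lands anywhere between 54 and 66 minutes.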
// Cache-aside pattern for database queries
export async function cachedQuery<T>(
  queryKey: string,
  query: () => Promise<T>,
  ttl: number = 300
): Promise<T> {
  return cache(queryKey, query, { ttl });
}

// Usage
export async function getProjectsByTenant(tenantId: string) {
  return cachedQuery(
    `tenant:${tenantId}:projects`,
    () => db.project.findMany({
      where: { tenantId },
      orderBy: { updatedAt: 'desc' }
    }),
    600 // 10 minutes
  );
}
Multi-Level Caching
// lib/cache/multi-level.ts
import { LRUCache } from 'lru-cache';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

// L1: In-memory cache (per instance)
const memoryCache = new LRUCache<string, any>({
  max: 1000,
  ttl: 60 * 1000 // 1 minute
});

// L2: Redis (shared across instances)
interface MultiLevelCacheOptions {
  memoryTtl?: number; // ms
  redisTtl?: number;  // seconds
}

export async function multiLevelCache<T>(
  key: string,
  fetcher: () => Promise<T>,
  options: MultiLevelCacheOptions = {}
): Promise<T> {
  const { memoryTtl = 60_000, redisTtl = 300 } = options;

  // L1: check memory
  const memoryHit = memoryCache.get(key);
  if (memoryHit !== undefined) {
    return memoryHit as T;
  }

  // L2: check Redis
  const redisHit = await redis.get(key);
  if (redisHit) {
    const data = JSON.parse(redisHit);
    memoryCache.set(key, data, { ttl: memoryTtl });
    return data;
  }

  // L3: fetch from source
  const data = await fetcher();

  // Populate both caches
  memoryCache.set(key, data, { ttl: memoryTtl });
  await redis.setex(key, redisTtl, JSON.stringify(data));

  return data;
}

// Cache invalidation across instances via pub/sub
// (a subscriber connection cannot issue regular commands, so use a second client)
const subscriber = new Redis(process.env.REDIS_URL!);
subscriber.subscribe('cache:invalidate');
subscriber.on('message', (channel, key) => {
  if (channel === 'cache:invalidate') {
    memoryCache.delete(key);
  }
});

export async function invalidate(key: string): Promise<void> {
  memoryCache.delete(key);
  await redis.del(key);
  // Notify other instances
  await redis.publish('cache:invalidate', key);
}
Database Connection Pooling
// lib/db/pool.ts
import { Pool, PoolConfig } from 'pg';

const poolConfig: PoolConfig = {
  connectionString: process.env.DATABASE_URL,
  // Pool sizing
  min: 2,
  max: 20, // max connections per instance
  // Timeouts
  connectionTimeoutMillis: 5000,
  idleTimeoutMillis: 30000,
  // Query timeout
  statement_timeout: 30000,
  // Application name for monitoring
  application_name: 'saas-app'
};

export const pool = new Pool(poolConfig);

// Health check
export async function checkDbHealth(): Promise<boolean> {
  try {
    const result = await pool.query('SELECT 1');
    return result.rows.length > 0;
  } catch {
    return false;
  }
}

// Connection pool monitoring
pool.on('connect', () => {
  console.log('New connection established');
});

pool.on('error', (err) => {
  console.error('Unexpected error on idle client', err);
});

// Graceful shutdown
process.on('SIGTERM', async () => {
  await pool.end();
});
Database Sharding
// lib/db/sharding.ts
import { PrismaClient } from '@prisma/client';
import crypto from 'crypto';

interface ShardConfig {
  id: number;
  url: string;
  range: [number, number]; // hash range
}

const SHARDS: ShardConfig[] = [
  { id: 0, url: process.env.SHARD_0_URL!, range: [0, 63] },
  { id: 1, url: process.env.SHARD_1_URL!, range: [64, 127] },
  { id: 2, url: process.env.SHARD_2_URL!, range: [128, 191] },
  { id: 3, url: process.env.SHARD_3_URL!, range: [192, 255] }
];

// Create a client for each shard
const shardClients = new Map<number, PrismaClient>();
for (const shard of SHARDS) {
  shardClients.set(
    shard.id,
    new PrismaClient({
      datasources: { db: { url: shard.url } }
    })
  );
}

// Hash-based shard selection (static ranges; note this is range partitioning,
// not a consistent-hash ring, so adding shards means re-mapping ranges)
export function getShardId(tenantId: string): number {
  const hash = crypto
    .createHash('md5')
    .update(tenantId)
    .digest()[0]; // first byte (0-255)
  const shard = SHARDS.find(
    s => hash >= s.range[0] && hash <= s.range[1]
  );
  return shard?.id ?? 0; // ?? so shard id 0 is not treated as falsy
}

export function getShardClient(tenantId: string): PrismaClient {
  const shardId = getShardId(tenantId);
  return shardClients.get(shardId)!;
}

// Usage
export async function getProjectsForTenant(tenantId: string) {
  const client = getShardClient(tenantId);
  return client.project.findMany({
    where: { tenantId }
  });
}
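Because MD5 is uniform over its output bytes, the four equal ranges above should each receive roughly a quarter of all tenants. A standalone sketch of that sanity check (`shardOf` and `shardDistribution` are illustrative helpers, not part of the sharding module):

```typescript
import crypto from 'node:crypto';

// Same mapping as getShardId above: first MD5 byte, four equal ranges
function shardOf(tenantId: string): number {
  const byte = crypto.createHash('md5').update(tenantId).digest()[0];
  return Math.floor(byte / 64); // 0-63 → shard 0, 64-127 → shard 1, ...
}

// Count how many of the given tenants land on each shard
function shardDistribution(tenantIds: string[]): number[] {
  const counts = [0, 0, 0, 0];
  for (const id of tenantIds) counts[shardOf(id)]++;
  return counts;
}
```

For a few thousand random tenant IDs the counts come out close to even; a badly skewed result here would mean one shard carries most of the load.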
// Cross-shard queries (avoid where possible: latency is bounded by the slowest shard)
export async function globalSearch(query: string) {
  const results = await Promise.all(
    Array.from(shardClients.values()).map(client =>
      client.project.findMany({
        where: {
          OR: [
            { name: { contains: query, mode: 'insensitive' } },
            { description: { contains: query, mode: 'insensitive' } }
          ]
        },
        take: 10
      })
    )
  );
  return results.flat().slice(0, 50);
}
Background Job Processing
// lib/jobs/queue.ts
import { Queue, Worker, Job } from 'bullmq';
import Redis from 'ioredis';

const connection = new Redis(process.env.REDIS_URL!, {
  maxRetriesPerRequest: null // required by BullMQ
});

// Define queues
export const emailQueue = new Queue('email', { connection });
export const reportQueue = new Queue('reports', { connection });
// Retry options belong on the queue (defaultJobOptions), not on the worker
export const webhookQueue = new Queue('webhooks', {
  connection,
  defaultJobOptions: {
    attempts: 5,
    backoff: {
      type: 'exponential',
      delay: 1000 // 1s, 2s, 4s, 8s, 16s
    }
  }
});

// Email worker
const emailWorker = new Worker(
  'email',
  async (job: Job) => {
    const { to, template, data } = job.data;
    await sendEmail(to, template, data);
  },
  {
    connection,
    concurrency: 10,
    limiter: {
      max: 100,
      duration: 1000 // at most 100 emails per second
    }
  }
);

// Report worker (CPU-intensive)
const reportWorker = new Worker(
  'reports',
  async (job: Job) => {
    const { tenantId, reportType, dateRange } = job.data;
    return generateReport(tenantId, reportType, dateRange);
  },
  {
    connection,
    concurrency: 2, // lower concurrency for heavy jobs
    lockDuration: 300000 // 5-minute lock
  }
);

// Webhook worker: throwing marks the job failed so BullMQ retries it
// with the backoff configured on the queue above
const webhookWorker = new Worker(
  'webhooks',
  async (job: Job) => {
    const { url, payload, signature } = job.data;
    const response = await fetch(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-Signature': signature
      },
      body: JSON.stringify(payload)
    });
    if (!response.ok) {
      throw new Error(`Webhook failed: ${response.status}`);
    }
  },
  {
    connection,
    concurrency: 20
  }
);
// Add jobs
export async function queueEmail(
  to: string,
  template: string,
  data: Record<string, any>
) {
  return emailQueue.add('send', { to, template, data });
}

export async function queueReport(
  tenantId: string,
  reportType: string,
  dateRange: { start: Date; end: Date }
) {
  return reportQueue.add(
    'generate',
    { tenantId, reportType, dateRange },
    { priority: 10 }
  );
}
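For reference, BullMQ's built-in exponential backoff computes each retry delay as `delay * 2^(attemptsMade - 1)`. A small sketch of the schedule that the webhook queue's `attempts: 5` / `delay: 1000` settings produce (`backoffDelays` is an illustrative helper, not a BullMQ API):

```typescript
// Delay before each retry attempt under exponential backoff:
// attempt 1 → base, attempt 2 → 2×base, attempt 3 → 4×base, ...
function backoffDelays(baseDelayMs: number, attempts: number): number[] {
  return Array.from({ length: attempts }, (_, i) => baseDelayMs * 2 ** i);
}
```

`backoffDelays(1000, 5)` yields `[1000, 2000, 4000, 8000, 16000]` — the 1s, 2s, 4s, 8s, 16s schedule noted in the worker comment.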
// Scheduled jobs (BullMQ repeatable jobs; `pattern` replaced the deprecated `cron` key)
export async function setupScheduledJobs() {
  // Daily cleanup
  await emailQueue.add(
    'cleanup',
    {},
    {
      repeat: { pattern: '0 2 * * *' } // 2 AM daily
    }
  );

  // Hourly metrics
  await reportQueue.add(
    'metrics',
    {},
    {
      repeat: { pattern: '0 * * * *' } // every hour
    }
  );
}
Kubernetes Auto-Scaling
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: saas-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: saas-api
  template:
    metadata:
      labels:
        app: saas-api
    spec:
      containers:
        - name: api
          image: saas-api:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secrets
                  key: url
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
---
# Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: saas-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: saas-api
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
---
# Pod Disruption Budget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: saas-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: saas-api
Health Check Endpoints
// app/api/health/route.ts
import { NextResponse } from 'next/server';

// Liveness probe: is the process running at all?
export async function GET() {
  return NextResponse.json({ status: 'ok' });
}

// app/api/ready/route.ts (separate file: one GET handler per route)
import { NextResponse } from 'next/server';
import { db } from '@/lib/db';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

// Readiness probe: can the app handle traffic?
export async function GET() {
  const checks = await Promise.allSettled([
    checkDatabase(),
    checkRedis(),
    checkExternalServices()
  ]);

  // The helpers resolve to false on failure, so check the resolved
  // value, not just that the promise settled
  const results = {
    database: checks[0].status === 'fulfilled' && checks[0].value,
    redis: checks[1].status === 'fulfilled' && checks[1].value,
    external: checks[2].status === 'fulfilled' && checks[2].value
  };

  const allHealthy = Object.values(results).every(Boolean);

  return NextResponse.json(
    { status: allHealthy ? 'ready' : 'degraded', checks: results },
    { status: allHealthy ? 200 : 503 }
  );
}

async function checkDatabase(): Promise<boolean> {
  try {
    await db.$queryRaw`SELECT 1`;
    return true;
  } catch {
    return false;
  }
}

async function checkRedis(): Promise<boolean> {
  try {
    await redis.ping();
    return true;
  } catch {
    return false;
  }
}

async function checkExternalServices(): Promise<boolean> {
  // Check critical external dependencies (illustrative URL; use your
  // provider's documented status endpoint)
  try {
    const response = await fetch('https://api.stripe.com/v1/health', {
      method: 'HEAD',
      signal: AbortSignal.timeout(5000)
    });
    return response.ok;
  } catch {
    return true; // don't fail readiness for optional services
  }
}
Rate Limiting at Scale
// lib/rate-limit/distributed.ts
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

interface RateLimitConfig {
  windowMs: number;
  maxRequests: number;
}

const TIERS: Record<string, RateLimitConfig> = {
  free: { windowMs: 60000, maxRequests: 60 },        // 60/min
  pro: { windowMs: 60000, maxRequests: 600 },        // 600/min
  enterprise: { windowMs: 60000, maxRequests: 6000 } // 6000/min
};

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number;
}

export async function checkRateLimit(
  identifier: string,
  tier: string = 'free'
): Promise<RateLimitResult> {
  const config = TIERS[tier] || TIERS.free;
  const key = `ratelimit:${identifier}`;
  const now = Date.now();
  const windowStart = now - config.windowMs;

  // Sliding window using a sorted set, batched in one round trip
  const pipeline = redis.pipeline();
  // Remove entries that fell out of the window
  pipeline.zremrangebyscore(key, 0, windowStart);
  // Count the current window (before this request)
  pipeline.zcard(key);
  // Record this request (random suffix keeps members unique)
  pipeline.zadd(key, now, `${now}:${Math.random()}`);
  // Expire the key along with the window
  pipeline.pexpire(key, config.windowMs);

  const results = await pipeline.exec();
  const currentCount = (results?.[1]?.[1] as number) || 0;

  const allowed = currentCount < config.maxRequests;
  const remaining = Math.max(0, config.maxRequests - currentCount - 1);
  const resetAt = now + config.windowMs;

  return { allowed, remaining, resetAt };
}
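The sorted-set logic is easier to see without Redis in the way. This per-process reference version implements the same sliding window; it is illustrative only, since in production the shared Redis version above is what keeps limits consistent across instances:

```typescript
// Per-process sliding-window limiter mirroring the Redis sorted-set logic:
// keep timestamps inside the window, reject once maxRequests are present.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private windowMs: number, private maxRequests: number) {}

  check(id: string, now: number = Date.now()): boolean {
    const windowStart = now - this.windowMs;
    // Drop timestamps that fell out of the window (like zremrangebyscore)
    const recent = (this.hits.get(id) ?? []).filter(t => t > windowStart);
    if (recent.length >= this.maxRequests) {
      this.hits.set(id, recent);
      return false; // over the limit
    }
    recent.push(now); // record this request (like zadd)
    this.hits.set(id, recent);
    return true;
  }
}
```

Three requests in a one-second window pass, the fourth is rejected, and once the window slides past the old timestamps the identifier is admitted again.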
// Middleware
export async function rateLimitMiddleware(
  request: Request,
  identifier: string,
  tier: string
): Promise<Response | null> {
  const result = await checkRateLimit(identifier, tier);

  if (!result.allowed) {
    return new Response(
      JSON.stringify({ error: 'Rate limit exceeded' }),
      {
        status: 429,
        headers: {
          'Content-Type': 'application/json',
          'X-RateLimit-Limit': TIERS[tier]?.maxRequests.toString() || '60',
          'X-RateLimit-Remaining': '0',
          'X-RateLimit-Reset': result.resetAt.toString(),
          'Retry-After': Math.ceil((result.resetAt - Date.now()) / 1000).toString()
        }
      }
    );
  }

  return null; // allow the request
}
CDN & Edge Caching
// next.config.js
/** @type {import('next').NextConfig} */
const nextConfig = {
  // Static assets
  images: {
    remotePatterns: [
      { protocol: 'https', hostname: 'cdn.example.com' }
    ],
    minimumCacheTTL: 60 * 60 * 24 * 30 // 30 days
  },
  // Cache headers
  async headers() {
    return [
      {
        source: '/api/:path*',
        headers: [
          {
            key: 'Cache-Control',
            value: 'no-store, must-revalidate'
          }
        ]
      },
      {
        source: '/_next/static/:path*',
        headers: [
          {
            key: 'Cache-Control',
            value: 'public, max-age=31536000, immutable'
          }
        ]
      },
      {
        source: '/images/:path*',
        headers: [
          {
            key: 'Cache-Control',
            value: 'public, max-age=86400, stale-while-revalidate=604800'
          }
        ]
      }
    ];
  }
};

module.exports = nextConfig;

// Edge-cached API response
// app/api/products/route.ts
import { NextResponse } from 'next/server';

// ISR-style `revalidate` is not available on the edge runtime;
// the Cache-Control header below handles freshness instead
export const runtime = 'edge';

export async function GET() {
  const products = await fetchProducts();
  return NextResponse.json(products, {
    headers: {
      // CDN caches for 60s, then serves stale while revalidating for 5 min
      'Cache-Control': 'public, s-maxage=60, stale-while-revalidate=300'
    }
  });
}
Monitoring & Observability
// lib/monitoring/metrics.ts
import { Counter, Gauge, Histogram, Registry } from 'prom-client';

export const registry = new Registry();

// Request metrics
export const httpRequestsTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'path', 'status'],
  registers: [registry]
});

export const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration',
  labelNames: ['method', 'path'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5],
  registers: [registry]
});

// Database metrics
export const dbQueryDuration = new Histogram({
  name: 'db_query_duration_seconds',
  help: 'Database query duration',
  labelNames: ['operation', 'table'],
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1],
  registers: [registry]
});

// Business metrics: a Gauge, since active users go up and down
// (a Counter must only ever increase)
export const activeUsers = new Gauge({
  name: 'active_users',
  help: 'Currently active users',
  labelNames: ['tier'],
  registers: [registry]
});

// Prometheus endpoint
// app/api/metrics/route.ts
import { registry } from '@/lib/monitoring/metrics';

export async function GET() {
  const metrics = await registry.metrics();
  return new Response(metrics, {
    headers: {
      'Content-Type': registry.contentType
    }
  });
}
Graceful Degradation
// lib/resilience/circuit-breaker.ts
interface CircuitBreakerOptions {
  failureThreshold: number;
  resetTimeout: number;
}

type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

export class CircuitBreaker {
  private state: CircuitState = 'CLOSED';
  private failureCount = 0;
  private lastFailureTime?: number;

  constructor(private options: CircuitBreakerOptions) {}

  async execute<T>(fn: () => Promise<T>, fallback?: () => T): Promise<T> {
    if (this.state === 'OPEN') {
      if (this.shouldAttemptReset()) {
        this.state = 'HALF_OPEN';
      } else if (fallback) {
        return fallback();
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      if (fallback) {
        return fallback();
      }
      throw error;
    }
  }

  private onSuccess(): void {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private onFailure(): void {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    if (this.failureCount >= this.options.failureThreshold) {
      this.state = 'OPEN';
    }
  }

  private shouldAttemptReset(): boolean {
    return (
      this.lastFailureTime !== undefined &&
      Date.now() - this.lastFailureTime >= this.options.resetTimeout
    );
  }
}
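To see the state machine in action without wiring up a real dependency, here is a condensed, self-contained copy of the breaker driven by an always-failing upstream: after `failureThreshold` consecutive failures it opens and stops calling the upstream entirely.

```typescript
// Condensed version of the CircuitBreaker above, with the same transitions.
type State = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class Breaker {
  state: State = 'CLOSED';
  private failures = 0;
  private lastFailure = 0;

  constructor(private threshold: number, private resetMs: number) {}

  async execute<T>(fn: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailure >= this.resetMs) {
        this.state = 'HALF_OPEN'; // probe the upstream again
      } else {
        return fallback(); // fail fast: upstream is not called at all
      }
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'CLOSED';
      return result;
    } catch {
      this.failures++;
      this.lastFailure = Date.now();
      if (this.failures >= this.threshold) this.state = 'OPEN';
      return fallback();
    }
  }
}

// Drive it: an always-failing upstream behind a threshold-3 breaker.
async function demo(): Promise<{ upstreamCalls: number; state: State }> {
  const breaker = new Breaker(3, 60_000);
  let upstreamCalls = 0;
  const failing = async () => {
    upstreamCalls++;
    throw new Error('upstream down');
  };
  for (let i = 0; i < 10; i++) {
    await breaker.execute(failing, () => 'fallback');
  }
  return { upstreamCalls, state: breaker.state };
}
```

Of ten calls, only the first three reach the upstream; the remaining seven are served by the fallback while the breaker is OPEN.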
// Usage: search falls back to a direct DB query while Elasticsearch is failing
const searchCircuit = new CircuitBreaker({
  failureThreshold: 5,
  resetTimeout: 30000 // 30 seconds
});

export async function searchProducts(query: string) {
  return searchCircuit.execute(
    () => elasticSearch.search(query),
    () => db.product.findMany({
      where: { name: { contains: query } },
      take: 20
    })
  );
}
Best Practices
| Aspect | Recommendation |
|---|---|
| **Caching** | Multi-level: memory → Redis → DB |
| **Database** | Read replicas first, then sharding |
| **Statelessness** | Sessions in Redis, never in process memory |
| **Jobs** | Go async for anything taking >100ms |
| **Monitoring** | Metrics, logs, and traces |
| **Resilience** | Circuit breakers and fallbacks |
Conclusion
Scaling SaaS infrastructure requires:
- Horizontal first: stateless apps behind a load balancer
- Caching: multiple levels with explicit invalidation strategies
- Database: replicas and connection pooling first, sharding later
- Resilience: circuit breakers and fallbacks
- Observability: metrics, logging, alerting
Scaling is a continuous process: always address the next bottleneck.