RippleCore Infrastructure: Performance Optimization Guide

Comprehensive performance tuning for production workloads to achieve <200ms API response times

Target: Achieve <200ms API response times, 99.5% uptime, and efficient resource utilization
Audience: DevOps engineers, performance engineers, technical leads
Prerequisites: Infrastructure deployed per ARCHITECTURE.md

Performance Baseline

Establish Current Metrics

Before optimizing, establish baseline metrics:

Response Time Benchmarks:

# Health endpoint
ab -n 1000 -c 10 https://app.your-domain.com/api/health
# Record: Requests/sec, Mean response time, 95th percentile

# API endpoint (authenticated)
ab -n 500 -c 10 -H "Authorization: Bearer TOKEN" https://api.your-domain.com/api/kindness
# Record: Requests/sec, Mean response time, 95th percentile

# Database query performance
docker exec ripplecore-postgres psql -U ripplecore -d ripplecore -c "SELECT COUNT(*) FROM users;"
# Record: Execution time
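
For a repeatable baseline, a small script can record p95 directly. Below is a minimal sketch in TypeScript (Node 18+ for global fetch; the URL and request count are placeholders, not project values) that issues sequential requests and reports the 95th-percentile latency:

// baseline-p95.ts — rough p95 latency probe (sequential, so it measures
// latency, not throughput; use ab or k6 for load testing)
const TARGET = "https://app.your-domain.com/api/health"; // placeholder URL
const RUNS = 200;

async function main() {
  const samples: number[] = [];
  for (let i = 0; i < RUNS; i++) {
    const start = performance.now();
    await fetch(TARGET);
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  const p95 = samples[Math.ceil(samples.length * 0.95) - 1];
  console.log(`p95: ${p95.toFixed(1)}ms over ${samples.length} requests`);
}

main().catch(console.error);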

Resource Utilization:

# CPU and RAM usage (check Netdata)
curl "http://localhost:19999/api/v1/data?chart=system.cpu"
curl "http://localhost:19999/api/v1/data?chart=system.ram"

# Database connections
docker exec ripplecore-postgres psql -U ripplecore -c "SELECT count(*) FROM pg_stat_activity;"

# Redis memory usage
docker exec ripplecore-redis redis-cli INFO memory | grep used_memory_human

Performance Targets (from PRD):

  • API response time: <200ms (p95)
  • Health checks: <100ms
  • Page load time: <1s (First Contentful Paint)
  • Database queries: <50ms (simple), <200ms (complex)
  • Uptime: 99.5% (3.6 hours/month downtime acceptable)

Database Optimization

PostgreSQL Configuration Tuning

Recommended Settings (Hetzner CPX21, 4 GB RAM):

File: Create /etc/postgresql/postgresql.conf, or pass the settings to the container as -c command-line flags

# ============================================================================
# MEMORY CONFIGURATION
# ============================================================================

# Shared buffers: 25% of total RAM (1GB for 4GB server)
shared_buffers = 1GB

# Effective cache size: 75% of total RAM (3GB for 4GB server)
# Helps query planner estimate cache availability
effective_cache_size = 3GB

# Work memory: RAM for sort/hash operations (per connection)
# Formula: (Total RAM - shared_buffers) / max_connections
# (4GB - 1GB) / 100 = 30MB
work_mem = 30MB

# Maintenance work memory: For VACUUM, CREATE INDEX
# 10% of RAM
maintenance_work_mem = 400MB

# ============================================================================
# QUERY PLANNER
# ============================================================================

# Random page cost (SSD = 1.0, HDD = 4.0)
random_page_cost = 1.0

# Effective IO concurrency (for SSDs)
effective_io_concurrency = 200

# ============================================================================
# WRITE-AHEAD LOG (WAL)
# ============================================================================

# WAL buffers: 3% of shared_buffers
wal_buffers = 32MB

# Checkpoint completion target (smooth checkpoints)
checkpoint_completion_target = 0.9

# Max WAL size before checkpoint (larger = fewer checkpoints)
max_wal_size = 2GB
min_wal_size = 1GB

# ============================================================================
# CONNECTIONS
# ============================================================================

# Max connections (reserve some for admin)
max_connections = 100

# ============================================================================
# AUTOVACUUM TUNING
# ============================================================================

# Autovacuum to prevent table bloat
autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 10s  # Check for vacuum every 10s

# Vacuum threshold: vacuum when 5% of rows modified
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.05

# ============================================================================
# LOGGING (for slow query detection)
# ============================================================================

# Log queries slower than 500ms
log_min_duration_statement = 500

# Log all DDL (CREATE, ALTER, DROP)
log_statement = 'ddl'

# Log checkpoints (for tuning)
log_checkpoints = on

Apply Configuration:

# Option 1: Command-line flags (recommended). The official postgres image
# passes any extra arguments straight to the postgres server binary.
docker run -d \
  --name ripplecore-postgres \
  -v postgres-data:/var/lib/postgresql/data \
  postgres:18-alpine \
  -c shared_buffers=1GB \
  -c effective_cache_size=3GB \
  -c work_mem=30MB \
  -c maintenance_work_mem=400MB

# Option 2: Mount custom postgresql.conf
docker run -d \
  --name ripplecore-postgres \
  -v /path/to/postgresql.conf:/etc/postgresql/postgresql.conf \
  -v postgres-data:/var/lib/postgresql/data \
  postgres:18-alpine \
  -c config_file=/etc/postgresql/postgresql.conf

# If you edited a mounted postgresql.conf, restart PostgreSQL to apply it
docker restart ripplecore-postgres

Index Optimization

Identify Missing Indexes:

-- Find slow queries (requires pg_stat_statements extension)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 slowest queries
SELECT
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Find sequential scans on large tables (candidates for indexing)
SELECT
  schemaname,
  relname,
  seq_scan,
  seq_tup_read,
  idx_scan,
  seq_tup_read / seq_scan AS avg_seq_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;

Add Strategic Indexes:

-- Multi-tenant queries (CRITICAL - already in schema)
CREATE INDEX CONCURRENTLY idx_kindness_company_created
  ON kindness (company_id, created_at DESC);

CREATE INDEX CONCURRENTLY idx_volunteer_company_created
  ON volunteer (company_id, created_at DESC);

-- User lookups
CREATE INDEX CONCURRENTLY idx_users_email
  ON users (email);

-- Organization membership queries
CREATE INDEX CONCURRENTLY idx_members_org_user
  ON members (organization_id, user_id);

-- Composite index for filtered queries
CREATE INDEX CONCURRENTLY idx_kindness_company_user_created
  ON kindness (company_id, user_id, created_at DESC);

Index Maintenance:

-- Find unused indexes (candidates for removal)
SELECT
  schemaname,
  relname,
  indexrelname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Rebuild bloated indexes
REINDEX INDEX CONCURRENTLY idx_kindness_company_created;

-- Or rebuild all indexes for a table
REINDEX TABLE CONCURRENTLY kindness;

Connection Pooling

Implement PgBouncer (Recommended for 100+ concurrent connections):

# docker-compose.yml (on DB server)
version: "3.8"

services:
  pgbouncer:
    image: pgbouncer/pgbouncer:latest
    container_name: ripplecore-pgbouncer
    ports:
      - "6432:6432"
    environment:
      - DATABASES_HOST=ripplecore-postgres
      - DATABASES_PORT=5432
      - DATABASES_USER=ripplecore
      - DATABASES_PASSWORD=<secret>
      - DATABASES_DBNAME=ripplecore
      - PGBOUNCER_POOL_MODE=transaction
      - PGBOUNCER_MAX_CLIENT_CONN=1000
      - PGBOUNCER_DEFAULT_POOL_SIZE=25
      - PGBOUNCER_MIN_POOL_SIZE=10
      - PGBOUNCER_RESERVE_POOL_SIZE=5
    depends_on:
      - ripplecore-postgres

Update Application Connection String:

# Before (direct PostgreSQL)
DATABASE_URL=postgresql://ripplecore:<secret>@10.0.1.3:5432/ripplecore

# After (via PgBouncer)
DATABASE_URL=postgresql://ripplecore:<secret>@10.0.1.3:6432/ripplecore

Benefits:

  • Reduces connection overhead (reuses connections)
  • Supports 1000+ concurrent clients with 25 actual DB connections
  • Automatic connection recycling
  • ~30% reduction in database load
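
One caveat with transaction pooling mode: named prepared statements are not supported, so the application driver must disable them. A minimal sketch assuming the app connects through postgres.js with Drizzle (if you use a different driver, look for its equivalent option):

import postgres from "postgres";
import { drizzle } from "drizzle-orm/postgres-js";

// prepare: false disables named prepared statements, which break under
// PgBouncer's transaction pooling mode.
const client = postgres(process.env.DATABASE_URL ?? "", {
  prepare: false,
  max: 10, // keep the client-side pool small; PgBouncer does the multiplexing
});

export const db = drizzle(client);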

Query Optimization

Use EXPLAIN ANALYZE:

-- Before optimization
EXPLAIN ANALYZE
SELECT * FROM kindness
WHERE company_id = 'company-123'
ORDER BY created_at DESC
LIMIT 10;

-- Look for:
-- - Seq Scan (bad) vs Index Scan (good)
-- - High cost numbers
-- - Large "rows" estimates

Optimize N+1 Queries (Application Layer):

// ❌ BAD: N+1 query problem
// (note: naming the result `kindness` would shadow the table and throw)
const kindnessRecords = await db.query.kindness.findMany({
  where: eq(kindness.companyId, companyId),
});

for (const k of kindnessRecords) {
  // Separate query for each kindness record
  const user = await db.query.users.findFirst({
    where: eq(users.id, k.userId),
  });
}

// ✅ GOOD: Single query with a join
const kindnessWithUsers = await db.query.kindness.findMany({
  where: eq(kindness.companyId, companyId),
  with: {
    user: true, // Drizzle ORM joins via the relations defined in the schema
  },
});

Use Database-Level Aggregations:

// ❌ BAD: Fetch all records, aggregate in application
const allKindness = await db.query.kindness.findMany({
  where: eq(kindness.companyId, companyId),
});
const count = allKindness.length;

// ✅ GOOD: Count in database (the result is a one-row array)
const [{ count }] = await db
  .select({ count: sql<number>`count(*)` })
  .from(kindness)
  .where(eq(kindness.companyId, companyId));

Redis Cache Optimization

Redis Configuration Tuning

File: /etc/redis/redis.conf or Docker command args

# ============================================================================
# MEMORY MANAGEMENT
# ============================================================================

# Max memory: 768MB (leave 256MB for system on 1GB allocated)
maxmemory 768mb

# Eviction policy: Remove least recently used keys
maxmemory-policy allkeys-lru

# Samples for LRU (5 = good balance of performance/accuracy)
maxmemory-samples 5

# ============================================================================
# PERSISTENCE
# ============================================================================

# RDB snapshots (point-in-time backups)
save 900 1      # Save if 1 key changed in 15 min
save 300 10     # Save if 10 keys changed in 5 min
save 60 10000   # Save if 10K keys changed in 1 min

# AOF (append-only file) for durability
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec  # Sync every second (good performance/durability balance)

# AOF rewrite (compact log)
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# ============================================================================
# PERFORMANCE
# ============================================================================

# Disable slow commands in production (optional)
rename-command FLUSHDB ""
rename-command FLUSHALL ""
rename-command CONFIG ""

# TCP backlog (number of pending connections)
tcp-backlog 511

# Close idle connections after 5 minutes
timeout 300

# Disable slow replication features if not using replicas
repl-diskless-sync no

Apply Configuration:

docker run -d \
  --name ripplecore-redis \
  --restart unless-stopped \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine redis-server \
  --maxmemory 768mb \
  --maxmemory-policy allkeys-lru \
  --appendonly yes \
  --appendfsync everysec

# Restart Redis to apply
docker restart ripplecore-redis

Cache Strategy Optimization

Implement Tiered TTLs (already in packages/redis/index.ts):

// Short-lived: Frequently changing data
CacheTTL.SHORT = 60; // 1 minute
// Use for: User online status, real-time counters

// Medium: Standard application data
CacheTTL.MEDIUM = 300; // 5 minutes
// Use for: Evidence lists, analytics data

// Long: Rarely changing data
CacheTTL.LONG = 3600; // 1 hour
// Use for: Organization settings, license info

// Session: User sessions
CacheTTL.SESSION = 28800; // 8 hours (PRD requirement)
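
To apply these tiers consistently, a small cache-aside helper keeps the read-through pattern in one place. A minimal sketch assuming ioredis (the helper name and the usage key are illustrative, not from the codebase):

import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// Cache-aside: return the cached JSON value if present; otherwise run the
// loader, store the result with the given TTL tier, and return it.
export async function getOrSet<T>(
  key: string,
  ttlSeconds: number,
  loader: () => Promise<T>,
): Promise<T> {
  const cached = await redis.get(key);
  if (cached !== null) return JSON.parse(cached) as T;

  const value = await loader();
  await redis.setex(key, ttlSeconds, JSON.stringify(value));
  return value;
}

// Usage: await getOrSet(`company:${id}:stats`, CacheTTL.MEDIUM, () => computeStats(id));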

Cache Warming (Preload critical data):

// Warm cache on application startup
async function warmCache() {
  const companies = await getAllCompanies();

  for (const company of companies) {
    // Preload frequently accessed data
    await getKindnessStats(company.id); // Caches result
    await getLicenseInfo(company.id); // Caches result
  }

  console.log(`Cache warmed for ${companies.length} companies`);
}

// Call on startup (don't let a cold cache crash boot; log failures instead)
warmCache().catch((err) => console.error("Cache warming failed:", err));

Cache Invalidation Strategy:

// Invalidate specific keys on mutation
export async function createKindness(data) {
  const result = await db.insert(kindness).values(data).returning();

  // Invalidate company-specific caches
  await redis.del(CacheKeys.company.kindness(data.companyId));
  await redis.del(CacheKeys.company.stats(data.companyId));

  return result;
}

// Pattern-based invalidation for complex scenarios
export async function invalidateCompanyCache(companyId: string) {
  // Delete all keys matching pattern
  await deleteKeysByPattern(`company:${companyId}:*`);
}
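
The deleteKeysByPattern helper referenced above isn't defined in this guide; one safe way to implement it is with SCAN rather than KEYS, so the iteration never blocks Redis. A sketch assuming ioredis:

import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// Iterate with SCAN (cursor-based, non-blocking) instead of KEYS, which
// walks the whole keyspace in a single blocking call.
export async function deleteKeysByPattern(pattern: string): Promise<number> {
  let cursor = "0";
  let deleted = 0;
  do {
    const [next, keys] = await redis.scan(
      cursor,
      "MATCH",
      pattern,
      "COUNT",
      100,
    );
    if (keys.length > 0) {
      deleted += await redis.del(...keys);
    }
    cursor = next;
  } while (cursor !== "0");
  return deleted;
}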

Redis Monitoring

Monitor Cache Performance:

# Cache hit rate (target: >80%)
docker exec ripplecore-redis redis-cli INFO stats | grep keyspace_hits
docker exec ripplecore-redis redis-cli INFO stats | grep keyspace_misses

# Calculate hit rate
# Hit Rate = hits / (hits + misses) * 100

# Memory usage
docker exec ripplecore-redis redis-cli INFO memory | grep used_memory_human
docker exec ripplecore-redis redis-cli INFO memory | grep maxmemory_human

# Evicted keys (should be low if maxmemory set correctly)
docker exec ripplecore-redis redis-cli INFO stats | grep evicted_keys

# Slow commands (>10ms)
docker exec ripplecore-redis redis-cli SLOWLOG GET 10
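
The hit-rate arithmetic above is easy to script. A small sketch assuming ioredis that parses INFO stats and prints the rate:

import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// Pull keyspace_hits / keyspace_misses out of INFO stats and compute
// hit rate = hits / (hits + misses) * 100.
async function cacheHitRate(): Promise<number> {
  const info = await redis.info("stats");
  const read = (field: string): number =>
    Number(info.match(new RegExp(`${field}:(\\d+)`))?.[1] ?? 0);
  const hits = read("keyspace_hits");
  const misses = read("keyspace_misses");
  return hits + misses === 0 ? 0 : (hits / (hits + misses)) * 100;
}

cacheHitRate()
  .then((rate) => console.log(`Cache hit rate: ${rate.toFixed(1)}%`))
  .finally(() => redis.disconnect());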

Optimize Slow Operations:

// ❌ BAD: Large data structures in Redis
await redis.set("company:123:all-kindness", JSON.stringify(allKindness)); // 10MB+

// ✅ GOOD: Store only IDs, fetch from DB when needed
await redis.set("company:123:kindness-ids", JSON.stringify(kindnessIds)); // <100KB

// ✅ BETTER: Use Redis data structures
await redis.sadd("company:123:kindness-ids", ...kindnessIds); // Set operations

Application Layer Optimization

Next.js Performance Tuning

Enable Production Optimizations:

File: next.config.ts

import type { NextConfig } from "next";

const config: NextConfig = {
  // Strict mode for better error catching
  reactStrictMode: true,

  // Compression (Gzip/Brotli)
  compress: true,

  // Production source maps (smaller, no original code)
  productionBrowserSourceMaps: false,

  // SWC minification (default since Next.js 13; Next 15 removed this
  // flag, so drop it there)
  swcMinify: true,

  // Image optimization
  images: {
    formats: ["image/avif", "image/webp"],
    deviceSizes: [640, 750, 828, 1080, 1200, 1920, 2048, 3840],
    imageSizes: [16, 32, 48, 64, 96, 128, 256, 384],
    minimumCacheTTL: 60, // Cache images for 60 seconds
  },

  // Experimental features for performance
  experimental: {
    optimizeCss: true, // Optimize CSS bundle (requires the critters package)
    optimizePackageImports: [
      "@repo/design-system", // Tree-shake design system
    ],
  },

  // Bundle analyzer (dev only)
  ...(process.env.ANALYZE === "true" && {
    webpack: (config) => {
      const { BundleAnalyzerPlugin } = require("webpack-bundle-analyzer");
      config.plugins.push(
        new BundleAnalyzerPlugin({
          analyzerMode: "static",
          openAnalyzer: false,
        }),
      );
      return config;
    },
  }),
};

export default config;

Analyze Bundle Size:

# Install analyzer
pnpm add -D webpack-bundle-analyzer

# Run analysis
ANALYZE=true pnpm build

# Open report
open apps/app/.next/analyze/client.html

Reduce Bundle Size:

// ❌ BAD: Import entire library
import _ from "lodash";
const result = _.chunk(array, 2);

// ✅ GOOD: Import only needed function
import chunk from "lodash/chunk";
const result = chunk(array, 2);

// ❌ BAD: Import all icons
import * as Icons from "lucide-react";

// ✅ GOOD: Import specific icons
import { User, Settings } from "lucide-react";

API Route Optimization

Implement Response Caching:

// apps/api/app/api/kindness/route.ts
import { NextResponse } from "next/server";
import { redis, CacheKeys, CacheTTL } from "@repo/redis";

export async function GET(request: Request) {
  const { searchParams } = new URL(request.url);
  const companyId = searchParams.get("companyId");

  if (!companyId) {
    return NextResponse.json({ error: "companyId is required" }, { status: 400 });
  }

  // Check cache first
  const cacheKey = CacheKeys.company.kindness(companyId);
  const cached = await redis.get(cacheKey);

  if (cached) {
    return NextResponse.json(
      { data: JSON.parse(cached), cached: true },
      {
        headers: {
          "Cache-Control": "private, max-age=300", // Browser cache 5 min
          "X-Cache": "HIT",
        },
      },
    );
  }

  // Fetch from database
  const data = await listKindness({ companyId });

  // Cache for 5 minutes
  await redis.setex(cacheKey, CacheTTL.MEDIUM, JSON.stringify(data));

  return NextResponse.json(
    { data, cached: false },
    {
      headers: {
        "Cache-Control": "private, max-age=300",
        "X-Cache": "MISS",
      },
    },
  );
}

Enable Gzip/Brotli Compression (via Traefik):

# Traefik configuration (in Dokploy or standalone)
http:
  middlewares:
    compress:
      compress: {}

  routers:
    app-router:
      rule: "Host(`app.your-domain.com`)"
      service: app-service
      middlewares:
        - compress # Gzip/Brotli compression

React Component Optimization

Use React.memo for Expensive Components:

// ❌ BAD: Re-renders on every parent update
function KindnessList({ items }) {
  return items.map(item => <KindnessCard key={item.id} item={item} />);
}

// ✅ GOOD: Only re-renders when props change
const KindnessCard = React.memo(function KindnessCard({ item }) {
  return <div>{item.message}</div>;
});

Virtualize Long Lists:

// For lists with 100+ items
import { useVirtualizer } from "@tanstack/react-virtual";

function KindnessList({ items }) {
  const parentRef = React.useRef<HTMLDivElement>(null);

  const virtualizer = useVirtualizer({
    count: items.length,
    getScrollElement: () => parentRef.current,
    estimateSize: () => 80,  // Estimated row height
  });

  return (
    <div ref={parentRef} style={{ height: "400px", overflow: "auto" }}>
      <div style={{ height: `${virtualizer.getTotalSize()}px`, position: "relative" }}>
        {virtualizer.getVirtualItems().map((virtualRow) => (
          <div
            key={virtualRow.index}
            style={{
              position: "absolute",
              top: 0,
              left: 0,
              width: "100%",
              height: `${virtualRow.size}px`,
              transform: `translateY(${virtualRow.start}px)`,
            }}
          >
            <KindnessCard item={items[virtualRow.index]} />
          </div>
        ))}
      </div>
    </div>
  );
}

Network & CDN Optimization

Cloudflare Integration (Optional)

Benefits:

  • Global CDN (faster page loads worldwide)
  • DDoS protection
  • Automatic image optimization
  • Faster DNS (5-minute propagation vs. hours)

Setup (Free Plan):

  1. Add Domain to Cloudflare

  2. Configure DNS

    Type: A
    Name: app
    Content: [Hetzner App Server IP]
    Proxy status: Proxied (orange cloud)
    
    Type: A
    Name: api
    Content: [Hetzner App Server IP]
    Proxy status: Proxied
    
    Type: A
    Name: www
    Content: [Hetzner App Server IP]
    Proxy status: Proxied
  3. SSL/TLS Settings

    SSL/TLS → Overview → Full (strict)
    SSL/TLS → Edge Certificates → Always Use HTTPS: On
    SSL/TLS → Edge Certificates → Automatic HTTPS Rewrites: On
  4. Performance Settings

    Speed → Optimization → Brotli: On
    Speed → Optimization → Rocket Loader: Off (can break React)
    (Cloudflare retired Auto Minify in 2024; Next.js already minifies at build time)
  5. Caching Rules

    Caching → Configuration → Browser Cache TTL: 4 hours
    Caching → Configuration → Crawler Hints: On
    
    Page Rules:
    - app.your-domain.com/api/* → Cache Level: Bypass
    - app.your-domain.com/_next/static/* → Cache Level: Cache Everything, Edge TTL: 1 year

Expected Improvements:

  • 30-50% faster page loads (global users)
  • 40% bandwidth savings (compression + caching)
  • DDoS protection included

Image Optimization

Use Next.js Image Component:

// ❌ BAD: Regular img tag
<img src="/images/logo.png" alt="Logo" width="200" height="100" />

// ✅ GOOD: Next.js Image (automatic optimization)
import Image from "next/image";

<Image
  src="/images/logo.png"
  alt="Logo"
  width={200}
  height={100}
  priority  // Load above-the-fold images immediately
/>

// ✅ BETTER: Remote images with loader
<Image
  src="https://cdn.your-domain.com/logo.png"
  alt="Logo"
  width={200}
  height={100}
  loader={({ src, width, quality }) => {
    return `${src}?w=${width}&q=${quality || 75}`;
  }}
/>

Image Formats (next.config.ts already configured):

  • Serve AVIF for modern browsers (30% smaller than WebP)
  • Fallback to WebP for older browsers
  • Fallback to JPEG/PNG for ancient browsers

Server Resource Tuning

CPU Optimization

Monitor CPU Usage:

# Real-time CPU usage
top -bn1 | head -20

# Per-container CPU usage
docker stats --no-stream

# Netdata dashboard
curl "http://localhost:19999/api/v1/data?chart=system.cpu"

CPU Affinity for Critical Containers:

# Pin PostgreSQL to specific CPU cores (reduce context switching)
docker update --cpuset-cpus="0,1" ripplecore-postgres

# Pin Redis to different cores
docker update --cpuset-cpus="2,3" ripplecore-redis

When to Upgrade:

  • Sustained CPU >70% for 24+ hours
  • Frequent CPU throttling (check dmesg | grep throttle)
  • Response times degrading despite other optimizations

Vertical Scaling Path:

CPX31 (4 vCPU, 8GB) → CPX41 (8 vCPU, 16GB) → CPX51 (16 vCPU, 32GB)
(roughly €12-62/mo across this path; check Hetzner's current price list)

Memory Optimization

Monitor Memory Usage:

# System memory
free -h

# Container memory
docker stats --no-stream --format "table {{.Container}}\t{{.MemUsage}}\t{{.MemPerc}}"

# PostgreSQL on-disk database size (not memory)
docker exec ripplecore-postgres psql -U ripplecore -c "SELECT pg_size_pretty(pg_database_size('ripplecore'));"

Memory Limits for Containers:

# Set memory limits (prevent OOM killer)
docker update --memory="2g" --memory-reservation="1.5g" ripplecore-postgres
docker update --memory="1g" --memory-reservation="768m" ripplecore-redis
docker update --memory="3g" --memory-reservation="2g" ripplecore-app

Swap Configuration (avoid in production if possible):

# Check swap usage
swapon --show

# Reduce swappiness (use swap only when RAM nearly full)
echo "vm.swappiness=10" >> /etc/sysctl.conf
sysctl -p

Disk I/O Optimization

Monitor Disk Performance:

# Disk I/O stats
iostat -x 1 5

# Per-process disk usage
iotop -o

# Container disk I/O
docker stats --no-stream --format "table {{.Container}}\t{{.BlockIO}}"

Optimize PostgreSQL I/O:

-- Check table bloat (wasted space)
SELECT
  schemaname,
  relname,
  pg_size_pretty(pg_total_relation_size(relid)) AS size,
  n_dead_tup,
  last_vacuum,
  last_autovacuum
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- Manual VACUUM to reclaim space
VACUUM ANALYZE kindness;

-- Full VACUUM (requires table lock - off-peak hours only)
VACUUM FULL kindness;

SSD TRIM (Hetzner NVMe SSDs):

# Enable periodic TRIM
systemctl enable fstrim.timer
systemctl start fstrim.timer

# Manual TRIM
fstrim -v /

Monitoring & Profiling

Application Performance Monitoring (APM)

Sentry Performance Monitoring (already configured):

// Verify sample rate for production
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 0.1, // Sample 10% of transactions

  // Track slow database queries (v7 class-based API shown here; on
  // @sentry/node v8+ use Sentry.postgresIntegration() instead)
  integrations: [new Sentry.Integrations.Postgres()],
});

Key Metrics to Monitor:

  • Transaction duration (API response times)
  • Database query performance
  • Slow endpoints (>1s)
  • Error rates by endpoint
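
To surface a specific slow operation in those traces, wrap it in a custom span. A sketch assuming @sentry/node v8 (Sentry.startSpan is that version's API; listKindnessStats is a hypothetical stand-in for the real stats query):

import * as Sentry from "@sentry/node";

// Hypothetical helper standing in for the real stats query
declare function listKindnessStats(companyId: string): Promise<unknown>;

// The span appears as a child of the active transaction with its own
// duration, so slow stats queries become visible in Sentry's trace view.
export async function getKindnessStatsTraced(companyId: string) {
  return Sentry.startSpan(
    { name: "kindness.stats", op: "db.query", attributes: { companyId } },
    () => listKindnessStats(companyId),
  );
}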

Database Query Profiling

Enable pg_stat_statements:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 slowest queries (by total time)
SELECT
  substring(query, 1, 100) AS short_query,
  calls,
  total_exec_time,
  mean_exec_time,
  max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Reset statistics (after optimization)
SELECT pg_stat_statements_reset();

Performance Testing

Load Testing with k6 (advanced):

// load-test.js
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  stages: [
    { duration: "2m", target: 100 }, // Ramp up to 100 users
    { duration: "5m", target: 100 }, // Stay at 100 users
    { duration: "2m", target: 0 }, // Ramp down to 0
  ],
  thresholds: {
    http_req_duration: ["p(95)<200"], // 95% of requests < 200ms
    http_req_failed: ["rate<0.01"], // Error rate < 1%
  },
};

export default function () {
  const res = http.get("https://app.your-domain.com/api/health");

  check(res, {
    "status is 200": (r) => r.status === 200,
    "response time < 200ms": (r) => r.timings.duration < 200,
  });

  sleep(1);
}

Run Load Test:

# Install k6 (not in the default Ubuntu repos; add Grafana's k6 apt
# repository or download a release binary; see the k6 install docs)
sudo apt install k6

# Run test
k6 run load-test.js

# Expected output:
# ✓ 95% requests < 200ms
# ✓ Error rate < 1%

Performance Checklist

Quick Wins (1-2 hours)

  • Enable Redis caching for frequently accessed data
  • Add database indexes on filtered columns
  • Implement PgBouncer connection pooling
  • Enable Gzip/Brotli compression (Traefik)
  • Set appropriate cache headers on static assets (see the sketch after this list)
  • Use Next.js Image component for all images
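
For the cache-header item above, Next.js can set the headers itself. A minimal sketch using next.config.ts headers() (the /images path is an assumption; Next already serves hashed /_next/static assets with immutable headers, included here for completeness):

import type { NextConfig } from "next";

const config: NextConfig = {
  async headers() {
    return [
      {
        // Hashed build output never changes, so it can be cached "forever"
        source: "/_next/static/:path*",
        headers: [
          { key: "Cache-Control", value: "public, max-age=31536000, immutable" },
        ],
      },
      {
        // Public images: cache for a day, serve stale while revalidating
        source: "/images/:path*",
        headers: [
          {
            key: "Cache-Control",
            value: "public, max-age=86400, stale-while-revalidate=604800",
          },
        ],
      },
    ];
  },
};

export default config;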

Medium Effort (4-8 hours)

  • Tune PostgreSQL configuration for server RAM
  • Optimize slow queries identified by pg_stat_statements
  • Implement cache warming on application startup
  • Add Cloudflare CDN for global performance
  • Virtualize long lists (100+ items)
  • Set Docker container memory limits

Long-Term (Ongoing)

  • Monitor and optimize cache hit rate (target >80%)
  • Regular VACUUM of PostgreSQL tables
  • Quarterly performance load testing
  • Bundle size monitoring and optimization
  • Database query performance trending

Performance Targets Summary

Metric                      | Target          | Measurement
API response time (p95)     | <200ms          | Sentry APM, k6 load tests
Health endpoint             | <100ms          | UptimeRobot, curl timing
Page load (FCP)             | <1s             | Lighthouse, WebPageTest
Database queries (simple)   | <50ms           | pg_stat_statements
Database queries (complex)  | <200ms          | pg_stat_statements
Cache hit rate              | >80%            | Redis INFO stats
Uptime                      | 99.5%           | UptimeRobot dashboard
CPU usage                   | <70% sustained  | Netdata, top
RAM usage                   | <80% sustained  | Netdata, free -h
Error rate                  | <0.1%           | Sentry error tracking

Document Version: 1.0
Last Updated: 2025-01-23
Review Cycle: Quarterly, or after any performance degradation