Infrastructure Overview

Complete CI/CD infrastructure specification for production deployment on Hetzner Cloud VPS

RippleCore Infrastructure Documentation


Overview

This documentation describes a production-ready CI/CD pipeline and infrastructure architecture for deploying RippleCore (and similar multi-app monorepos) on Hetzner Cloud VPS servers.

Target Audience: DevOps engineers, system administrators, technical leads
Deployment Model: Service-oriented architecture on dedicated VPS servers
Cloud Provider: Hetzner Cloud (Germany - EU data residency)
Total Monthly Cost: €35-60/month (~$38-65 USD)


Documentation Structure

πŸ“ Core Architecture

Architecture - Infrastructure Architecture Specification

  • Complete server infrastructure design (4 VPS servers)
  • Network architecture with security groups and private networking
  • Technology stack and tool justification
  • Scaling strategy (vertical → horizontal)
  • Cost analysis and 3-year growth projections
  • Time to Read: 20 minutes

🔄 CI/CD Pipeline

CI/CD Pipeline - CI/CD Setup & Workflow Configuration

  • GitHub Actions workflow (testing, building, security scanning)
  • Dokploy configuration (deployment automation)
  • Environment strategy (dev/preview/staging/production)
  • Deployment workflows with health checks and auto-rollback
  • Preview environments for pull requests
  • Time to Implement: 12-16 hours (Week 2 of roadmap)
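The "health checks and auto-rollback" behaviour listed above can be pictured as a small control loop. The sketch below is illustrative only: it is not Dokploy's implementation, and `deploy`, `is_healthy`, and `rollback` are hypothetical callables standing in for whatever the pipeline actually invokes. The retry count and delay are likewise assumed defaults.

```python
import time
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[], None],
    is_healthy: Callable[[], bool],
    rollback: Callable[[], None],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Deploy, poll a health check, and roll back if it never passes.

    The three callables are hypothetical placeholders (assumptions),
    not part of any real Dokploy or GitHub Actions API.
    Returns True if the release became healthy, False if it was rolled back.
    """
    deploy()
    for attempt in range(attempts):
        if is_healthy():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next probe
    rollback()  # every probe failed, so restore the previous release
    return False
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real waiting.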

📊 Monitoring & Alerting

Monitoring - Monitoring & Alerting Setup Guide

  • Netdata setup and custom alert configuration
  • UptimeRobot configuration with 6 monitors
  • Sentry integration for error tracking
  • Alert routing matrix (Slack, email, SMS)
  • Dashboard configuration and log management
  • Time to Implement: 8-10 hours (Week 3 of roadmap)

💾 Backup & Recovery

Backup & Recovery - Backup & Disaster Recovery Guide

  • Automated backup system (PostgreSQL + Redis)
  • Grandfather-Father-Son retention strategy
  • Restore procedures (full, partial, selective)
  • Disaster recovery scenarios with step-by-step runbooks
  • Weekly automated backup validation
  • Time to Implement: 6-8 hours (Week 4 of roadmap)

✅ Deployment Checklist

Deployment Checklist - Pre-Launch Verification Checklist

  • 7-phase deployment checklist (infrastructure → launch)
  • Security and compliance verification
  • Performance and load testing procedures
  • User acceptance testing guidelines
  • Team readiness and sign-off procedures
  • Time to Complete: 4-6 hours (first-time setup)

Supporting Files

🤖 Automation Scripts

scripts/backup-db.sh - Automated PostgreSQL Backup

  • Daily backups to Hetzner Object Storage (S3-compatible)
  • Grandfather-Father-Son retention (7 days, 4 weeks, 12 months)
  • Checksum verification and compression
  • Slack notifications on success/failure
  • Deploy to: Database server /root/scripts/backup-db.sh
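To make the Grandfather-Father-Son retention (7 days, 4 weeks, 12 months) concrete, here is a minimal sketch of how a pruning step might decide which backups to keep. It is not taken from `backup-db.sh`. The choice of Sunday as the weekly backup and the first of the month as the monthly backup is an assumption; the source only gives the counts.

```python
from datetime import date, timedelta


def months_between(older: date, newer: date) -> int:
    """Whole calendar months from `older` to `newer` (day of month ignored)."""
    return (newer.year - older.year) * 12 + (newer.month - older.month)


def gfs_keep(backup_dates, today, daily=7, weekly=4, monthly=12):
    """Return the backup dates a GFS policy would retain.

    Assumptions (not from the source): weekly backups are those taken
    on Sunday, monthly backups are those taken on the 1st of the month.
    """
    keep = set()
    for d in backup_dates:
        age_days = (today - d).days
        if age_days < 0:
            continue  # ignore dates in the future
        if age_days < daily:
            keep.add(d)  # "son": every backup from the last 7 days
        elif d.weekday() == 6 and age_days < weekly * 7:
            keep.add(d)  # "father": Sunday backups from the last 4 weeks
        elif d.day == 1 and months_between(d, today) < monthly:
            keep.add(d)  # "grandfather": 1st-of-month backups, last 12 months
    return keep


# Usage: 400 consecutive daily backups ending on 2025-03-15.
today = date(2025, 3, 15)
backups = [today - timedelta(days=n) for n in range(400)]
retained = gfs_keep(backups, today)
```

Anything not in `retained` would be a candidate for deletion from object storage.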

scripts/test-restore.sh - Weekly Backup Validation

  • Non-destructive restore testing
  • Data integrity verification
  • Record count comparison with production
  • Automated Slack notifications
  • Deploy to: Database server /root/scripts/test-restore.sh
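The "record count comparison with production" step could look roughly like the sketch below. It is a hedged illustration, not the logic of `test-restore.sh`: the table names in the usage example are hypothetical, and the 1% tolerance is an assumption made because a restored backup is normally a few hours older than live production data.

```python
def compare_record_counts(production, restored, tolerance=0.01):
    """Return tables whose restored row count looks wrong.

    `production` and `restored` map table name -> row count.
    A table is flagged if it is missing from the restore or if its
    count differs from production by more than `tolerance` (a fraction).
    The tolerance value is an assumption, not taken from the source.
    """
    mismatches = {}
    for table, prod_count in production.items():
        restored_count = restored.get(table)
        if restored_count is None:
            mismatches[table] = (prod_count, None)  # table missing from restore
            continue
        allowed = prod_count * tolerance
        if abs(prod_count - restored_count) > allowed:
            mismatches[table] = (prod_count, restored_count)
    return mismatches


# Usage with hypothetical table names and counts.
production = {"users": 1000, "orders": 5000, "sessions": 200}
restored = {"users": 995, "orders": 4000}
problems = compare_record_counts(production, restored)
```

An empty result would indicate that the restore passed this check; a non-empty one is what a Slack notification would report.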

📖 Runbooks

../runbooks/disaster-recovery.mdx - Disaster Recovery Runbook

  • Emergency contact information
  • Step-by-step recovery procedures for 3 scenarios:
    1. Complete server failure (RTO: 2 hours)
    2. Database corruption (RTO: 1 hour)
    3. Accidental data deletion (RTO: 30 minutes)
  • Post-recovery checklist
  • Quarterly DR drill schedule

Advanced Guides

⚡ Performance Optimization

Performance Optimization - Production Performance Tuning

  • PostgreSQL configuration tuning (4GB RAM optimization)
  • Redis cache optimization and monitoring
  • Database query optimization and indexing strategies
  • Application layer optimization (Next.js, React)
  • Network & CDN optimization (Cloudflare integration)
  • Server resource tuning (CPU, RAM, disk I/O)
  • Performance testing and profiling
  • Target: <200ms API response times, 99.5% uptime

🔒 Security Hardening

Security Hardening - Advanced Security Measures

  • Server hardening (kernel, SSH, fail2ban, Docker)
  • Network security (firewall, DDoS protection, segmentation)
  • Application security (CSP, input validation, authentication)
  • Database security (SSL/TLS, audit logging, access control)
  • Secrets management (1Password CLI, rotation)
  • Security monitoring and incident response
  • GDPR compliance checklist

☸️ Kubernetes Migration

Kubernetes Migration - Future Scaling Path

  • Migration decision framework (when to migrate)
  • Kubernetes platform options (self-managed vs. managed)
  • Application containerization for K8s
  • Kubernetes manifests (deployments, services, HPA)
  • Blue-green migration strategy
  • Post-migration optimization and cost reduction
  • Recommended: Migrate when >10 apps OR >100K users
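The recommendation above is a simple threshold rule. As a sketch, it can be written as a one-line predicate; the function name is hypothetical, and the full decision framework in the Kubernetes Migration guide presumably weighs more factors than these two numbers.

```python
def should_consider_kubernetes(app_count: int, user_count: int) -> bool:
    """Apply the stated rule of thumb: migrate when >10 apps OR >100K users."""
    return app_count > 10 or user_count > 100_000
```

For example, 3 apps with 50,000 users stays on the VPS setup, while 11 apps or 100,001 users crosses the threshold.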

Quick Start

For Immediate Implementation

If you're ready to deploy now, follow this sequence:

  1. Week 1 (8-12 hours): Infrastructure Foundation

    • Read: Architecture
    • Provision Hetzner servers (4x VPS)
    • Configure networking and firewall
    • Deploy PostgreSQL + Redis
    • Deploy applications manually
  2. Week 2 (12-16 hours): CI/CD Automation

    • Read: CI/CD Pipeline
    • Set up GitHub Actions workflows
    • Configure Dokploy deployments
    • Test preview environments
  3. Week 3 (8-10 hours): Monitoring & Observability

    • Read: Monitoring
    • Install Netdata on all servers
    • Configure UptimeRobot monitors
    • Set up Slack alerting
  4. Week 4 (6-8 hours): Backup & DR

    • Read: Backup & Recovery
    • Deploy backup automation scripts
    • Test restore procedures
    • Schedule weekly validation
  5. Pre-Launch (4-6 hours): Verification

    • Complete: Deployment Checklist
    • Run verification script
    • Conduct final UAT
    • Get sign-off from stakeholders

Total Time to Production: 38-52 hours over 4-5 weeks
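The total is the sum of the low and high estimates of each phase. A short check of that arithmetic, using only the hour ranges listed in the steps above:

```python
# (low, high) hour estimates per phase, copied from the sequence above.
PHASE_HOURS = {
    "Week 1: Infrastructure Foundation": (8, 12),
    "Week 2: CI/CD Automation": (12, 16),
    "Week 3: Monitoring & Observability": (8, 10),
    "Week 4: Backup & DR": (6, 8),
    "Pre-Launch: Verification": (4, 6),
}


def total_hours(phases):
    """Sum the low and high ends of each phase's estimate."""
    low = sum(lo for lo, _ in phases.values())
    high = sum(hi for _, hi in phases.values())
    return low, high


print(total_hours(PHASE_HOURS))  # (38, 52)
```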


For Initial Assessment

If you're evaluating this approach, start here:

  1. Read Architecture (20 minutes)

  2. Review Deployment Checklist (15 minutes)

  3. Evaluate CI/CD Pipeline (20 minutes)

Total Assessment Time: ~1 hour


Key Features

✅ Production-Ready

  • Zero-downtime deployments with rolling updates
  • Automated health checks and rollback
  • Comprehensive monitoring and alerting
  • Tested disaster recovery procedures

💰 Cost-Effective

  • €35-60/month total infrastructure cost
  • 87% cheaper than equivalent AWS infrastructure
  • No vendor lock-in (easily migrate to other providers)

🔒 Secure by Default

  • Private database network (no public exposure)
  • Security headers (CSP, HSTS, X-Frame-Options)
  • Rate limiting and DDoS protection
  • Automated SSL/TLS certificates

📈 Scalable

  • Clear vertical scaling path (resize servers)
  • Horizontal scaling strategy documented
  • Load balancer integration ready
  • Database read replica support

πŸ›‘οΈ Resilient

  • Daily automated backups with validation
  • 2-hour recovery time objective (RTO)
  • 24-hour recovery point objective (RPO)
  • Quarterly disaster recovery drills
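A 24-hour RPO with daily backups means the newest backup should never be more than 24 hours old. The following sketch shows one way a monitoring job might test that; the function name is hypothetical and nothing here is taken from the actual backup scripts.

```python
from datetime import datetime, timedelta, timezone


def rpo_breached(last_backup: datetime, now: datetime,
                 rpo: timedelta = timedelta(hours=24)) -> bool:
    """Return True if the newest backup is older than the RPO allows."""
    return now - last_backup > rpo


# Usage: a backup taken at 03:00 UTC, checked 23 and 25 hours later.
last = datetime(2025, 1, 23, 3, 0, tzinfo=timezone.utc)
within_rpo = rpo_breached(last, datetime(2025, 1, 24, 2, 0, tzinfo=timezone.utc))
beyond_rpo = rpo_breached(last, datetime(2025, 1, 24, 4, 0, tzinfo=timezone.utc))
```

Here `within_rpo` is False and `beyond_rpo` is True, so a single missed nightly backup would trip the check shortly after the next one was due.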

Technology Stack

Infrastructure Layer

  • Cloud Provider: Hetzner Cloud (EU data residency)
  • Operating System: Ubuntu 24.04 LTS
  • Container Runtime: Docker 27.x
  • Reverse Proxy: Traefik 3.x
  • Deployment Platform: Dokploy (self-hosted)

Application Layer

  • Framework: Next.js 16 + React 19
  • Database: PostgreSQL 18
  • Cache: Redis 7
  • ORM: Drizzle (type-safe)
  • Auth: better-auth

CI/CD Layer

  • CI Platform: GitHub Actions (free tier)
  • CD Platform: Dokploy (self-hosted)
  • Security Scanning: Trivy (container vulnerabilities)
  • Artifact Registry: GitHub Container Registry

Monitoring Layer

  • Infrastructure: Netdata (real-time metrics)
  • Uptime: UptimeRobot (health checks)
  • Errors: Sentry (application errors)
  • Logs: Docker logs with rotation

Infrastructure Costs

Monthly Breakdown

| Component | Specification | Monthly | Annual |
|---|---|---|---|
| Production App | CPX32 (4 vCPU, 8GB) | €11.99 | €143.88 |
| Production DB | CPX22 (3 vCPU, 4GB) | €8.49 | €101.88 |
| CI/CD Server | CPX11 (2 vCPU, 2GB) | €4.15 | €49.80 |
| Staging Server | CPX22 (3 vCPU, 4GB) | €8.49 | €101.88 |
| Object Storage | 50GB backups | €0.25 | €3.00 |
| Floating IPs | 2x static IPs | €2.34 | €28.08 |

TOTAL: €35.71 per month, €428.52 per year
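The totals can be reproduced from the per-component monthly prices in the breakdown. This sketch uses `Decimal` to avoid floating-point rounding on currency; the prices are the ones stated in this document and may not match current Hetzner pricing.

```python
from decimal import Decimal

# Monthly prices in EUR, copied from the breakdown above.
MONTHLY_EUR = {
    "Production App (CPX32)": Decimal("11.99"),
    "Production DB (CPX22)": Decimal("8.49"),
    "CI/CD Server (CPX11)": Decimal("4.15"),
    "Staging Server (CPX22)": Decimal("8.49"),
    "Object Storage (50GB)": Decimal("0.25"),
    "Floating IPs (2x)": Decimal("2.34"),
}


def cost_totals(monthly):
    """Return (monthly_total, annual_total) for a price table."""
    monthly_total = sum(monthly.values(), Decimal("0"))
    return monthly_total, monthly_total * 12


monthly_total, annual_total = cost_totals(MONTHLY_EUR)
print(monthly_total, annual_total)  # 35.71 428.52
```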

External Services (optional):

  • Netdata Cloud: Free (fewer than 5 nodes)
  • UptimeRobot: Free (50 monitors)
  • Sentry: Free tier or €26/mo
  • GitHub Actions: Free (2,000 min/mo)

Grand Total: €35-60/month depending on usage


Performance Targets

Response Time SLAs

  • Health Endpoints: <100ms
  • API Endpoints: <200ms (PRD requirement)
  • Page Load Time: <1s

Availability SLAs

  • Uptime: 99.5% (up to 3.6 hours of downtime per 30-day month)
  • RTO: 2 hours (complete recovery)
  • RPO: 24 hours (daily backups)
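The downtime allowance follows directly from the uptime percentage. A small sketch of the conversion, assuming a 30-day month (the month length is an assumption; the source does not state it):

```python
def downtime_budget_hours(uptime_percent: float, days: int = 30) -> float:
    """Hours of downtime allowed per period at a given uptime target."""
    return (100 - uptime_percent) / 100 * days * 24


budget = downtime_budget_hours(99.5)  # roughly 3.6 hours for a 30-day month
```

The same function shows how quickly the budget shrinks: a 99.9% target over the same period allows well under one hour.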

Capacity

  • Concurrent Users: 1K-50K (medium scale)
  • Applications: 3-10 apps
  • Database Size: Up to 100GB (can scale)

Support & Maintenance

Regular Maintenance Tasks

Daily (automated):

  • Database backups (3 AM UTC)
  • Backup verification (6 AM UTC)
  • Health check monitoring (continuous)

Weekly (automated):

  • Backup restore testing (Sundays 4 AM)
  • Dokploy configuration backup (Sundays 5 AM)
  • Security updates review

Monthly (manual):

  • Review monitoring dashboards
  • Analyze error trends (Sentry)
  • Review backup storage costs
  • Update documentation

Quarterly (manual):

  • Disaster recovery drill
  • Security audit
  • Performance benchmarking
  • Cost optimization review

Troubleshooting

Common Issues

Issue: Deployment Failing

Issue: High CPU/RAM Usage

  • Check: Monitoring - Dashboards
  • Analyze: Netdata metrics, identify bottleneck
  • Action: Vertical scaling or optimization

Issue: Backup Failures

Issue: Database Connectivity


Migration Guides

Migrating from Existing Infrastructure

From Vercel + PlanetScale/Supabase:

  1. Export database from PlanetScale/Supabase
  2. Import to PostgreSQL on Hetzner
  3. Update DATABASE_URL in applications
  4. Deploy applications to Dokploy
  5. Update DNS to point to Hetzner servers
  6. Verify functionality, then decommission old infrastructure

From AWS/GCP/Azure:

  1. Provision Hetzner infrastructure (parallel to existing)
  2. Set up replication from existing DB to Hetzner DB
  3. Deploy applications to Dokploy (blue-green deployment)
  4. Switch DNS to Hetzner (with rollback plan)
  5. Monitor for 48 hours, then decommission old infrastructure

Estimated Migration Time: 1-2 weeks depending on data size


Future Enhancements

Potential Improvements (Not Required for MVP)

Infrastructure:

  • Multi-region deployment (EU + US for lower latency)
  • Kubernetes migration (for >10 apps, >100K users)
  • Managed database (Aiven, Neon) for hands-off scaling
  • CDN integration (Cloudflare, Bunny CDN)

CI/CD:

  • Canary deployments (gradual rollout)
  • Feature flags integration (LaunchDarkly, Flagsmith)
  • Automated performance regression testing
  • Blue-green deployment strategy

Monitoring:

  • Grafana + Loki for advanced log querying
  • Prometheus for custom metrics
  • Distributed tracing (Jaeger, Tempo)
  • APM integration (DataDog, New Relic)

Backup & DR:

  • Point-in-time recovery (WAL archiving)
  • Encrypted backups (GPG)
  • Multi-region backup replication
  • Hourly backups (reduce RPO to 1 hour)

Contributing to Documentation

Document Maintenance:

  • Review after each major deployment
  • Update after infrastructure changes
  • Incorporate lessons learned from incidents
  • Keep examples and commands current

Improvement Process:

  1. Identify gaps or outdated information
  2. Create PR with proposed changes
  3. Review with team
  4. Update changelog
  5. Notify team of changes

Contact & Support

Documentation Owner: DevOps Team
Last Major Update: 2025-01-23
Next Review: Quarterly or after major changes

Internal Support:

External Support:


Changelog

Version 1.1 (2025-01-23):

  • Added Performance Optimization Guide (database tuning, caching, CDN)
  • Added Security Hardening Guide (advanced security measures, GDPR compliance)
  • Added Kubernetes Migration Guide (future scaling path with K8s)
  • Enhanced README with advanced guides section

Version 1.0 (2025-01-23):

  • Initial documentation release
  • Complete infrastructure specification
  • CI/CD pipeline implementation guide
  • Monitoring and alerting setup
  • Backup and disaster recovery procedures
  • Deployment checklist and verification
  • Automation scripts (backup, restore testing)

Next Version (TBD):

  • Multi-region deployment guide (EU + US datacenters)
  • Cost optimization case studies with real metrics
  • Advanced monitoring with Grafana + Loki
  • Service mesh implementation (Istio/Linkerd)

Happy Deploying! 🚀

For questions or feedback, reach out to the DevOps team.