Incident response

When something goes wrong, here is what happens.

We monitor the service through:

  • Uptime checks: External monitoring via Updown.io, checking the web interface, Git API, and CI runner API. Alerts fire on downtime.
  • Infrastructure metrics: Scaleway observability for compute, database, and storage. Alerts fire on resource exhaustion, error-rate spikes, and elevated latency.
  • Log analysis: Server logs retained for 30 days. Reviewed for anomalies.
  • User reports: Email to security@codebahn.net or via in-app support chat.
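
To illustrate what an external uptime check does, here is a minimal Python sketch. It is not how Updown.io works internally and not our production configuration: the endpoint URLs, the `http_status` helper, and the "anything outside 2xx counts as down" rule are all assumptions made for the example.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List

# Hypothetical endpoints: the real monitored URLs are not listed on this page.
ENDPOINTS: Dict[str, str] = {
    "web": "https://example.invalid/",
    "git_api": "https://example.invalid/api/git",
    "ci_runner_api": "https://example.invalid/api/runners",
}


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0


def failing_endpoints(
    endpoints: Dict[str, str],
    fetch: Callable[[str], int] = http_status,
) -> List[str]:
    """Names of endpoints whose status is not 2xx (assumed definition of 'down')."""
    return [name for name, url in endpoints.items() if not 200 <= fetch(url) < 300]
```

Passing `fetch` as a parameter keeps the check testable without network access; an alert would fire whenever `failing_endpoints(ENDPOINTS)` returns a non-empty list.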

Incidents are classified by severity:

Severity | Definition | Examples
Critical | Data breach, unauthorized access to customer data, or complete service outage | Cross-tenant data leak, database compromise, full outage
High | Partial service degradation affecting multiple customers, or a vulnerability with high exploit potential | CI runners down, Git push failures, authentication bypass
Medium | Limited impact, single-customer issue, or a vulnerability requiring specific conditions | Slow API responses, backup verification failure, low-severity vulnerability
Low | Cosmetic, informational, or no direct user impact | Logging gap, documentation error, non-exploitable misconfiguration

Severity | Response time | Communication
Critical | Immediate (within 1 hour) | Email to affected customers, status page update
High | Within 4 hours | Status page update, email if customer-facing
Medium | Within 1 business day | Status page if user-visible
Low | Within 5 business days | No external communication unless relevant
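
The response times per severity can be written as a small lookup. The Python sketch below is illustrative only: the `Severity` enum, the function names, and the definition of a business day as Monday to Friday with public holidays ignored are assumptions, not part of the policy.

```python
from datetime import datetime, timedelta
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturdays and Sundays.

    Assumption: public holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def response_deadline(severity: Severity, detected_at: datetime) -> datetime:
    """Latest time by which a response is due for an incident detected at detected_at."""
    if severity is Severity.CRITICAL:
        return detected_at + timedelta(hours=1)
    if severity is Severity.HIGH:
        return detected_at + timedelta(hours=4)
    if severity is Severity.MEDIUM:
        return add_business_days(detected_at, 1)
    return add_business_days(detected_at, 5)
```

For example, under these assumptions a Medium incident detected on a Friday afternoon is due at the same time the following Monday, because the weekend does not count.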

For incidents involving personal data, we notify affected data controllers within 48 hours per our Data Processing Agreement and GDPR Article 33.
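
As a worked example of the 48-hour window, the Python sketch below computes the notification deadline. It assumes the clock starts when we become aware of the incident, which this page does not state explicitly; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# 48-hour window from the paragraph above.
CONTROLLER_NOTIFICATION_WINDOW = timedelta(hours=48)


def controller_notification_deadline(aware_at: datetime) -> datetime:
    """Deadline (in UTC) for notifying affected data controllers.

    Assumption: the window starts at aware_at, the moment the incident
    became known. Naive datetimes are rejected to avoid timezone ambiguity.
    """
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    return aware_at.astimezone(timezone.utc) + CONTROLLER_NOTIFICATION_WINDOW


def is_notification_overdue(aware_at: datetime, now: datetime) -> bool:
    """True if now is past the notification deadline."""
    return now > controller_notification_deadline(aware_at)
```

Working in UTC avoids off-by-one-hour errors around daylight saving changes when the deadline is later displayed in local time.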

  • Status page: status.codebahn.net for real-time service status and incident updates.
  • Email: Direct notification to affected customers for Critical incidents, and for High incidents that are customer-facing.
  • In-app: Support chat for individual follow-up.

We do not use social media for incident communication.

Every Critical and High incident gets a post-incident review within 5 business days. The review covers:

  1. Timeline: What happened and when.
  2. Root cause: Why it happened.
  3. Impact: What was affected and for how long.
  4. Response: What we did and how fast.
  5. Prevention: What changes prevent recurrence.

We publish a summary for incidents with broad user impact. The summary includes the timeline, root cause, and prevention measures. It does not include internal operational details that could aid future attacks.

If recovery from backup is needed:

  • Daily encrypted backups are available, stored on a separate provider in a separate EU region.
  • Backups are verified weekly with tested restore procedures.
  • Recovery target: restore service from backup within hours, not days.

For the full backup details, see the security overview.