Chaos faults for Cloud Foundry

Introduction

Cloud Foundry chaos faults disrupt the running state of an app, the JVM inside an app instance, or the network that connects an app to its dependencies. Use them to validate platform self-healing (CF restarting crashed instances), application resilience (timeouts, retries, fallbacks), and downstream behavior (consumer error rates, alert fidelity) on TAS, PCF, or open-source Cloud Foundry deployments.

All Cloud Foundry faults run from a Linux chaos infrastructure (LCI) that authenticates to the Cloud Foundry API and, for JVM and network faults, to the BOSH director that manages the Diego cells.

See Cloud Foundry chaos deployment for the supported LCI deployment models, and Cloud Foundry permissions for how to set up the required CF and BOSH credentials.

CF app stop

CF app stop stops a Cloud Foundry app for a configurable duration and then restarts it. Use it to validate how the platform, routers, and downstream consumers behave when an app goes offline cleanly.

Use cases
  • Validate consumer fallbacks and retries when the CF router returns 5xx for the stopped app.
  • Confirm CF restarts the app and reports it healthy after the configured duration elapses.
  • Tune alert thresholds around route-level health checks.
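
To make the stop, wait, restart sequence concrete, here is a minimal sketch of a manual equivalent that shells out to the `cf` CLI (`cf stop` and `cf start`). It is not how the fault itself is implemented, which this page does not describe. The function name `stop_then_start` and the injectable `run` and `sleep` parameters are assumptions made for illustration and testing.

```python
import subprocess
import time
from typing import Callable, Sequence


def _run_cli(cmd: Sequence[str]) -> None:
    # Assumes the cf CLI is installed and already logged in and targeted.
    subprocess.run(list(cmd), check=True)


def stop_then_start(
    app_name: str,
    duration_s: float,
    run: Callable[[Sequence[str]], None] = _run_cli,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Stop the app, keep it down for duration_s seconds, then start it again."""
    run(["cf", "stop", app_name])
    try:
        sleep(duration_s)
    finally:
        # Always attempt the revert, even if the wait is interrupted.
        run(["cf", "start", app_name])
```

The `try`/`finally` is the design point worth copying: an interrupted experiment should not leave the app stopped.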

CF app container kill

CF app container kill terminates the container of one or more app instances and lets Cloud Foundry reschedule them. Use it to validate platform self-healing and peer absorption.

Use cases
  • Confirm peer instances absorb traffic while the killed instance is rescheduled.
  • Validate Diego restarts the instance inside its expected window.
  • Verify in-flight requests fail cleanly so callers retry.
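
One way to check the "expected window" above is to poll the running instance count during the experiment and compute how long the app stayed below its desired count. The sketch below assumes you already have such samples as `(timestamp_seconds, running_instances)` pairs in time order; how they are collected is up to you, and the function name `recovery_seconds` is hypothetical.

```python
from typing import Iterable, Optional, Tuple


def recovery_seconds(
    samples: Iterable[Tuple[float, int]], desired: int
) -> Optional[float]:
    """Return the seconds between the first sample below `desired` and the
    next sample back at `desired` or above.

    Returns None if the count never dropped or never recovered.
    """
    degraded_at: Optional[float] = None
    for timestamp, running in samples:
        if degraded_at is None:
            if running < desired:
                degraded_at = timestamp
        elif running >= desired:
            return timestamp - degraded_at
    return None
```

The result is only as precise as the polling interval, so poll noticeably faster than the window you are trying to verify.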

CF app route unmap

CF app route unmap detaches a specific route from an app for a configurable duration, then remaps it. The app keeps running; only its inbound route is disrupted.

Use cases
  • Validate gateway and consumer behavior when a route returns 404 from the CF router.
  • Confirm secondary routes mapped to the same app continue serving.
  • Practice runbooks for accidental route removal.

CF app JVM CPU stress

CF app JVM CPU stress drives high CPU usage inside the JVM of a Java app instance for a configurable duration. Use it to test the app and autoscaler under sustained CPU pressure.

Use cases
  • Measure latency under CPU saturation.
  • Validate autoscaling rules trigger and pull traffic away from the stressed instance.
  • Surface thread-pool and concurrency bugs.

CF app JVM memory stress

CF app JVM memory stress drives sustained heap or non-heap memory pressure inside the JVM of a Java app instance. Use it to test GC behavior and OutOfMemoryError handling.

Use cases
  • Surface long GC pauses under heap pressure.
  • Detect runaway metaspace consumption with non-heap pressure.
  • Validate memory-based autoscaling.

CF app JVM trigger GC

CF app JVM trigger GC forces a full garbage-collection cycle inside the JVM of a Java app instance. Use it to measure stop-the-world pause times and validate that health checks tolerate them.

Use cases
  • Quantify worst-case GC pause time.
  • Confirm liveness/readiness probes do not falsely fail during a full GC.
  • Measure the tail-latency cost of a GC under production-shaped traffic.

CF app JVM method exception

CF app JVM method exception forces a specific JVM method to throw a configurable exception. Use it to validate error-handling paths, retry budgets, and circuit-breaker behavior.

Use cases
  • Confirm catch blocks map the exception to the right user-visible response.
  • Validate retry and circuit-breaker thresholds.
  • Test observability tags for the configured exception class.
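
For readers unfamiliar with the circuit-breaker behavior this fault is meant to exercise, the following is a minimal, language-neutral sketch written in Python. It is an illustration of the pattern only, not the API of any particular resilience library; the class names, the `failure_threshold` and `reset_after_s` parameters, and the injectable `clock` are all assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without invoking the target."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int,
        reset_after_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        # While open, fail fast until the reset interval has passed.
        if self.opened_at is not None and self.clock() - self.opened_at < self.reset_after_s:
            raise CircuitOpenError("circuit open")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                # Open (or re-open after a failed trial call).
                self.opened_at = self.clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

With a threshold of 2, two consecutive injected exceptions open the breaker, later calls are rejected without reaching the faulty method, and the first successful call after `reset_after_s` closes it again.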

CF app JVM method latency

CF app JVM method latency adds artificial delay to every invocation of a specific JVM method. Use it to simulate a slow downstream call at the method boundary.

Use cases
  • Tune timeouts to trip before the user-visible SLO is breached.
  • Quantify the latency contribution of a single slow method.
  • Validate caller retry budgets do not amplify the slowdown.
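
The timeout and retry-budget checks above come down to simple arithmetic that is worth doing before you run the fault. The sketch below assumes a fixed per-attempt timeout and a fixed backoff between attempts; the function names and parameters are hypothetical and your client library may behave differently.

```python
import math
from typing import Sequence


def worst_case_latency_s(
    attempt_timeout_s: float, max_retries: int, backoff_s: float = 0.0
) -> float:
    """Upper bound on caller-visible latency when every attempt times out."""
    attempts = max_retries + 1
    return attempts * attempt_timeout_s + max_retries * backoff_s


def fits_slo(
    attempt_timeout_s: float, max_retries: int, backoff_s: float, slo_s: float
) -> bool:
    """True if the worst case still completes within the SLO."""
    return worst_case_latency_s(attempt_timeout_s, max_retries, backoff_s) <= slo_s


def call_amplification(retries_per_layer: Sequence[int]) -> int:
    """Worst-case calls reaching the slow method when each layer retries."""
    return math.prod(r + 1 for r in retries_per_layer)
```

For example, two layers that each retry twice can turn one user request into up to nine invocations of the delayed method, which is the amplification the last use case asks you to rule out.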

CF app JVM modify return

CF app JVM modify return overrides the return value of a specific JVM method. Use it to test defensive checks and fallbacks against unexpected values like null or wrong-type returns.

Use cases
  • Validate null-safety in callers of a non-null method.
  • Force a feature flag off by overriding its accessor.
  • Simulate a poisoned cache layer.

CF app network latency

CF app network latency adds a configurable amount of latency (with optional jitter) on the egress traffic of an app instance. Use it to simulate a slow downstream dependency at the network layer.

Use cases
  • Simulate a slow database or third-party API.
  • Quantify how added round-trip time affects user-visible P99.
  • Approximate cross-region latency to test caching decisions.
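
To quantify the effect on P99, compare the same percentile over latency samples taken before and during the fault. The sketch below uses the nearest-rank definition of a percentile, which is one common convention and may differ slightly from what your monitoring system reports; the function name `percentile` is an assumption.

```python
import math
from typing import Iterable


def percentile(values: Iterable[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]
```

Calling `percentile(baseline_ms, 99)` and `percentile(during_fault_ms, 99)` on two sample sets of similar size gives a like-for-like view of how much of the injected latency reaches users.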

CF app network loss

CF app network loss drops a configurable percentage of egress packets from an app instance. Use it to test retransmissions, timeouts, and circuit-breaker behavior under packet loss.

Use cases
  • Validate the app's retry budget on a flaky downstream.
  • Confirm circuit breakers open at the configured loss rate.
  • Test alert tuning for elevated TCP retransmissions.
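
The retry budget mentioned above is typically a bounded number of attempts with growing delays between them. Here is a minimal sketch of bounded retries with exponential backoff, assuming any exception counts as a retryable failure; real clients usually retry only on specific errors and add jitter. The function name and parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int,
    base_delay_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn up to max_attempts times, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                # Budget exhausted: surface the last error to the caller.
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Under the network loss fault, the interesting measurements are how often the budget is exhausted and how much extra latency the backoff adds for requests that eventually succeed.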

CF app network corruption

CF app network corruption corrupts a configurable percentage of egress packets from an app instance. Corrupted packets are discarded by the receiver, triggering retransmissions.

Use cases
  • Quantify retransmission overhead on end-to-end latency.
  • Test protocol parsers reject malformed frames cleanly.
  • Validate L4 monitoring detects elevated retransmission rates.

CF app network duplication

CF app network duplication duplicates a configurable percentage of egress packets from an app instance. Use it to validate idempotency assumptions and deduplication logic.

Use cases
  • Confirm duplicate HTTP requests do not cause double writes.
  • Validate deduplication keys on message-bus consumers.
  • Test that UDP receivers tolerate duplicate datagrams.
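
The deduplication these use cases rely on usually takes the form of an idempotency key: the receiver remembers which keys it has already processed and returns the stored result instead of writing again. The following is a minimal in-memory sketch under that assumption; the class name, the key scheme, and the result shape are hypothetical, and a real service would keep the key store in shared, durable storage.

```python
from typing import Any, Dict, List


class IdempotentWriter:
    """Applies each write at most once per idempotency key."""

    def __init__(self) -> None:
        self._results: Dict[str, Dict[str, Any]] = {}
        self.writes: List[Any] = []

    def handle(self, idempotency_key: str, payload: Any) -> Dict[str, Any]:
        # A repeated key returns the original result without a second write.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.writes.append(payload)
        result = {"status": "created", "id": len(self.writes)}
        self._results[idempotency_key] = result
        return result
```

Sending the same key twice yields one entry in `writes` and the same response both times, which is the property to assert while the duplication fault is active.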