Configuration

High Availability Configuration

All high availability configuration is contained in the ha section, which can be added to any JSLEE service:

{
    "ha": {
      "circuit-breakers": [
      ],
      "routing": [
      ]
    }
}

Circuit Breaker Templates

{
  "name": "template name",
  "failures-before-open": 5,
  "half-open-delay-ms": 10000,
  "failure-count-rolling-window-ms": 10000,
  "maximum-retries": 0,
  "retry-delay-ms": [],
  "on-failure": {
  }
}

Each circuit breaker template is configured with the following options:

| Field | Type | Required? | Default | Description |
| --- | --- | --- | --- | --- |
| name | String | Yes | - | The name of the circuit breaker template. Routes use this name to refer to the template. |
| failures-before-open | Integer | No | 5 | The number of failures to allow within the rolling window before the circuit breaker is set to the OPEN state. |
| half-open-delay-ms | Integer | No | 30000 | The number of milliseconds after entering the OPEN state before entering the HALF_OPEN state. |
| failure-count-rolling-window-ms | Integer | No | 10000 | The number of milliseconds to use as the rolling window in which failures are counted. Failures older than this many milliseconds are ignored. |
| maximum-retries | Integer | No | 0 | The number of times to retry a request before considering the request failed. The default of 0 tells the circuit breaker not to retry at all, so any message that fails with a temporary error is immediately considered failed. |
| retry-delay-ms | Varies | No | null | Refer to Retry Delay. |
| on-failure | Object | No | (empty) | Refer to On Failure. |

If maximum-retries is set to 0, retry-delay-ms is not relevant, because no retries are attempted. Likewise, if retry-delay-ms is not configured, maximum-retries is ignored. If on-failure is not configured, no failure handling occurs and all failures are immediately propagated back to the source of the message within the service.

Note that all configuration options can be overridden on a per-route basis.
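
The interaction between failures-before-open, failure-count-rolling-window-ms and half-open-delay-ms can be sketched as a small model. This is an illustration only, not the platform's implementation. It assumes a CLOSED state (not named above) for a breaker that has not tripped, assumes the breaker opens once the failure count within the window reaches failures-before-open, and does not model what happens to traffic while in HALF_OPEN. The class and method names are hypothetical.

```python
from collections import deque


class BreakerSketch:
    """Illustrative model of the template options; not the platform's code."""

    def __init__(self, failures_before_open=5, half_open_delay_ms=30000,
                 failure_count_rolling_window_ms=10000):
        self.failures_before_open = failures_before_open
        self.half_open_delay_ms = half_open_delay_ms
        self.window_ms = failure_count_rolling_window_ms
        self.failures = deque()  # timestamps (ms) of failures inside the window
        self.opened_at = None    # time (ms) the breaker entered OPEN, if it has

    def record_failure(self, now_ms):
        self.failures.append(now_ms)
        # Failures older than the rolling window are ignored.
        while self.failures and now_ms - self.failures[0] > self.window_ms:
            self.failures.popleft()
        # Assumption: the breaker opens when the count reaches the threshold.
        if self.opened_at is None and len(self.failures) >= self.failures_before_open:
            self.opened_at = now_ms

    def state(self, now_ms):
        if self.opened_at is None:
            return "CLOSED"  # assumed name for the untripped state
        if now_ms - self.opened_at >= self.half_open_delay_ms:
            return "HALF_OPEN"
        return "OPEN"
```

With failures_before_open=2 and a 100 ms window, two failures 500 ms apart leave this model untripped, while two failures 50 ms apart open it.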

Retry Delay

Retry delay configuration may be undefined:

{
}

If not configured, no retries will be attempted, regardless of the maximum-retries value.

Retry delay may be defined as a single number:

{
  "retry-delay-ms": 250
}

If defined as a single number, the maximum-retries count indicates the number of retries, and the configured delay is the milliseconds between each attempt.

Retry delay may alternately be configured as an array of numbers:

{
  "retry-delay-ms": [ 50, 150, 250, 500, 1000 ]
}

When retry-delay-ms is given as a list of numbers, maximum-retries indicates the number of retries to attempt. The delay in milliseconds before each retry is taken from the array position matching the attempt: the first retry uses the first value in the array, the second retry uses the second value, and so on.

If maximum-retries is not given but retry-delay-ms is given as an array, maximum-retries is set to the length of the array. If maximum-retries is larger than the array, any retry attempt that would select a delay past the end of the array uses the last value in the array instead.
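
The retry rules above can be summarised in a short sketch that returns one delay per retry attempt. It is a reading of the rules in this section, not the platform's code. The function name is hypothetical, None stands for an option that is not configured, and treating an empty array the same as an unconfigured delay is an assumption.

```python
def resolve_retries(maximum_retries=None, retry_delay_ms=None):
    """Return the list of delays in milliseconds, one entry per retry attempt."""
    # No delay configured: no retries, regardless of maximum-retries.
    # Assumption: an empty array behaves like an unconfigured delay.
    if retry_delay_ms is None or retry_delay_ms == []:
        return []
    # Single number: the same delay before every retry.
    if isinstance(retry_delay_ms, (int, float)):
        return [retry_delay_ms] * (maximum_retries or 0)
    # Array without maximum-retries: one retry per array entry.
    if maximum_retries is None:
        maximum_retries = len(retry_delay_ms)
    # Attempts past the end of the array reuse the last value.
    last = len(retry_delay_ms) - 1
    return [retry_delay_ms[min(i, last)] for i in range(maximum_retries)]
```

For instance, resolve_retries(5, [50, 150]) gives [50, 150, 150, 150, 150], and resolve_retries(3, None) gives an empty list.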

On Failure

A distribution list can be configured for when a message send failure occurs. This distribution list may be empty:

{
  "on-failure": {
  }
}

In such a configuration, failures are not reattempted in any way, and each failure is propagated back to the source of the message.

The distribution list may be given as a single value in the distribute-to key:

{
  "on-failure": {
    "distribute-to": "any:redis"
  }
}

In this configuration, the distribution string indicates a destination to redirect a failed message to. For more information on the destination format and meaning, see Distribution Addresses.

The distribution list may also be given as an array:

{
  "on-failure": {
    "distribute-to": ["node1:redis", "node2:redis"]
  }
}

In this configuration, each request to distribute a message on failure selects the next destination in a round-robin fashion. The round-robin position is shared by all messages the circuit breaker distributes on failure; it is not reset for each message.
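
The three forms of on-failure distribution (empty, single value, array) can be modelled with one shared round-robin cursor. The following is a minimal sketch with hypothetical class and method names, not the platform's implementation.

```python
from itertools import cycle


class FailureDistributor:
    """Illustrative model of the on-failure distribute-to selection."""

    def __init__(self, distribute_to=None):
        if distribute_to is None:
            targets = []               # empty on-failure: nothing to redirect to
        elif isinstance(distribute_to, str):
            targets = [distribute_to]  # single value
        else:
            targets = list(distribute_to)
        # One cursor shared by every failed message; it is never reset.
        self._cursor = cycle(targets) if targets else None

    def next_destination(self):
        """Return the next destination, or None if the failure should propagate."""
        return next(self._cursor) if self._cursor else None
```

With the array ["node1:redis", "node2:redis"], three consecutive failed messages are sent to node1:redis, node2:redis and node1:redis in that order.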

Routing Configuration

High availability routing is configured as an array of routes. Routes are attempted in the order configured. Ensure that routing is configured in such a way that the most specific route is defined first.

| Field | Type | Required? | Default | Description |
| --- | --- | --- | --- | --- |
| match-address | String | Yes | - | A regular expression to match against the destination address of a message. If a message’s full destination address matches this regular expression, the route is matched. Note that this match is done before considering actually available JSLEE addresses, so the service can send messages which match only against a locally defined route. |
| circuit-breaker | Varies | No | - | See Circuit Breaker Route Configuration. |
| distribute-to | Varies | No | - | See Distribution Addresses. |
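
Because routes are attempted in the order configured, the first matching route wins, which is why the most specific route should be listed first. A minimal sketch of that selection follows. The function name is hypothetical, and the use of a full-string match (rather than a substring search) is an assumption based on the .* wrapping used in the examples in this document.

```python
import re


def select_route(routes, destination):
    """Return the first route whose match-address matches, or None."""
    for route in routes:
        # Assumption: the whole destination address must match the expression.
        if re.fullmatch(route["match-address"], destination):
            return route
    return None


routes = [
    # Most specific route first, general route last.
    {"match-address": ".*redis-service/queue1", "distribute-to": "priority-redis"},
    {"match-address": ".*redis-service.*", "distribute-to": "cluster-redis"},
]
```

Here any:redis-service/queue1 selects the first route and any:redis-service/queue2 falls through to the second. If the two routes were swapped, the general route would capture both addresses. The priority-redis name is purely illustrative.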

Circuit Breaker Route Configuration

A circuit breaker may be defined as a string within a route:

{
  "circuit-breaker": "mycb"
}

If configured in this way, the circuit breaker template with the given name is used as-is, with no overridden configuration. Note that a unique instance of the circuit breaker is instantiated for each destination the route ultimately sends messages to.

A circuit breaker may alternatively be defined as an object within a route:

{
  "circuit-breaker": {
    "name": "mycb"
  }
}

If configured in this way, the circuit breaker template with the given name is used with any fields included in the route’s circuit-breaker section overriding the template’s values (usually the distribute-to field).
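
The difference between the string form and the object form can be sketched as a lookup followed by an optional merge. The function name is hypothetical, and the shallow merge (a nested object such as on-failure replaces the template's value as a whole) is an assumption, since the merge depth is not stated here.

```python
def effective_breaker(templates, route_cb):
    """Resolve a route's circuit-breaker entry against the named templates."""
    if route_cb is None:
        return None                       # route has no circuit breaker
    if isinstance(route_cb, str):
        return dict(templates[route_cb])  # string form: template used as-is
    merged = dict(templates[route_cb["name"]])
    merged.update(route_cb)               # object form: route fields override
    return merged


templates = {
    "mycb": {"name": "mycb", "failures-before-open": 5, "maximum-retries": 0},
}
```

For example, effective_breaker(templates, {"name": "mycb", "failures-before-open": 2}) keeps maximum-retries at 0 from the template and takes failures-before-open as 2 from the route.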

Distribution Addresses

Circuit breaker templates may directly configure a distribute-to field, and routes may be configured to override their circuit breaker template’s value for this field to have per-route distribute-to configuration.

The destination(s) defined in the distribute-to field do not need to be complete JSLEE event bus addresses. They may instead be templates for the real destination address, or references to other routes configured on the same service.

The following example shows how the route match and distribution configuration can support passing a message through multiple layers of circuit-breakers within the same service to achieve multiple different types of failure handling. In this example, assume that all messages being sent by the containing service are sent to any:redis-service/<endpoint>, where <endpoint> can be any value, and many distinct values.

{
    "ha": {
      "routing": [
        {
          "match-address": ".*redis-service.*",
          "distribute-to": "cluster-redis",
          "circuit-breaker": {
            "name": "redis-submission",
            "on-failure": {
              "distribute-to": "backup-redis"
            }
          }
        },
        {
          "match-address": ".*backup-redis.*",
          "distribute-to": "local:backup-redis"
        }
      ],
      "circuit-breakers": [
        {
          "name": "redis-submission",
          "failures-before-open": 3,
          "half-open-delay-ms": 10000,
          "retry-delay-ms": [
            50,
            250,
            500
          ]
        }
      ]
    }
}

In this configuration:

  1. All messages whose fully specified JSLEE event bus address matches the regular expression .*redis-service.* will match the first route.
  2. All messages whose fully specified JSLEE event bus address matches the regular expression .*backup-redis.* will match the second route.
  3. Messages that match the first route will be sent to the service cluster-redis. Note however that:
    1. The message’s original destination address retains its endpoint and its local vs. any component, so if a message is being sent to any:redis-service/queue1, the resulting address will be any:cluster-redis/queue1.
    2. distribute-to can override any with local (as is done in the second route), and can override the endpoint by explicitly specifying the endpoint.
    3. To keep the service but overwrite the endpoint, specify the service using _, e.g. _/endpointa.
  4. The configured circuit breaker redis-submission has a retry delay specified and no maximum-retries, so maximum-retries is set to 3 (the length of the retry-delay-ms array).
  5. On failure, the first route will distribute messages to backup-redis, retaining the endpoint name and local vs. any of the original message. Route selection will be re-run, ensuring that the second route is selected.
  6. The second route only configures a distribution list, with no circuit breaker. Since there is no circuit breaker configured, any failure is immediately propagated back to the message source (the first route). Since the first route has already failed the message, the failure propagates back to the original source of the message in the service.
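
The address rewriting described in the list above can be sketched as follows. The scope:service/endpoint address grammar is inferred from the examples in this document and is an assumption, as are the function names. The sketch also assumes the original address is fully specified.

```python
def parse(address):
    """Split 'scope:service/endpoint'; scope and endpoint may be absent."""
    scope, sep, rest = address.partition(":")
    if not sep:
        scope, rest = None, address
    service, sep, endpoint = rest.partition("/")
    return scope, service, (endpoint if sep else None)


def rewrite(original, template):
    """Apply a distribute-to template to a fully specified original address."""
    o_scope, o_service, o_endpoint = parse(original)
    t_scope, t_service, t_endpoint = parse(template)
    scope = t_scope or o_scope                               # e.g. local overrides any
    service = o_service if t_service == "_" else t_service   # "_" keeps the service
    endpoint = o_endpoint if t_endpoint is None else t_endpoint
    result = f"{scope}:{service}"
    return result if endpoint is None else f"{result}/{endpoint}"
```

Following a message sent to any:redis-service/queue1 through the example: the first route rewrites it to any:cluster-redis/queue1, the on-failure target backup-redis yields any:backup-redis/queue1, and the second route's local:backup-redis yields local:backup-redis/queue1. A template of _/endpointa applied to the original address gives any:redis-service/endpointa.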

Default Configuration

If no HA configuration is applied to a service at all (i.e. the ha section is not defined), the platform will automatically apply a “prefer-local-address” HA configuration to all messages that service sends. This default configuration is equivalent to the service having the following HA configuration defined:

    "ha": {
      "circuit-breakers": [
        {
          "name": "prefer_local",
          "failures-before-open": 1,                // Only a single failure permitted...
          "half-open-delay-ms": 300000              // ... within 5 minutes.
        }
      ],
      "routing": [
        {
          "match-address": "^any:.*",               // Trigger on normal traffic to any service...
          "distribute-to": "local:_",               // ... then try to send it locally first...
          "circuit-breaker": {
            "name": "prefer_local",
            "on-failure": {
              "distribute-to": "any:_"              // ... and on failover, send to the original address.
            }
          }
        }
      ]
    }

This default configuration matches traffic addressed to any service and tries to deliver it locally first, using a circuit breaker that permits a single failure and has a 5-minute half-open delay. If local delivery does not succeed, the original cluster-wide address is used instead.

If any HA configuration is listed for a service, this default configuration does not apply and the user-defined configuration is used exclusively.