Feature Flags
Overview
In a mixed-version cluster (e.g. some nodes run 3.11.x and some run 3.12.x) during an upgrade, nodes may support different sets of features, behave differently in certain scenarios, and otherwise not act exactly the same: they are different versions, after all.
Feature flags are a mechanism that controls which features are considered enabled or available on all cluster nodes. If a feature flag is enabled, so is its associated feature (or behavior). If it is not, all nodes in the cluster keep that feature (or behavior) disabled.
The feature flag subsystem allows RabbitMQ nodes running different versions to determine whether they are compatible and then communicate with each other, even though their feature sets or implementation details may differ.
This subsystem was introduced to allow for rolling upgrades of cluster members without shutting down the entire cluster.
Feature flags are not meant to be used as a form of cluster configuration. After a successful rolling upgrade, users should enable all feature flags.
All feature flags become mandatory (graduate) at some point. For example, RabbitMQ 3.12 requires feature flags introduced in the 3.11 series to be enabled prior to the upgrade, RabbitMQ 3.11 graduates all 3.8 flags, and so on.
Quick summary (TL;DR)
Feature Flag Ground Rules
- A feature flag can be enabled only if all nodes in the cluster support it
- A node can join or re-join a cluster only if:
  - it supports all the feature flags enabled in the cluster, and
  - every other cluster member supports all the feature flags enabled on that node
- Once enabled, a feature flag cannot be disabled
For example, RabbitMQ 3.13.x and 3.12.x nodes are compatible as long as no 3.13.x-specific feature flags are enabled.
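The ground rules above can be read as a pair of set checks. The following is a minimal Python sketch of that reading, not RabbitMQ's actual implementation: the `Node` class, the function names, and the flag names used with it are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of a cluster node (not a RabbitMQ data structure)."""
    name: str
    supported: frozenset  # flags this node's version knows about
    enabled: set = field(default_factory=set)  # flags enabled on this node


def can_enable(flag, cluster):
    """Rule 1: a flag can be enabled only if every node supports it."""
    return all(flag in node.supported for node in cluster)


def can_join(node, cluster):
    """Rule 2: joining requires support in both directions."""
    cluster_enabled = set().union(*(member.enabled for member in cluster))
    # The joining node must support everything enabled in the cluster...
    node_supports_cluster = cluster_enabled <= node.supported
    # ...and every member must support everything enabled on the joining node.
    cluster_supports_node = all(node.enabled <= m.supported for m in cluster)
    return node_supports_cluster and cluster_supports_node


def enable(flag, cluster):
    """Enable a flag on all nodes. No disable() exists, mirroring rule 3."""
    if not can_enable(flag, cluster):
        raise ValueError(f"{flag!r} is not supported by every node")
    for node in cluster:
        node.enabled.add(flag)
```

With an older node that supports only `flag_a` and a newer node that supports `flag_a` and `flag_b` (both names are placeholders), `can_join` succeeds while only `flag_a` is enabled, and `can_enable("flag_b", ...)` is false for the mixed cluster. This matches the 3.13.x and 3.12.x example above.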
Key CLI Tool Commands
- To list feature flags:
rabbitmqctl list_feature_flags
- To enable a feature flag (or all currently disabled flags):
rabbitmqctl enable_feature_flag <all | name>
It is also possible to list and enable feature flags from the Management plugin UI, in "Admin > Feature flags".
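For automation, the two commands above can be wrapped in a small script. This is a hedged sketch that assumes `rabbitmqctl` is on the `PATH` and that the script runs with sufficient privileges on a cluster node; the function names are hypothetical, and only the two CLI invocations shown above are used.

```python
import subprocess


def list_flags_command():
    """Argument vector for listing feature flags."""
    return ["rabbitmqctl", "list_feature_flags"]


def enable_flag_command(name="all"):
    """Argument vector for enabling one flag, or all disabled flags by default."""
    return ["rabbitmqctl", "enable_feature_flag", name]


def enable_all_after_upgrade(run=subprocess.run):
    """List flags, enable every disabled one, then list again to confirm.

    `run` is injectable so the sequence can be inspected without a broker.
    """
    for argv in (list_flags_command(),
                 enable_flag_command("all"),
                 list_flags_command()):
        run(argv, check=True)  # check=True raises if the command fails
```

Calling `enable_all_after_upgrade()` once a rolling upgrade has completed follows the recommendation from the overview to enable all feature flags at that point.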
Examples
Example 1: Compatible Nodes
- If nodes A and B are not clustered, they can be clustered.
- If nodes A and B are clustered:
  - "Coffee maker" can be enabled.
  - "Juicer machine" cannot be enabled because it is unsupported by node B.
Example 2: Incompatible Nodes
- If nodes A and B are not clustered, they cannot be clustered because "Juicer machine", which is enabled on node A, is unsupported on node B.
- If nodes A and B are clustered and "Juicer machine" was enabled while node B was stopped, node B cannot re-join the cluster on restart.
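Both examples can be worked through with plain set operations. The sketch below assumes, based on the wording of the examples, that node A supports both flags and node B supports only "Coffee maker"; it also assumes that nothing node B lacks is enabled in Example 1 and that "Juicer machine" is enabled on node A in Example 2. The helper names are hypothetical.

```python
# Assumed support sets, inferred from the text of the two examples.
supported = {
    "A": {"Coffee maker", "Juicer machine"},
    "B": {"Coffee maker"},
}


def enableable(nodes):
    """Flags that every listed node supports, i.e. those that may be enabled."""
    return set.intersection(*(supported[n] for n in nodes))


def compatible(enabled):
    """True if each node supports every flag enabled on any of the nodes.

    `enabled` maps a node name to the set of flags enabled on it.
    """
    all_enabled = set().union(*enabled.values())
    return all(all_enabled <= supported[n] for n in enabled)


# Example 1: nothing that node B lacks is enabled, so the nodes can cluster.
example_1 = compatible({"A": set(), "B": set()})

# Example 2: "Juicer machine" is enabled on A, which B does not support.
example_2 = compatible({"A": {"Juicer machine"}, "B": set()})
```

Here `enableable(["A", "B"])` contains only "Coffee maker", `example_1` is true, and `example_2` is false.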
Feature Flags and RabbitMQ Versions
As covered earlier, the feature flags subsystem's primary goal is to allow upgrades regardless of the version of cluster members, to the extent possible.
Feature flags make it possible to safely perform a rolling upgrade to the next patch or minor release, except if it is stated otherwise in the release notes. Indeed, there are some changes which cannot be implemented as feature flags.
However, only upgrades from one minor release series to the next minor or major series are supported. To upgrade from e.g. 3.9.16 to 3.12.3, it is necessary to upgrade to 3.9.29 first, then to the latest 3.10 patch release, then the latest 3.11 release, then 3.12.3. After certain steps in the upgrade process it will also be necessary to enable all stable feature flags available in that version. For example, 3.12.0 is a release that requires all feature flags to be enabled before a node can be upgraded to it.
The same applies if there are one or more minor release branches between the minor version in use and the next major release. Skipping them might work (i.e. there could be no incompatible changes between major releases), but this scenario is unsupported by design for the following reasons:
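The stepwise path described above can be derived mechanically. The following Python sketch is illustrative only: `upgrade_path` is a hypothetical helper, and the ordered list of release series must be supplied by the caller because this page does not define one. It does not look up actual patch versions, so intermediate steps are described as "latest X.Y patch".

```python
def minor_series(version):
    """Reduce a version such as '3.9.16' to its minor series, '3.9'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def upgrade_path(current, target, series):
    """List the upgrade steps from `current` to `target`, one series at a time.

    `series` is the caller-supplied, ordered list of release series.
    """
    start = series.index(minor_series(current))
    end = series.index(minor_series(target))
    if start > end:
        raise ValueError("downgrades are outside the scope of this sketch")
    # Move to the latest patch of each series before stepping to the next one.
    steps = [f"latest {s} patch" for s in series[start:end]]
    steps.append(target)
    return steps


known_series = ["3.9", "3.10", "3.11", "3.12"]  # assumed ordering
path = upgrade_path("3.9.16", "3.12.3", known_series)
# path == ['latest 3.9 patch', 'latest 3.10 patch', 'latest 3.11 patch', '3.12.3']
```

Enabling all stable feature flags after the relevant steps is not modelled here; which steps require it depends on the release notes of each version.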
- Skipping minor versions is not tested in CI.
- Non-sequential releases may or may not support the same set of feature flags. Feature flags that have been present for several minor branches can be marked as required, at which point their associated feature or behavior is implicitly enabled by default. The compatibility code is removed in the process, which prevents clustering with older nodes. Remember that the purpose of feature flags is to allow upgrades; they are not a configuration mechanism.
There is no policy defining the life cycle of a feature flag in general. E.g. there is no guarantee that a feature flag will go from "stable" to "required" after N minor releases. Because new code builds on top of existing code, feature flags are marked as required and the compatibility code is removed whenever it is needed.