
HSM Synchronization: DINAMO Replication #3

December 31, 2013

In the last installment, DINAMO’s overall distributed architecture was presented from a CAP Theorem perspective. We introduced its Synchronous Multi-master Replication Layer (SMMRL), which provides built-in consistency guarantees (instead of the eventual consistency typically found in systems where synchronization is implemented solely in client communication libraries).

HOW DINAMOs TALK TO EACH OTHER

Before thinking about consensus among HSMs, there must be a way for them to communicate in the first place. In SMMRL, two approaches were chosen for pool setup: a) domain grouping, where automatic peer/node discovery is possible (leveraging SLP), and b) manual configuration, offered as a fallback for networks that don’t support multicasting and/or broadcasting.
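As a rough illustration of the two setup paths (the configuration shape and function names here are hypothetical, not DINAMO’s actual interface), peer resolution could look like this:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PoolConfig:
    domain: Optional[str] = None          # domain grouping -> SLP auto-discovery
    manual_peers: Optional[list] = None   # fallback: static peer addresses

def resolve_peers(cfg: PoolConfig) -> list:
    if cfg.domain is not None:
        # would issue an SLP service request over multicast; stubbed out here
        return slp_discover(cfg.domain)
    if cfg.manual_peers:
        # networks without multicast/broadcast use the static list
        return list(cfg.manual_peers)
    raise ValueError("pool setup needs a domain or a manual peer list")

def slp_discover(domain: str) -> list:
    raise NotImplementedError("placeholder for SLP-based node discovery")
```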

After configuration, all nodes “know” each other through their Node-Lists (NL). For message exchange between servers, TLS with AES-128/SHA1 is employed (it is well understood, has known security features/limitations, and is approved under FIPS 140-2; see its Implementation Guidance, sections D.3 and D.8). Finally, for communication to take place, the peers must share a common set of keys and must be running in the same mode (FIPS/non-FIPS).
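A minimal sketch of those pre-conditions (field names are illustrative, not DINAMO’s actual data model): two nodes may only replicate if they share at least one common key and run in the same mode.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NodeInfo:
    address: str
    fips_mode: bool            # both peers must run in the same mode
    shared_key_ids: frozenset  # identifiers of locally available shared keys

def can_replicate(local: NodeInfo, peer: NodeInfo) -> bool:
    same_mode = local.fips_mode == peer.fips_mode
    common_keys = bool(local.shared_key_ids & peer.shared_key_ids)
    return same_mode and common_keys
```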

SMMRL DISTRIBUTED OPERATION

Each transaction under SMMRL (or RL, from now on) is assigned a number (TID). It is a pseudo-random 64-bit integer (for practical reasons, not as big as traditional 128-bit GUIDs). HSMs are supposed to deliver strong DRBGs/PRNGs, so the reduced size is not expected to be an issue (by the way, DINAMO doesn’t use Dual_EC_DRBG).
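A minimal sketch of TID generation; on a general-purpose host, Python’s secrets module stands in for the HSM’s own approved DRBG:

```python
import secrets

def new_tid() -> int:
    # 64-bit pseudo-random transaction id; an HSM would draw this
    # from its internal DRBG rather than the OS CSPRNG used here
    return secrets.randbits(64)
```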

Persistent storage operations are traditionally described as CRUD; under RL, {C;U;D} operations generate a distributed transaction log (TL). It carries a TID, a Synchronization-Point (or Sync-Point/SP, another 64-bit integer discussed later), a current NL snapshot (CNLS), and data belonging to the actual operation (e.g., key creation material).
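The TL fields described above could be modeled roughly like this (an illustrative shape only, not DINAMO’s wire format):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TransactionLog:
    tid: int               # 64-bit transaction id
    sync_point: int        # sender's SP (64-bit), discussed below
    cnls: Tuple[str, ...]  # current Node-List snapshot
    payload: bytes         # operation data, e.g. key creation material
```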

TLs are broadcast; their replay into DINAMO’s storage and/or security layers is called the transportation phase (TP). No distributed transaction succeeds if the peers involved (the ones in the CNLS) don’t share a common Sync-Point (making the SP part of the TL exchange yields fast consistency checks). After TP, every HSM in a pool reaches the same SP.
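Since every TL carries the sender’s SP, the consistency check itself is trivial; a sketch, assuming the local node compares its own SP against those reported by the CNLS peers:

```python
def sync_points_match(local_sp: int, peer_sps: list) -> bool:
    # a distributed transaction only proceeds when every involved peer
    # reports the same Sync-Point as the local node
    return all(sp == local_sp for sp in peer_sps)
```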

SMMRL TL BROADCASTING

DINAMO’s storage layer is stable, providing ACID guarantees for RL’s TL broadcasting (TLB). It is done through the well-established two-phase commit protocol (2PC), widely used in the database industry. 2PC has some drawbacks, but remains popular in practice despite them: it is simple, quite efficient, and delivers a correct solution to the consensus problem. Unfortunately, it is a blocking protocol.
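For reference, a bare-bones 2PC coordinator over an abstract participant interface (DINAMO’s actual TLB messages are not shown); the blocking behavior mentioned above shows up when a participant or the coordinator dies between prepare and commit:

```python
from typing import List, Protocol

class Participant(Protocol):
    def prepare(self, tid: int) -> bool: ...  # phase 1: vote yes/no
    def commit(self, tid: int) -> None: ...
    def abort(self, tid: int) -> None: ...

def two_phase_commit(tid: int, participants: List[Participant]) -> bool:
    # phase 1: collect votes from every peer
    votes = [p.prepare(tid) for p in participants]
    if all(votes):
        # phase 2a: unanimous yes -> commit everywhere
        for p in participants:
            p.commit(tid)
        return True
    # phase 2b: any no vote -> abort everywhere
    for p in participants:
        p.abort(tid)
    return False
```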

For adverse situations (like crashes confirmed by sys-admins), there exists a termination/resolution protocol, able to free a pool/transaction from an HSM. It’s triggered by explicit ‘node-down’ administrative operations.
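The post doesn’t detail the resolution steps, but the effect could be sketched as follows (a purely hypothetical shape): once an operator confirms a node is down, it is dropped from the NL and transactions blocked solely on it can be terminated.

```python
def node_down(node_id: str, node_list: set, blocked: dict) -> list:
    # blocked maps TID -> set of node ids the transaction is still waiting on
    node_list.discard(node_id)
    released = [tid for tid, waiting in blocked.items() if waiting == {node_id}]
    for tid in released:
        del blocked[tid]  # these transactions can now be resolved
    return released
```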

SYNC-POINTS

Striving to be a CP system, RL was designed to detect and avoid split-brain scenarios. In that sense, the Sync-Point (SP) was devised as a way to encode all HSM state in a single number. With one check at TLB level, DINAMOs can tell if their peers share a consistent state.

Hash values are good candidates for data fingerprints at specific points in time. But this “deep” mechanism doesn’t scale well: there is the overhead of strong hash calculation, and data representation/versioning could couple SPs to firmware versions.

As a lightweight alternative, a “shallow” approach took advantage of TIDs. Instead of a fingerprint, the SP was reinterpreted as the chaining of TL replays (i.e., the SP represents the “sum” of all TPs). The final result is efficient and elegant: after a TP, the current SP is derived by XORing (one-time-pad style) the previous SP with the transported TL’s id, i.e., SP(n) = SP(n-1) XOR TID(n). SP(0) is always 0x00 (OEM/reset state).
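In code, the whole mechanism is a couple of lines (example TIDs below are made up for illustration):

```python
def advance_sync_point(sp: int, tid: int) -> int:
    # SP(n) = SP(n-1) XOR TID(n), applied after each transport phase
    return sp ^ tid

sp = 0x00  # SP(0): OEM/reset state
for tid in (0x1F2E3D4C5B6A7988, 0x0123456789ABCDEF):
    sp = advance_sync_point(sp, tid)
```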

LIVE-SYNCS

To be really serviceable, new node additions to a running DINAMO group must happen without downtime. There is an RL operation called Live-Sync (LS) for this purpose, which brings a node’s state up to the current pool SP.

After synchronization, pre-existing peers have their NLs updated as part of the procedure, and start to work with the new HSM. LS is as simple as the push of a button on the local console/shell.
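A minimal sketch of what an LS has to achieve under the SP model above, assuming the joining node replays the TLs it missed (the actual transfer and NL-update mechanics are not described in the post):

```python
def live_sync(node_sp: int, missed_tls: list, pool_sp: int) -> int:
    # replay each missed TL exactly as a normal transport phase would
    sp = node_sp
    for tl in missed_tls:
        apply_locally(tl)  # push the operation into storage/security layers
        sp ^= tl.tid       # advance the node's Sync-Point
    assert sp == pool_sp, "node did not converge to the pool Sync-Point"
    return sp

def apply_locally(tl) -> None:
    ...  # placeholder for the local replay
```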