{"uuid": "02dac93d-0971-4c8d-9a6c-b37b78757645", "vulnerability_lookup_origin": "1a89b78e-f703-45f3-bb86-59eb712668bd", "author": "9f56dd64-161d-43a6-b9c3-555944290a09", "vulnerability": "CVE-2024-51504", "type": "seen", "source": "https://gist.github.com/ppkarwasz/f5be1b5c0182fe665252101c5f24d39f", "content": "# Apache ZooKeeper \u2014 Threat Model\n\n> Produced with the `threat-model-producer` skill\n> ().\n> This is the implicit contract between ZooKeeper and its downstream\n> operators/integrators: what is in scope, what is out, what the project\n> claims, and what it disclaims. It is **not** an audit, a CVE list, or a\n> secure-coding guide.\n\n## \u00a71 Header\n\n- **Project**: Apache ZooKeeper (server + Java client + Jute serialization).\n- **Version / commit**: `3.10.0-SNAPSHOT`, commit `9b535738d` (branch `fix/4820_change_logback_scope`).\n- **Date**: 2026-06-18.\n- **Author(s)**: Draft generated for maintainer review. Not yet ratified.\n- **Status**: **DRAFT \u2014 not yet reviewed by the ZooKeeper PMC.** Heavily `(inferred)`; treat \u00a78/\u00a79 boundaries as proposals until confirmed.\n- **Version binding**: A report against ZooKeeper version *N* is triaged against this model as it stood at *N*, not at HEAD. Defaults change between minor releases (e.g. `fips-mode` flipped default between 3.8.x and 3.9.0); always read the row for the reported version.\n- **Reporting cross-reference**: Findings that violate a \u00a78 claimed property should be reported privately to `security@zookeeper.apache.org` per  *(documented)*. Findings that fall under \u00a73 (out of scope) or \u00a79 (disclaimed) will be closed citing this document.\n- **Provenance legend**:\n  - *(documented)* \u2014 stated in ZooKeeper's own docs (Admin/Programmer guides, Reconfig guide, security page) or directly in code. Cited.\n  - *(maintainer)* \u2014 stated by a maintainer in response to this process. 
**None yet** (draft-first).\n  - *(inferred)* \u2014 reasoned from code structure or domain knowledge; has a matching \u00a714 open question.\n- **Draft confidence**: ~58 documented / 0 maintainer / ~22 inferred. This is a public-artifact draft; the `(inferred)` claims in \u00a73, \u00a75, \u00a77, \u00a79 are the priority confirmation targets.\n- **What ZooKeeper is**: A distributed coordination service. A small ensemble of servers (typically 3/5/7) replicates a hierarchical key/value namespace (the \"data tree\" of *znodes*) using the ZAB atomic-broadcast protocol, and serves reads, writes, watches, and ephemeral/sequential nodes to many client applications. It is the coordination backbone for systems like Kafka, HBase, and Solr \u2014 used for leader election, configuration, locks, and membership. It is a stateful network service, not an in-process library.\n\n## \u00a72 Scope and intended use\n\n- **Primary intended use**: An in-datacenter, operator-deployed coordination ensemble accessed by trusted application processes over a low-latency network. Concrete uses: configuration storage, distributed locks, leader election, group membership, service discovery.\n- **Deployment context**: Long-running server (`QuorumPeerMain` / `ZooKeeperServerMain`) plus an embedded Java client library. *(documented: zookeeperAdmin.md, zookeeperOver.md)*\n- **Caller roles** (a network service has no single \"caller\"):\n  - **Client application** \u2014 connects on the client port, holds sessions, reads/writes znodes. Authenticated only if the operator opts in (see \u00a76/\u00a77). *(documented)*\n  - **Operator / admin** \u2014 runs the JVMs, owns `zoo.cfg`, JAAS, keystores, super-user credentials, the AdminServer, JMX. Fully trusted for the instance. *(documented)*\n  - **Peer server** \u2014 another ensemble member participating in ZAB / leader election. Authenticated only if quorum SASL or quorum TLS is enabled (off by default). 
*(documented: QuorumPeerConfig.java)*\n- **Component-family table**:\n\n  | Family | Representative entry point | Touches outside process? | In model? |\n  | --- | --- | --- | --- |\n  | Server core (data tree, ACL check, ZAB, sessions) | client port 2181; `ZooKeeperServer`, `PrepRequestProcessor`, `DataTree` | network, disk (snapshots/txn log) | **In** |\n  | Quorum / leader election | ports 2888/3888; `QuorumPeer` | network, disk | **In** |\n  | Java client library | `ZooKeeper`, `ClientCnxn` | network | **In** |\n  | AdminServer (embedded Jetty) | HTTP 8080 `/commands/*` | network | **In** |\n  | Four-letter-words (4lw) | client port, telnet/nc | network | **In** |\n  | Jute serialization | `BinaryInputArchive` (wire decode) | \u2014 | **In** |\n  | Dynamic reconfig | `reconfig` API, `/zookeeper/config` | network | **In** |\n  | C/C++ client + C CLI (`cli_st`) | native lib | network | In (C client), see \u00a73 for CLI |\n  | `zookeeper-contrib/*` (REST, zooinspector, perl/python, zkfuse, \u2026) | varies | varies | **Out** \u2014 \u00a73 |\n  | `zookeeper-recipes/*` (locks, queues, election) | sample code | network | **Out** \u2014 \u00a73 |\n  | `zookeeper-specifications` (TLA+) | \u2014 | \u2014 | Out (formal spec, not shipped code) |\n\n  Anything marked **Out** reappears in \u00a73 with the reason.\n\n## \u00a73 Out of scope (explicit non-goals)\n\n- **Not a public-internet service.** \"A ZooKeeper ensemble is expected to operate in a trusted computing environment. 
It is thus recommended deploying ZooKeeper behind a firewall.\" *(documented: zookeeperAdmin.md \"Publicly accessible deployment\")* Threats that exist only because the ensemble was exposed to the open internet against this guidance are operator misconfiguration, not server vulnerabilities \u2014 but see \u00a714, this boundary needs a sharp ruling.\n- **Not an attempt to defend against a malicious in-process caller of the Java client.** A caller already inside the client JVM has the credentials and can do anything the session can; not a meaningful adversary at this layer. *(inferred)*\n- **No notion of znode ownership.** \"ZooKeeper does not have a notion of an owner of a znode.\" *(documented: zookeeperProgrammers.md)*\n- **ACLs are not recursive / not inherited.** An ACL pertains only to a specific znode, not its children. *(documented: zookeeperProgrammers.md)* This is design, not a bug (see \u00a79 false friends).\n- **Shipped-but-unsupported code, modelled separately or not at all:**\n  - `zookeeper-contrib/*` (REST gateway, ZooInspector, perl/python/C bindings, zkfuse, zktreeutil, loggraph, monitoring): separately authored, not part of the core security guarantee. A finding here is `OUT-OF-MODEL: unsupported-component` unless the PMC says otherwise. *(inferred \u2014 \u00a714)*\n  - `zookeeper-recipes/*`: reference implementations / sample code. *(inferred \u2014 \u00a714)*\n  - The C CLI shell (`cli_st`) is an operator tool, not a server surface; historically a buffer-overflow site (CVE-2016-5017). Whether the C client *library* is in core scope vs. the *CLI* being a tool needs a ruling. 
*(inferred \u2014 \u00a714)*\n- **Build/release/SDLC hygiene** (action pinning, signing, dependency freshness) is out per the skill's \u00a71.\n\n## \u00a74 Trust boundaries and data flow\n\n- **Primary trust boundary = the client port and the quorum port.** Bytes arriving on either are untrusted wire input and are decoded by Jute before any ACL logic runs.\n- **The data tree is the protected asset.** Access to a znode is mediated by `ZooKeeperServer.checkACL` *(code)*. Once a request passes the ACL check (or the node has a null/empty ACL, which short-circuits to *allow*), the operation executes. *(code: `ZooKeeperServer.java` checkACL)*\n- **Trust transitions:**\n  - Wire bytes \u2192 Jute decode (request-size bounded by `jute.maxbuffer`, default `0xfffff` \u2248 1 MB) \u2192 request object. *(code: `BinaryInputArchive.java`)*\n  - Request \u2192 session auth context (the set of `Id`s the connection has accumulated via `addAuthInfo` / SASL handshake / TLS cert).\n  - Request + auth context \u2192 `checkACL` against the **target node's own ACL** (for create/delete, against the **parent's** ACL). No walk up the tree. *(code: `PrepRequestProcessor.java`)*\n- **Reachability preconditions per family** (the test a triager applies before anything else):\n  - Server-core finding: in-model only if reachable from a request a client can actually send on the client port given the configured auth posture.\n  - Quorum finding: in-model only if reachable from a peer on the quorum port \u2014 and only `VALID` if quorum auth is *enabled* (else it requires an attacker already on the quorum network, see \u00a77).\n  - AdminServer finding: in-model only if reachable via HTTP on the admin port.\n  - Jute finding: in-model only if reachable from attacker-supplied wire bytes within the `jute.maxbuffer` bound.\n\n## \u00a75 Assumptions about the environment\n\n- **Network**: A trusted, firewalled datacenter network. 
Client, quorum, election, admin, and JMX ports are assumed reachable only by intended parties. *(documented: Admin guide deployment section)*\n- **Quorum integrity**: A strict majority of configured servers are honest and available; ZAB safety/liveness assume crash-fault (not Byzantine) peers. *(inferred \u2014 classic ZAB assumption \u2014 \u00a714)*\n- **Disk / persistence**: The snapshot directory and transaction log are on storage the operator controls and trusts; on-disk data is **not encrypted by ZooKeeper** and snapshots/txn logs contain znode data (including digest ACL hashes) in the clear. *(inferred \u2014 \u00a714)*\n- **Clock**: Session expiry and leader-election timeouts rely on reasonably synchronized, monotonic-enough clocks; `tickTime` drives liveness. *(documented: tickTime in Admin guide; inferred for the security implication)*\n- **JVM / runtime**: A conformant JRE. `fips-mode` (`zookeeper.fips-mode`) default **true** on 3.9.0+, **false** on 3.8.x; when enabled the custom `ZKTrustManager` is disabled, so **quorum hostname verification is not available** (client-server still is). *(documented: zookeeperAdmin.md)*\n- **What ZooKeeper does to / does not do to its host** (mostly negative claims, hence mostly `(inferred)` \u2014 high-priority \u00a714 confirmation targets):\n  - It *does* open listening sockets (client, quorum, election, admin HTTP, optional JMX/metrics), read `zoo.cfg` and system properties, read JAAS config and keystores, and write snapshots/txn logs to disk. *(documented/code)*\n  - It honors a number of `zookeeper.*` **system properties** that change the security envelope at JVM start (`skipACL`, `superUser`, `4lw.commands.whitelist`, `ssl.*`, etc.) \u2014 the security posture is partly a function of the JVM command line, not just `zoo.cfg`. *(documented)*\n  - It does *not* spawn child processes or shell out as part of request handling. 
*(inferred \u2014 \u00a714)*\n  - The `ip` scheme over the AdminServer HTTP path honors a client-supplied `X-Forwarded-For` header. *(code: `IPAuthenticationProvider`)* \u2014 this is the root of CVE-2024-51504.\n\n## \u00a75a Build-time and configuration variants (the security envelope is a function of config)\n\nZooKeeper is really a *family* of deployment postures. The knobs below change which \u00a78 properties hold. Most ship with the **less-secure setting as the default**, which is the central tension this model must resolve with the PMC (\u00a714).\n\n| Knob | Default | Effect on model | Maintainer stance |\n| --- | --- | --- | --- |\n| `zookeeper.skipACL` | `no` | `yes` disables **all** ACL checks \u2192 full data-tree access for any client; also unauthenticated `reconfig`. *(documented)* | dev-only? \u00a714 |\n| Client authentication (`OPEN_ACL_UNSAFE` nodes, no auth provider) | open | With no ACLs/auth, any client reads/writes any node. *(code/docs)* | \u00a714 |\n| `zookeeper.sessionRequireClientSASLAuth` | `false` | `true` rejects non-SASL-authenticated clients. *(documented)* | \u00a714 |\n| `zookeeper.enforce.auth.enabled` + `enforce.auth.schemes` | `false` | `true` requires the listed auth schemes per session. *(documented)* | \u00a714 |\n| `sslQuorum` (`zookeeper.sslQuorum`) | `false` | quorum + election traffic is plaintext and unauthenticated unless enabled. *(documented)* | \u00a714 |\n| `quorum.auth.serverRequireSasl` / `learnerRequireSasl` / `enableSasl` | `false` | quorum SASL off; peers unauthenticated. *(code: QuorumPeerConfig)* | \u00a714 |\n| `secureClientPort` / client TLS | unset | client traffic plaintext unless configured. *(documented)* | \u00a714 |\n| `ssl.clientAuth` / `ssl.quorum.clientAuth` | `need` | when TLS *is* on, client cert is required (good default). 
*(documented)* | \u2014 |\n| `ssl.hostnameVerification` / `ssl.quorum.hostnameVerification` | `true` | disabling enables MITM; \"only recommended for testing.\" *(documented)* | \u2014 |\n| `reconfigEnabled` | `false` | `true` allows runtime membership change via API. *(documented)* | \u00a714 |\n| `4lw.commands.whitelist` | only `srvr` (+`isro` if RO mode) | wildcard `*` exposes all 4lw incl. info-disclosing/expensive ones. *(documented/code)* | \u2014 |\n| `admin.enableServer` | `true` | embedded Jetty AdminServer on `0.0.0.0:8080`, most commands unauthenticated. *(documented/code)* | \u00a714 |\n| `admin.needClientAuth` / `admin.forceHttps` / `admin.portUnification` | `false` | admin HTTP is plaintext, no client auth, unless set. *(documented)* | \u00a714 |\n| `zookeeper.DigestAuthenticationProvider.digestAlg` | `SHA1` | unsalted; \"will be deprecated for security issues.\" *(documented)* | \u2014 |\n| `fips-mode` | `true` (3.9+) / `false` (3.8) | when true, disables `ZKTrustManager` \u2192 no quorum hostname verification. *(documented)* | \u2014 |\n\n**Insecure-default ruling needed (wave 1, \u00a714):** for every row marked \"\u00a714\", the model is ambiguous until the PMC rules whether the default *is* the supported production posture (report against it = `VALID`) or a dev-convenience operators must flip (= `OUT-OF-MODEL: non-default-build`, and the requirement moves to \u00a710).\n\n## \u00a76 Assumptions about inputs\n\nZooKeeper accepts: (a) client requests on the client port, (b) quorum/election messages on the quorum ports, (c) HTTP requests on the AdminServer, (d) 4lw strings on the client port, (e) `zoo.cfg` / dynamic config / system properties from the operator (trusted).\n\n**Per-parameter trust table** (route/message-oriented, as for a network service):\n\n| Surface / message | Parameter | Attacker-controllable? 
| Caller/operator must enforce |\n| --- | --- | --- | --- |\n| Client request (any op) | request bytes / path / data | **yes** (untrusted client) | size \u2264 `jute.maxbuffer`; ACLs on nodes; enable auth if clients are untrusted |\n| `create` / `setData` | znode `data` blob | **yes** | quota (`setquota`); znode size; ACL on parent |\n| `create`/`setACL` | ACL list | **yes** | meaningful (non-`OPEN_ACL_UNSAFE`) ACLs if multi-tenant |\n| `getChildren` / watches | path, watch registration | **yes** | READ perm on node (note CVE-2024-23944 persistent-watcher leak) |\n| `addAuthInfo` (digest) | `user:password` | **yes** | sent **cleartext** \u2192 use only over TLS/localhost *(documented)* |\n| `ip` scheme | source IP | **spoofable** (and `X-Forwarded-For` on HTTP path) | network-level controls; do not rely on `ip` across untrusted nets |\n| Quorum / election message | peer wire bytes | **yes if quorum net is reachable** | enable quorum SASL/TLS; firewall quorum ports |\n| AdminServer | HTTP path `/commands/`, headers | **yes if admin port reachable** | restrict admin port; `needClientAuth`+TLS for x509 |\n| 4lw | 4-char command string | **yes if whitelisted** | keep whitelist minimal; firewall |\n| `zoo.cfg`, JAAS, keystores, `zookeeper.*` sysprops | all | **no \u2014 operator-trusted** | protect these files / JVM args |\n\n- **Size / shape**: A single request/response is bounded by `jute.maxbuffer` (default ~1 MB; `BinaryInputArchive.maxBuffer`), with `zookeeper.jute.maxbuffer.extrasize` slack. *(code)* `maxClientCnxns` default 60 per IP; `maxCnxns` default 0 (unlimited). *(documented)*\n\n## \u00a77 Adversary model\n\n- **In-scope adversary: the unauthenticated/low-privilege network client** that can reach the client port. Capabilities: open sessions, send arbitrary (size-bounded) requests, register watches, run whitelisted 4lw, attempt auth. Goals: read/modify data it should not, exhaust resources, crash a server, leak session/config info. 
*(inferred \u2014 \u00a714)*\n- **In-scope adversary: HTTP client reaching the AdminServer** \u2014 can invoke unauthenticated commands (info disclosure) and, where snapshot/restore auth is weak, more. *(documented via CVE-2025-58457 / CVE-2024-51504 patterns)*\n- **Authenticated-but-Byzantine peer** *(distributed-system actor)*: a server that holds a legitimate quorum identity, passes the handshake, then behaves arbitrarily. ZAB tolerates **crash faults** of a minority, **not Byzantine faults.** Safety holds while a strict majority (`> n/2`) are honest and available; once `\u2265 n/2` peers are faulty or colluding, the model breaks. That complement (half or more of the peers Byzantine, or a forged-identity peer when quorum auth is off) is **out of scope** \u2192 \u00a73. *(inferred \u2014 \u00a714; this threshold must be stated by the PMC)*\n- **Out of scope:**\n  - An attacker with operator privileges (owns `zoo.cfg`, JAAS, keystores, disk, JVM args). They have already won. *(inferred \u2014 \u00a714)*\n  - An attacker already on the quorum/election network when quorum auth is *off* \u2014 this is the documented \"trusted network\" assumption (\u00a75); such an attacker can impersonate a peer. A report requiring this is `OUT-OF-MODEL: adversary-not-in-scope` **only if** the PMC affirms quorum-off is a supported posture (else it is `VALID`). *(inferred \u2014 \u00a714)*\n  - Side-channel / timing / co-tenant memory adversaries against the JVM. *(inferred \u2014 \u00a714)*\n  - On-disk attacker reading snapshots/txn logs (no at-rest encryption claim). *(inferred \u2014 \u00a714)*\n\n## \u00a78 Security properties the project provides\n\nEach property states: the property + conditions, violation symptom, severity tier, provenance. **All `(inferred)` here are proposals pending PMC confirmation (\u00a714).**\n\n1. **ACL-mediated access control on individual znodes.** *Conditions*: a non-trivial ACL is set on the node, the matching auth provider is configured, and `skipACL=no`. 
A request lacking the required permission is rejected with `NoAuth`. *Symptom of violation*: a client reads/writes a node without holding the required `Id`/permission. *Severity*: **security-critical** (auth bypass \u2192 CVE). *(documented/code: `checkACL`)*\n2. **Authentication via configured schemes** (`digest`, `sasl`/Kerberos, `x509`/mTLS). *Conditions*: provider enabled; for `digest`, channel confidentiality is the caller's job. *Symptom*: an identity is accepted that should not be (e.g. SASL quorum bypass = CVE-2023-44981). *Severity*: **security-critical**. *(documented/code)*\n3. **Optional transport confidentiality and integrity (TLS)** for client-server (`secureClientPort`) and quorum (`sslQuorum`), with hostname verification on by default and client-cert auth `need` by default *when TLS is enabled*. *Symptom*: plaintext exposure or accepted MITM cert (cf. CVE-2026-24281, CVE hostname-bypass). *Severity*: **security-critical** when TLS is configured and the guarantee is breached. *(documented)*\n4. **Replicated-state safety (linearizable writes, FIFO client order) under ZAB**, given a majority of honest, available, crash-only servers. *Symptom*: divergent state across replicas / lost or reordered committed writes / a fork. *Severity*: **security-critical** if reachable by an in-scope adversary; otherwise correctness. *(inferred \u2014 \u00a714; this is ZAB's documented design but the security framing needs PMC sign-off)*\n5. **Bounded per-request size.** Requests/responses exceeding `jute.maxbuffer` (default ~1 MB) are rejected at decode. *Symptom*: unbounded allocation from a single message. *Severity*: security-relevant DoS guard. *(code)*\n6. **Per-source-IP cap on concurrent connections** via `maxClientCnxns` (default 60). *Symptom*: a single IP exhausts connection slots. *Severity*: partial DoS guard. *(documented)*\n7. **Quota accounting with optional hard enforcement.** Soft quota *warns only*; hard quota throws `QuotaExceededException`. 
*Symptom*: a tenant exceeds count/byte limits silently (soft) \u2014 by design. *Severity*: correctness / operational, not a memory-safety guarantee. *(documented: zookeeperQuotas.md)*\n8. **Audit logging** of mutating operations when enabled (`ZKAuditProvider`). *Symptom*: a mutating op leaves no audit trail when auditing is on. *Severity*: detective control, not preventive. *(code: `audit/`)*\n\n> Resource note: ZooKeeper makes **no general guarantee of bounded total memory or CPU under adversarial request mixes** beyond the per-request `jute.maxbuffer` cap and connection limits. The data tree is held **in memory**; aggregate size is bounded by quotas only if hard quotas are set. See \u00a79. *(inferred \u2014 \u00a714)*\n\n## \u00a79 Security properties the project does *NOT* provide\n\nThis is the highest-value section for an integrator. Stated plainly:\n\n- **No secure-by-default posture.** Out of the box, with `OPEN_ACL_UNSAFE` nodes and no auth provider, **any client that reaches the port can read and write the entire data tree.** Authentication and ACLs are **opt-in**. *(documented/code)*\n- **No on-the-wire encryption by default.** Client and quorum traffic are plaintext unless TLS is explicitly configured. `digest` credentials travel **in cleartext**. *(documented)*\n- **No quorum authentication by default.** Peer and leader-election traffic is unauthenticated unless quorum SASL or quorum TLS is enabled. *(documented/code)*\n- **No at-rest encryption.** Snapshots and transaction logs store znode data (and digest ACL hashes) unencrypted. *(inferred \u2014 \u00a714)*\n- **No Byzantine fault tolerance.** A malicious peer holding a valid identity, or a majority of faulty peers, can violate safety. 
*(inferred \u2014 \u00a714)*\n- **No defense against a malicious/over-privileged client beyond ACLs.** ZooKeeper will not protect node A from client B if B holds (or can spoof) the credentials A's ACL trusts.\n- **No general resource-exhaustion guarantee.** Watches, connections, and tree size driven by clients can pressure heap/CPU; only per-request size, per-IP connection count, and (opt-in) hard quotas bound this. Expensive 4lw (`wchc`/`wchp`/`dump`) and large watch sets are a known DoS vector (CVE-2017-5637). *(documented/inferred \u2014 \u00a714)*\n\n**False friends (features mistaken for security primitives):**\n\n- **`ip` authentication scheme** \u2014 looks like authentication; it is **source-IP matching**, trivially spoofable on an untrusted network, and over the AdminServer HTTP path it trusts a client-supplied `X-Forwarded-For` header. Not an authentication boundary across untrusted networks. *(code; CVE-2024-51504)*\n- **`digest` scheme** \u2014 looks like password auth; the password is sent **cleartext** and stored as an **unsalted SHA1** hash. Confidentiality is entirely the transport's job. *(documented)*\n- **ACLs are NOT recursive** \u2014 setting a restrictive ACL on `/app` does **not** protect `/app/status`; a world-readable child stays world-readable. An integrator who \"locks down a subtree\" by ACL-ing the root has not locked anything below it. *(documented)*\n- **`world:anyone`** \u2014 the conventional default `Id`; means *no* access control.\n- **Quotas are not a security boundary by default** \u2014 soft quota only logs; it does not stop a tenant from filling memory. *(documented)*\n- **The `auth` scheme** in `setACL` grants to *whatever the setter authenticated as*, not a named principal \u2014 easy to mis-reason about. 
*(documented)*\n\n**Well-known attack classes left to the operator/integrator** (one line each, integrator-on-notice):\n\n- **DoS via expensive operations / large watch sets / connection floods** \u2014 mitigate with a minimal 4lw whitelist, `maxClientCnxns`, quotas, firewalling.\n- **Information disclosure via 4lw and AdminServer** (`envi`, `conf`, `cons`, `dump`, `mntr`) \u2014 unauthenticated when reachable.\n- **MITM on plaintext channels** \u2014 mitigate with TLS + hostname verification.\n- **Credential theft on plaintext `digest`** \u2014 mitigate with TLS.\n- **Reconfig abuse** (adding a rogue server / removing quorum members) when `reconfigEnabled=true` and ACLs/auth are weak. *(documented: zookeeperReconfig.md)*\n\n## \u00a710 Downstream responsibilities (the operator's contract)\n\nFor ZooKeeper, the \"user\" is the **operator/deployer**. To make the \u00a75\u2013\u00a77 assumptions hold:\n\n1. **Deploy inside a trusted, firewalled network.** Do not expose client (2181), quorum (2888), election (3888), admin (8080), JMX, or metrics ports to untrusted networks. *(documented)*\n2. **If clients are not fully trusted**, set meaningful ACLs (not `OPEN_ACL_UNSAFE`) **and** enable authentication (`sessionRequireClientSASLAuth` or `enforce.auth.*`). ACLs alone on `OPEN_ACL_UNSAFE` trees protect nothing.\n3. **Set ACLs on every sensitive node**, including children \u2014 ACLs do not inherit.\n4. **Use TLS** (`secureClientPort` / `sslQuorum`) whenever traffic crosses a link you do not fully trust; never disable hostname verification in production. Use `digest` only over TLS or localhost.\n5. **Enable quorum authentication** (quorum SASL and/or quorum TLS) \u2014 it is off by default.\n6. **Restrict the AdminServer**: disable it (`admin.enableServer=false`) if unused, or bind/firewall it and require client auth + HTTPS; assume most commands are unauthenticated otherwise.\n7. **Keep the 4lw whitelist minimal**; do not set `*` on an exposed instance.\n8. 
**Leave `skipACL=no`** in any environment where the client port is reachable by untrusted parties.\n9. **Protect `zoo.cfg`, JAAS files, keystores, super-user digests, and the snapshot/txn-log directory** at the OS level \u2014 these are the operator-trusted inputs and the unencrypted data at rest.\n10. **Guard `reconfigEnabled`** and the `/zookeeper/config` write ACL; keep reconfig off unless needed and authenticated.\n\n## \u00a711 Known misuse patterns\n\n- **Treating ZooKeeper as internet-facing.** Binding to a public interface with default config exposes the whole data tree. *What it looks like*: 2181 open to the world. *Why unsafe*: no auth by default. *Instead*: firewall + auth + TLS.\n- **`OPEN_ACL_UNSAFE` in production multi-tenant trees.** *Why unsafe*: any client can mutate/delete. *Instead*: per-node ACLs + auth.\n- **ACL-ing only the root of a subtree.** Relying on a parent ACL to protect children. *Instead*: set ACLs on each node; ACLs are not recursive.\n- **Relying on the `ip` scheme across untrusted networks**, or behind a proxy that sets `X-Forwarded-For`. *Instead*: SASL/x509.\n- **`digest` over plaintext.** Leaks credentials. *Instead*: TLS or localhost only.\n- **Leaving quorum auth/TLS off on a shared network.** Allows peer impersonation / reconfig abuse. *Instead*: quorum SASL + TLS.\n- **Setting `4lw.commands.whitelist=*` or exposing the AdminServer** for convenience. Leaks config/session/watch data. *Instead*: minimal whitelist, restricted admin port.\n- **`skipACL=yes` for throughput.** Opens full data-tree access (and unauthenticated reconfig). 
*Instead*: never with an untrusted client port.\n\n## \u00a711a Known non-findings (recurring false positives)\n\nPatterns that tools and researchers repeatedly report but that are **not** bugs under this model (each entry cites the section that licenses its disposition):\n\n- **\"Anyone can read/write \u2014 no auth!\"** against a default config \u2014 by design; auth/ACLs are opt-in per \u00a79, and exposure presupposes the operator violated \u00a710.1. \u2192 `BY-DESIGN` / `OUT-OF-MODEL: trusted-input` depending on framing (PMC to set, \u00a714).\n- **\"`world:anyone` ACL is insecure.\"** It is the documented open ACL; using it is a \u00a711 misuse, not a server defect. \u2192 `KNOWN-NON-FINDING` (\u00a79).\n- **\"Child znode readable despite restrictive parent ACL.\"** ACLs are not recursive by design. \u2192 `BY-DESIGN` (\u00a79). *(documented)*\n- **\"`digest` password sent in cleartext.\"** Documented; transport security is the caller's job. \u2192 `BY-DESIGN` (\u00a79). *(documented)*\n- **\"4lw / AdminServer command leaks environment/config.\"** Whitelisted/enabled by the operator; exposure presupposes \u00a710 was violated. \u2192 `OUT-OF-MODEL: trusted-input` or `KNOWN-NON-FINDING` (PMC, \u00a714).\n- **\"Quorum traffic unauthenticated/plaintext.\"** Default posture; trusted-network assumption (\u00a75). Whether reportable depends on the \u00a714 quorum-default ruling.\n- **\"Unsalted SHA1 in digest hash.\"** Known/documented limitation; \u2192 `BY-DESIGN` pending the planned algorithm migration. *(documented)*\n\n> Note: this list assumes the PMC affirms the \"trusted network / opt-in auth\" posture. If instead the PMC declares secure-by-default the supported posture, several of these flip to `VALID`. **This single ruling reshapes \u00a78/\u00a79/\u00a711a/\u00a713 \u2014 it is the top \u00a714 question.**\n\n## \u00a712 Conditions that would change this model\n\n- A new public API, request type, or wire-format change.\n- A new network surface (new listener/port) or a change in a default (e.g. 
another `fips-mode`-style flip, enabling auth by default, AdminServer auth-by-default).\n- Promotion of a `contrib/` or `recipes/` component into supported core.\n- Adoption of at-rest encryption, BFT, or secure-by-default \u2014 any of which would rewrite \u00a78/\u00a79.\n- **Evidence of incompleteness**: any report that cannot be routed to exactly one \u00a713 disposition is a `MODEL-GAP` and triggers a revision (add the property to \u00a78/\u00a79), not an ad-hoc call.\n\n## \u00a713 Triage dispositions (closed set)\n\n| Disposition | Meaning | Licensed by |\n| --- | --- | --- |\n| `VALID` | Violates a claimed property via an in-scope adversary and input. | \u00a78, \u00a76, \u00a77 |\n| `VALID-HARDENING` | No \u00a78 property violated, but a \u00a711 misuse is easy enough to warrant hardening. Private report, fixed at maintainer discretion, usually no CVE. | \u00a711 |\n| `OUT-OF-MODEL: trusted-input` | Requires attacker control of an input the model marks trusted (e.g. `zoo.cfg`, operator sysprops, a node the report assumes should be auth'd but the operator left open). | \u00a76 |\n| `OUT-OF-MODEL: adversary-not-in-scope` | Requires a capability the model excludes (operator privilege, on-quorum-network when quorum-off is supported, side channel, on-disk). | \u00a77 |\n| `OUT-OF-MODEL: unsupported-component` | Lands in `contrib/`, `recipes/`, or other \u00a73 code. | \u00a73 |\n| `OUT-OF-MODEL: non-default-build` | Manifests only under a discouraged/non-default \u00a75a knob the PMC ruled dev-only. | \u00a75a |\n| `BY-DESIGN: property-disclaimed` | Concerns a \u00a79-disclaimed property (non-recursive ACLs, cleartext digest, no at-rest encryption, etc.). | \u00a79 |\n| `KNOWN-NON-FINDING` | Matches a \u00a711a recurring false positive. | \u00a711a |\n| `MODEL-GAP` | Cannot be cleanly routed above \u2192 revise the model. 
| triggers \u00a712 |\n\n## \u00a714 Open questions for the maintainers\n\nGrouped in waves; each states a proposed answer to confirm/correct. **Wave 1 reshapes the most.**\n\n**Wave 1 \u2014 the secure-by-default question (reshapes \u00a73, \u00a77, \u00a78, \u00a79, \u00a711a, \u00a713):**\n1. *Proposed:* \"The supported production posture is **opt-in security on a trusted, firewalled network**: default open ACLs / no client auth / plaintext / quorum-auth-off are *supported defaults*, and a report that merely observes them (without the operator following \u00a710) is `OUT-OF-MODEL: trusted-input`, not `VALID`.\" Confirm, or declare secure-by-default the supported posture. (Lands: \u00a73, \u00a77, \u00a79, \u00a711a.)\n2. *Proposed:* \"An attacker on the **quorum/election network while quorum auth is off** is **out of scope** (trusted-network assumption).\" Confirm, or rule quorum-off a dev-only posture (then peer-impersonation reports become `VALID`). (Lands: \u00a75a, \u00a77, \u00a713.)\n3. *Proposed:* \"`skipACL=yes` is **dev/benchmark-only**; reports requiring it are `OUT-OF-MODEL: non-default-build`.\" Confirm. (Lands: \u00a75a, \u00a710, \u00a713.)\n\n**Wave 2 \u2014 scope of shipped code:**\n4. *Proposed:* \"`zookeeper-contrib/*` and `zookeeper-recipes/*` are **out of the core security model** (separately authored / sample code); findings are `OUT-OF-MODEL: unsupported-component`.\" Confirm per-component, especially the **REST gateway** and **ZooInspector**. (Lands: \u00a72, \u00a73.)\n5. *Proposed:* \"The **C client library** is in core scope, but the **C CLI shell (`cli_st`)** is an operator tool, not a server surface.\" Confirm. (Lands: \u00a72, \u00a73.)\n\n**Wave 3 \u2014 environment / negative claims (mostly `(inferred)`):**\n6. *Proposed:* \"ZooKeeper performs **no at-rest encryption**; protecting the snapshot/txn-log directory is an operator responsibility.\" Confirm. (Lands: \u00a75, \u00a79, \u00a710.)\n7. 
*Proposed:* \"Request handling **never spawns child processes / shells out**.\" Confirm. (Lands: \u00a75.)\n8. *Proposed:* \"ZAB assumes **crash-fault, non-Byzantine** peers; safety holds with a strict honest majority (`> n/2` honest).\" Confirm the threshold wording for \u00a77/\u00a78. (Lands: \u00a77, \u00a78.)\n\n**Wave 4 \u2014 property severities and DoS line:**\n9. *Proposed:* \"There is **no general bounded-memory/CPU guarantee** under adversarial request mixes beyond `jute.maxbuffer` + `maxClientCnxns` + opt-in hard quotas; a hang or super-linear blowup driven by a single bounded request *is* a bug, but heap pressure from many legitimate-looking requests/watches is an operational concern, not a CVE.\" Confirm the DoS line for \u00a78/\u00a79/\u00a713. (Lands: \u00a78, \u00a79.)\n10. *Proposed:* \"Replicated-state **safety violations (forks, lost committed writes)** are security-critical; **liveness/availability** degradation under load is correctness/operational.\" Confirm severity tiers. (Lands: \u00a78.)\n\n**Wave 5 \u2014 meta:**\n11. Where should this document live and how should it be versioned with releases \u2014 `zookeeper-docs/.../threatModel.md` published on the site, or repo-root draft? Should it be linked from `security.html`?\n12. Is there any existing internal threat-model or security-design note (PMC-private) this should be reconciled against as a `(documented)` source?\n\n## \u00a715 Optional: machine-readable companion\n\nA sidecar (`threat-model.yaml`) for automated triage is **deferred until \u00a714 wave 1 is resolved**, since the disposition of the default-posture findings determines almost every `in_scope` flag. Once ratified, emit: entry points \u2192 per-parameter trust (from \u00a76); component families \u2192 in/out (\u00a72/\u00a73); \u00a75a knobs \u2192 security-relevant/default/stance; \u00a78 properties \u2192 severity + violation symptom; \u00a79 disclaimed + false friends; \u00a711a non-findings; \u00a713 labels. 
The prose remains canonical.\n\n---\n\n### Appendix \u2014 `security.html` / docs back-map (coverage proof)\n\n| Source statement | Lands in |\n| --- | --- |\n| \"report to security@zookeeper.apache.org before disclosing\" | \u00a71 reporting cross-ref |\n| \"expected to operate in a trusted computing environment \u2026 behind a firewall\" | \u00a73, \u00a75, \u00a710.1 |\n| \"ACLs are not recursive\" | \u00a73, \u00a79 false friends |\n| \"no notion of an owner of a znode\" | \u00a73 |\n| digest \"sent in clear text\" / \"plaintext\" | \u00a76, \u00a79 |\n| `skipACL` \"opens up full access to the data tree to everyone\" | \u00a75a, \u00a79, \u00a710 |\n| reconfig \"any unauthenticated users can use reconfig API\" (with skipACL) | \u00a75a, \u00a711 |\n| `4lw.commands.whitelist` default `srvr` | \u00a75a, \u00a76, \u00a711a |\n| `fips-mode` disables quorum hostname verification | \u00a75, \u00a75a |\n| quota soft vs hard semantics | \u00a78 |\n| CVE history (IP-auth bypass, SASL quorum bypass, persistent-watcher leak, 4lw DoS, hostname-verification bypass) | informs \u00a77, \u00a79, \u00a711a (patterns, not a CVE list) |", "creation_timestamp": "2026-06-18T19:11:29.000000Z"}