GHSA-89C6-VPCJ-7VJ4

Vulnerability from github – Published: 2026-05-18 20:11 – Updated: 2026-05-18 20:11
Summary
OpenTelemetry eBPF Instrumentation: Unbounded BPF internal metrics replay can exhaust CPU
Details

Summary

OBI replays BPF probe hits into histogram observations by looping once per recorded run count. On busy systems, the run-count delta can become very large, causing the metrics exporter to spend excessive CPU time in a tight loop every collection interval.

Details

The vulnerable loop is in pkg/export/prom/prom_bpf.go. During each metrics tick, OBI iterates over probeMetrics and, for each entry, runs a for range metric.count loop that calls BpfProbeLatency(...) once per recorded hit.
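The shape of that loop can be illustrated with a minimal, self-contained sketch. This is not the OBI source: the probeMetric and histogram types and the replayPerHit function are hypothetical stand-ins, and the classic counting loop is used in place of for range metric.count (an equivalent form on Go 1.22 and later).

```go
package main

import "fmt"

// probeMetric is a hypothetical stand-in for one entry in probeMetrics.
type probeMetric struct {
	name       string
	count      uint64  // probe hits recorded since the previous tick
	avgLatency float64 // average latency per hit, in seconds
}

// histogram is a toy recorder that tracks how many Observe calls it receives.
type histogram struct {
	observations uint64
	sum          float64
}

// Observe records a single value, like a one-value-at-a-time histogram API.
func (h *histogram) Observe(v float64) {
	h.observations++
	h.sum += v
}

// replayPerHit mirrors the pattern described in the advisory: one Observe
// call per recorded hit, so the work grows with the sum of all counts
// rather than with len(metrics).
func replayPerHit(h *histogram, metrics []probeMetric) uint64 {
	var calls uint64
	for _, m := range metrics {
		for i := uint64(0); i < m.count; i++ {
			h.Observe(m.avgLatency)
			calls++
		}
	}
	return calls
}

func main() {
	h := &histogram{}
	metrics := []probeMetric{
		{name: "probe_a", count: 3, avgLatency: 0.001},
		{name: "probe_b", count: 2_000_000, avgLatency: 0.002},
	}
	calls := replayPerHit(h, metrics)
	// Prints: series=2 observe_calls=2000003
	fmt.Printf("series=%d observe_calls=%d\n", len(metrics), calls)
}
```

Two series produce just over two million calls here, which is the mismatch between series count and work that the advisory describes.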

The count comes from calculateStats() in the same file, where deltaCount := bp.runCount - bp.prevRunCount is calculated and returned without any cap before the per-hit replay loop.
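One way to bound the replay is to clamp the delta before it reaches the loop. The sketch below is illustrative only: probeState, maxReplayPerTick, and both functions are hypothetical names, the unsigned field types are an assumption, and the advisory does not say how the fix in 0.9.0 is implemented.

```go
package main

import "fmt"

// maxReplayPerTick is a hypothetical cap; the advisory does not specify one.
const maxReplayPerTick uint64 = 1000

// probeState is a hypothetical stand-in for the per-probe counters.
// The unsigned types are an assumption, not taken from the OBI source.
type probeState struct {
	runCount     uint64
	prevRunCount uint64
}

// deltaUncapped reproduces the arithmetic quoted in the advisory:
// the full difference is returned with no upper bound.
func deltaUncapped(bp probeState) uint64 {
	return bp.runCount - bp.prevRunCount
}

// deltaCapped returns the same difference clamped to limit. It also
// returns 0 if the counter appears to have gone backwards, which would
// otherwise wrap around to a huge value with unsigned arithmetic.
func deltaCapped(bp probeState, limit uint64) uint64 {
	if bp.runCount < bp.prevRunCount {
		return 0
	}
	d := bp.runCount - bp.prevRunCount
	if d > limit {
		return limit
	}
	return d
}

func main() {
	busy := probeState{runCount: 5_000_000, prevRunCount: 1_000}
	// Prints: uncapped=4999000 capped=1000
	fmt.Printf("uncapped=%d capped=%d\n",
		deltaUncapped(busy), deltaCapped(busy, maxReplayPerTick))
}
```

A cap like this trades histogram accuracy during spikes for a fixed upper bound on per-tick work, so the right limit depends on how the metric is consumed.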

If probe activity spikes between scrape intervals, deltaCount can be very large. The exporter then spends CPU time proportional to the number of probe hits rather than the number of metric series.
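To make the contrast concrete, the following sketch shows a toy histogram whose update cost is independent of the hit count: it adds n observations with a single bucket lookup. bucketHistogram and ObserveN are hypothetical names introduced for illustration; the advisory does not say whether the real metrics backend offers a bulk update of this kind.

```go
package main

import (
	"fmt"
	"sort"
)

// bucketHistogram is a toy histogram: counts[i] holds observations v with
// v <= bounds[i] (and above the previous bound); the last slot is overflow.
type bucketHistogram struct {
	bounds []float64 // sorted upper bounds
	counts []uint64  // len(bounds)+1 slots
	sum    float64
	total  uint64
}

func newBucketHistogram(bounds []float64) *bucketHistogram {
	return &bucketHistogram{
		bounds: bounds,
		counts: make([]uint64, len(bounds)+1),
	}
}

// ObserveN records n observations of value v using one binary search,
// so the cost depends on the number of buckets, not on n.
func (h *bucketHistogram) ObserveN(v float64, n uint64) {
	// SearchFloat64s returns the first index i with bounds[i] >= v,
	// or len(bounds) when v exceeds every bound (the overflow slot).
	i := sort.SearchFloat64s(h.bounds, v)
	h.counts[i] += n
	h.total += n
	h.sum += v * float64(n)
}

func main() {
	h := newBucketHistogram([]float64{0.001, 0.01, 0.1})
	h.ObserveN(0.002, 2_000_000) // two million hits, one update
	// Prints: counts=[0 2000000 0 0] total=2000000
	fmt.Printf("counts=%v total=%d\n", h.counts, h.total)
}
```

With an update like this, per-tick work is proportional to the number of metric series, which is the scaling the advisory treats as expected.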

PoC

Local testing with a small reproducer confirmed the replay-loop behavior and showed CPU scaling with the recorded hit count rather than the number of metric series.

Use a vulnerable build and enable internal metrics export:

git checkout v0.0.0-rc.1+build
make build
export OTEL_EBPF_INTERNAL_METRICS_PROMETHEUS_PORT=9090
sudo ./bin/obi

Create a high-rate workload that repeatedly exercises traced probes. For example, generate HTTP traffic against an instrumented service:

python3 -m http.server 18081

Then drive it:

seq 1 500000 | xargs -P 128 -I{} curl -s http://127.0.0.1:18081 >/dev/null

At the same time, scrape metrics repeatedly:

while true; do curl -s http://127.0.0.1:9090/metrics >/dev/null; done

On a vulnerable build, OBI CPU consumption rises sharply during the metrics loop because histogram updates are replayed once per counted probe execution. The effect is visible in top or pidstat and is most pronounced under sustained high request volume.

Impact

This is an availability issue in the internal metrics path. Any deployment that enables BPF internal metrics and traces busy workloads is affected. Attackers can indirectly consume CPU in the privileged agent by driving enough activity through instrumented services.


{
  "affected": [
    {
      "package": {
        "ecosystem": "Go",
        "name": "go.opentelemetry.io/obi"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.9.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-45680"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-400",
      "CWE-834"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-05-18T20:11:13Z",
    "nvd_published_at": null,
    "severity": "MODERATE"
  },
  "details": "### Summary\n\nOBI replays BPF probe hits into histogram observations by looping once per recorded run count. On busy systems, the run-count delta can become very large, causing the metrics exporter to spend excessive CPU time in a tight loop every collection interval.\n\n### Details\n\nThe vulnerable loop is in [pkg/export/prom/prom_bpf.go](https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation/blob/4a39d3b307968df4b54e89b8dee297e7d772ca29/pkg/export/prom/prom_bpf.go#L128-L144). During each metrics tick, OBI iterates through `probeMetrics` and then executes `for range metric.count`, invoking `BpfProbeLatency(...)` for each individual recorded hit.\n\nThe count comes from [`calculateStats()`](https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation/blob/4a39d3b307968df4b54e89b8dee297e7d772ca29/pkg/export/prom/prom_bpf.go#L326-L335) in the same file, where `deltaCount := bp.runCount - bp.prevRunCount` is calculated and returned without any cap before the per-hit replay loop.\n\nIf probe activity spikes between scrape intervals, `deltaCount` can be very large. The exporter then spends CPU time proportional to the number of probe hits rather than the number of metric series.\n\n### PoC\n\nLocal testing with a small reproducer confirmed the replay-loop behavior and showed CPU scaling with the recorded hit count rather than the number of metric series.\n\nUse a vulnerable build and enable internal metrics export:\n\n```bash\ngit checkout v0.0.0-rc.1+build\nmake build\nexport OTEL_EBPF_INTERNAL_METRICS_PROMETHEUS_PORT=9090\nsudo ./bin/obi\n```\n\nCreate a high-rate workload that repeatedly exercises traced probes. 
For example, generate HTTP traffic against an instrumented service:\n\n```bash\npython3 -m http.server 18081\n```\n\nThen drive it:\n\n```bash\nseq 1 500000 | xargs -P 128 -I{} curl -s http://127.0.0.1:18081 \u003e/dev/null\n```\n\nAt the same time, scrape metrics repeatedly:\n\n```bash\nwhile true; do curl -s http://127.0.0.1:9090/metrics \u003e/dev/null; done\n```\n\nOn a vulnerable build, OBI CPU consumption rises sharply during the metrics loop because histogram updates are replayed once per counted probe execution. The effect is visible in `top` or `pidstat` and is most pronounced under sustained high request volume.\n\n### Impact\n\nThis is an availability issue in the internal metrics path. Any deployment that enables BPF internal metrics and traces busy workloads is affected. Attackers can indirectly consume CPU in the privileged agent by driving enough activity through instrumented services.",
  "id": "GHSA-89c6-vpcj-7vj4",
  "modified": "2026-05-18T20:11:13Z",
  "published": "2026-05-18T20:11:13Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation/security/advisories/GHSA-89c6-vpcj-7vj4"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "OpenTelemetry eBPF Instrumentation: Unbounded BPF internal metrics replay can exhaust CPU"
}

