Skip to content

BE: fix isolate JMX metrics by label schema to prevent dropped broker me…#1824

Open
bantnd wants to merge 1 commit into
kafbat:mainfrom
bantnd:bugfix/jmx-broker-metrics
Open

BE: fix isolate JMX metrics by label schema to prevent dropped broker me…#1824
bantnd wants to merge 1 commit into
kafbat:mainfrom
bantnd:bugfix/jmx-broker-metrics

Conversation

@bantnd

@bantnd bantnd commented Apr 30, 2026

Copy link
Copy Markdown
  • Breaking change? (if so, please describe the impact and migration path for existing application instances)

What changes did you make? (Give an overview)
Modified RawMetric.groupIntoSnapshot to group metrics using a composite key of sanitizedName + ":" + labels. By doing this, metrics with the same name but different label dimensions (e.g., broker-level vs. topic-level) are placed into separate, dedicated Gauge instances. This prevents topic-level metrics from overwriting and dropping broker-level metrics when they share the same metric name, successfully fixing the 0 Bytes issue on the dashboard.
Is there anything you'd like reviewers to focus on?
Just the mapping logic in RawMetric.java.
How Has This Been Tested? (put an "x" (case-sensitive!) next to an item)

  • No need to
  • Manually (please, describe, if necessary)
  • Unit checks
  • Integration checks
  • Covered by existing automation

Checklist (put an "x" (case-sensitive!) next to all the items, otherwise the build will fail)

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. ENVIRONMENT VARIABLES)
  • My changes generate no new warnings (e.g. Sonar is happy)
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged
    Check out Contributing and Code of Conduct

A picture of a cute animal (not mandatory but encouraged)

Summary by CodeRabbit

Bug Fixes

  • Enhanced Prometheus metrics handling to correctly process metrics with varying label configurations and ensure accurate gauge assignment.

…trics

Modified RawMetric.groupIntoSnapshot to use a composite key of name + labels. This prevents topic-level metrics (2 labels) from overwriting and dropping broker-level metrics (1 label) when they share the same metric name, fixing the 0 Bytes issue on the dashboard.
@bantnd bantnd requested a review from a team as a code owner April 30, 2026 09:33
@kapybro kapybro Bot added status/triage Issues pending maintainers triage status/triage/manual Manual triage in progress status/triage/completed Automatic triage completed and removed status/triage Issues pending maintainers triage labels Apr 30, 2026
@coderabbitai

coderabbitai Bot commented Apr 30, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

The metric snapshot grouping now uses a composite key combining sanitized metric name and label schema to index Prometheus Gauge instances. The label-schema mismatch check has been removed, allowing label-value assignment to proceed unconditionally for distinct gauges per metric name and label schema combination.

Changes

Cohort / File(s) Summary
Prometheus Gauge Grouping
api/src/main/java/io/kafbat/ui/service/metrics/RawMetric.java
Changed snapshot grouping to use composite key (metric name + label schema) instead of metric name alone; removed label-schema mismatch prevention check allowing unconditional label-value assignment for distinct gauges per (name, schema) combination.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

scope/backend, type/bug, area/internal

Suggested reviewers

  • Haarolean

Poem

🐰 The metrics now hop with grace,
Composite keys find their place,
No schema clashes block the way,
Each gauge shines bright today! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the main change: fixing JMX metrics isolation by label schema to prevent dropped broker metrics, which matches the core objective of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi bantnd! 👋

Welcome, and thank you for opening your first PR in the repo!

Please wait for triaging by our maintainers.

Please take a look at our contributing guide.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
api/src/main/java/io/kafbat/ui/service/metrics/RawMetric.java (1)

29-36: ⚡ Quick win

Make label schema derivation deterministic before building mapKey.

On Line 40, the schema key depends on m.labels() iteration order. Two logically identical label sets with different map order can produce different mapKey values and get split into separate gauges. Consider canonicalizing labels (e.g., sort by sanitized label name) and deriving both lbls and lblVals from that same ordered sequence.

Proposed refactor
+import java.util.Map.Entry;
+import java.util.Comparator;
 ...
-      var lbls = m.labels().keySet()
-          .stream()
-          .map(PrometheusNaming::sanitizeLabelName)
-          .toArray(String[]::new);
-      var lblVals = m.labels().keySet()
-          .stream()
-          .map(l -> m.labels().get(l))
-          .toArray(String[]::new);
+      var orderedLabels = m.labels().entrySet().stream()
+          .map(e -> Map.entry(PrometheusNaming.sanitizeLabelName(e.getKey()), e.getValue()))
+          .sorted(Comparator.comparing(Entry::getKey))
+          .toList();
+      var lbls = orderedLabels.stream().map(Entry::getKey).toArray(String[]::new);
+      var lblVals = orderedLabels.stream().map(Entry::getValue).toArray(String[]::new);
       var sanitizedName = PrometheusNaming.sanitizeMetricName(m.name());

Also applies to: 40-44

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/src/main/java/io/kafbat/ui/service/metrics/RawMetric.java` around lines
29 - 36, The label schema derivation for building mapKey is non-deterministic
because it uses m.labels() iteration order; change the logic in the block that
produces lbls and lblVals so you first derive a single ordered list of label
keys sorted by PrometheusNaming::sanitizeLabelName (e.g., collect
m.labels().keySet(), map to sanitized names paired with original keys, sort by
sanitized name), then build both lbls and lblVals from that same ordered
sequence so that mapKey is stable across different map iteration orders.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@api/src/main/java/io/kafbat/ui/service/metrics/RawMetric.java`:
- Around line 29-36: The label schema derivation for building mapKey is
non-deterministic because it uses m.labels() iteration order; change the logic
in the block that produces lbls and lblVals so you first derive a single ordered
list of label keys sorted by PrometheusNaming::sanitizeLabelName (e.g., collect
m.labels().keySet(), map to sanitized names paired with original keys, sort by
sanitized name), then build both lbls and lblVals from that same ordered
sequence so that mapKey is stable across different map iteration orders.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b9c66a22-7ffc-4ae4-81a3-87ed2ef7c166

📥 Commits

Reviewing files that changed from the base of the PR and between 4bf06dd and eeeaaf4.

📒 Files selected for processing (1)
  • api/src/main/java/io/kafbat/ui/service/metrics/RawMetric.java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/triage/completed Automatic triage completed status/triage/manual Manual triage in progress

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant