Optimize Kafka topic filtering for large clusters (370x+ speedup) by andpol · Pull Request #1866 · kafbat/kafka-ui

andpol · 2026-05-29T21:24:18Z

Breaking change? (if so, please describe the impact and migration path for existing application instances)

What changes did you make? (Give an overview)

topicStateMap called filterTopic once per topic for each offsets/stats map, and each call did an O(P_total) scan, for a total of O(T * P_total) per scrape (T = number of topics, P_total = total number of partitions in the cluster). On large clusters this was the CPU hotspot behind the slow UI.

Fix: group each cluster-wide map by topic once (O(P_total) total), then do O(1) lookups in the per-topic loop.
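
The grouping step described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the class name `GroupByTopicSketch`, the local `TopicPartition` record (a stand-in for Kafka's `org.apache.kafka.common.TopicPartition`), and the bodies of `filterTopic` and `groupByTopic` are assumptions made so the example compiles on its own.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByTopicSketch {

  // Hypothetical stand-in for Kafka's TopicPartition, so the sketch
  // compiles without the Kafka client on the classpath.
  record TopicPartition(String topic, int partition) {}

  // Assumed shape of the old approach: every call scans the whole
  // cluster-wide map, so calling it once per topic costs O(T * P_total).
  static <T> Map<Integer, T> filterTopic(String topic, Map<TopicPartition, T> all) {
    return all.entrySet().stream()
        .filter(e -> e.getKey().topic().equals(topic))
        .collect(Collectors.toMap(e -> e.getKey().partition(), Map.Entry::getValue));
  }

  // Assumed shape of the new approach: one pass over the cluster-wide map
  // builds topic -> partition -> value, so later lookups are O(1) per topic.
  static <T> Map<String, Map<Integer, T>> groupByTopic(Map<TopicPartition, T> all) {
    Map<String, Map<Integer, T>> grouped = new HashMap<>();
    all.forEach((tp, value) ->
        grouped.computeIfAbsent(tp.topic(), t -> new HashMap<>()).put(tp.partition(), value));
    return grouped;
  }

  public static void main(String[] args) {
    Map<TopicPartition, Long> endOffsets = Map.of(
        new TopicPartition("orders", 0), 100L,
        new TopicPartition("orders", 1), 250L,
        new TopicPartition("payments", 0), 7L);

    // Group once, outside the per-topic loop.
    Map<String, Map<Integer, Long>> grouped = groupByTopic(endOffsets);

    for (String topic : List.of("orders", "payments", "missing")) {
      Map<Integer, Long> fast = grouped.getOrDefault(topic, Map.of());
      Map<Integer, Long> slow = filterTopic(topic, endOffsets);
      System.out.println(topic + " -> " + fast + " (matches scan: " + fast.equals(slow) + ")");
    }
  }
}
```

The `getOrDefault(topic, Map.of())` call keeps topics with no entries in the cluster-wide map on the same code path as the old filter, which would have produced an empty result for them.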

Measured speedup (partitions = 10 * topics, median per call):
1K topics: 373ms -> 1ms (~370x)
3K topics: 3.2s -> 4ms (~800x)
10K topics: 61s -> 14ms (~4400x)
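
For readers who want to reproduce the shape of this measurement without the Kafka UI code base, here is a rough, self-contained sketch. It is not the attached ScrapedClusterStatePerfTest.java: the class `TopicLookupBenchSketch`, the stand-in `TopicPartition` record, and all method names are hypothetical, and a plain `System.nanoTime()` loop is only a coarse approximation of a proper benchmark harness.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class TopicLookupBenchSketch {

  // Hypothetical stand-in for Kafka's TopicPartition.
  record TopicPartition(String topic, int partition) {}

  // Synthetic cluster-wide map: `topics` topics, `partitionsPerTopic` partitions each.
  static Map<TopicPartition, Long> buildOffsets(int topics, int partitionsPerTopic) {
    Map<TopicPartition, Long> offsets = new HashMap<>();
    for (int t = 0; t < topics; t++) {
      for (int p = 0; p < partitionsPerTopic; p++) {
        offsets.put(new TopicPartition("topic-" + t, p), (long) p);
      }
    }
    return offsets;
  }

  // Full scan of the map for every topic: O(T * P_total) overall.
  static long scanPerTopic(int topics, Map<TopicPartition, Long> offsets) {
    long seen = 0;
    for (int t = 0; t < topics; t++) {
      String topic = "topic-" + t;
      for (TopicPartition tp : offsets.keySet()) {
        if (tp.topic().equals(topic)) {
          seen++;
        }
      }
    }
    return seen;
  }

  // Group once, then one hash lookup per topic: O(P_total) overall.
  static long groupThenLookup(int topics, Map<TopicPartition, Long> offsets) {
    Map<String, Map<Integer, Long>> grouped = new HashMap<>();
    offsets.forEach((tp, v) ->
        grouped.computeIfAbsent(tp.topic(), k -> new HashMap<>()).put(tp.partition(), v));
    long seen = 0;
    for (int t = 0; t < topics; t++) {
      seen += grouped.getOrDefault("topic-" + t, Map.of()).size();
    }
    return seen;
  }

  // Median wall-clock time over `runs` calls, in nanoseconds
  // (the upper middle sample when `runs` is even).
  static long medianNanos(int runs, Supplier<Long> call) {
    long[] samples = new long[runs];
    for (int i = 0; i < runs; i++) {
      long start = System.nanoTime();
      call.get();
      samples[i] = System.nanoTime() - start;
    }
    Arrays.sort(samples);
    return samples[runs / 2];
  }

  public static void main(String[] args) {
    int topics = 1_000; // partitions = 10 * topics, as in the figures above
    Map<TopicPartition, Long> offsets = buildOffsets(topics, 10);
    long slow = medianNanos(5, () -> scanPerTopic(topics, offsets));
    long fast = medianNanos(5, () -> groupThenLookup(topics, offsets));
    System.out.printf("scan per topic: %d us, group then lookup: %d us%n",
        slow / 1_000, fast / 1_000);
  }
}
```

Absolute timings from a loop like this depend on JIT warm-up and hardware, so they should not be expected to match the numbers reported above; only the growth pattern (quadratic versus linear in cluster size) is meant to carry over.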

Is there anything you'd like reviewers to focus on?

Review correctness of changes, and effectiveness of tests, as I'm not very familiar with Kafka UI code.

How Has This Been Tested? (put an "x" (case-sensitive!) next to an item)

No need to
Manually (please, describe, if necessary) - I deployed to our staging environment. The 1.5.0 release of Kafka UI is very obviously much slower at opening the topics page (and other things). I also ran the attached (but not committed) benchmark ScrapedClusterStatePerfTest.java; to run it, place it at api/src/test/java/io/kafbat/ui/service/metrics/scrape/ScrapedClusterStatePerfTest.java. See results above.
Unit checks - new tests to validate correctness of the touched functions. They pass both before and after my changes.
Integration checks
Covered by existing automation

Checklist (put an "x" (case-sensitive!) next to all the items, otherwise the build will fail)

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (e.g. ENVIRONMENT VARIABLES)
[x2] My changes generate no new warnings (e.g. Sonar is happy)
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
Any dependent changes have been merged

Check out Contributing and Code of Conduct

A picture of a cute animal (not mandatory but encouraged)

Summary by CodeRabbit

Refactor
- Optimized internal metrics data processing logic for improved efficiency
Tests
- Expanded test coverage for metrics scraping and state validation to ensure reliability

Fixes kafbat#1776

kapybro · 2026-05-29T21:24:27Z

AI Summary

This pull request addresses a performance bottleneck in Kafka UI: topicStateMap ran an O(P_total) scan for each topic, O(T * P_total) in total per scrape, causing slow UI responses on large Kafka clusters. The fix groups each cluster-wide map by topic once (O(P_total) total) and then uses O(1) lookups, resulting in significant speedups (e.g., ~370x for 1K topics, ~4400x for 10K topics). The change was tested manually in staging and with unit tests, and reviewers are asked to verify correctness and test effectiveness.

coderabbitai · 2026-05-29T21:24:33Z

Walkthrough

ScrapedClusterState refactors how it builds per-topic state maps by replacing per-topic filtering and Optional-wrapped handling with a new groupByTopic helper that converts TopicPartition → value maps into nested topic → partition → value structures. The topicStateMap method visibility changes to package-private, the unused Optional import is removed, and test coverage validates the refactored behavior end-to-end.

Changes

ScrapedClusterState refactoring with groupByTopic helper

Layer / File(s)	Summary
groupByTopic helper foundation `api/src/main/java/io/kafbat/ui/service/metrics/scrape/ScrapedClusterState.java`	New `groupByTopic(Map<TopicPartition, T>)` generic utility converts `TopicPartition`-keyed maps into nested `topic -> partition -> value` structure for reuse across offset and stats lookups.
topicStateMap refactoring and cleanup `api/src/main/java/io/kafbat/ui/service/metrics/scrape/ScrapedClusterState.java`	`topicStateMap(...)` visibility changes to package-private `static`, unused `java.util.Optional` import is removed, and method implementation refactored to use grouped lookup maps via `groupByTopic` instead of per-topic filtering and `Optional`-wrapped partition stats.
Test coverage for refactored topicStateMap `api/src/test/java/io/kafbat/ui/service/metrics/scrape/ScrapedClusterStateTest.java`	New test `topicStateMapGroupsOffsetsAndStatsPerTopic` builds synthetic partition/topic stats and metadata, invokes the refactored `topicStateMap`, and validates returned topic-state map contains expected keys with correct descriptions, configs, start/end offsets, segment stats, and partition-to-segment mappings. Adds reflection-based test helpers to construct `InternalLogDirStats` and a `TopicDescription` factory.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

type/enhancement, scope/backend, status/triage/completed, area/internal

Poem

A rabbit refactors with care,
groupByTopic groups everywhere,
No Optionals linger—
Just nested map fingers,
Per-topic lookups now bright and fair. 🐰✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The PR addresses only a micro-optimization in topic lookup but does not implement the core requirements from issue `#1776`: making metadata/metrics fetching non-blocking, decoupling from request threads, throttling/staggering requests, or providing configurable tuning knobs.	Implement the primary objectives from `#1776`: decouple blocking fetch operations from request-handling threads, add async/reactive patterns or dedicated worker pools, throttle/paginate metadata requests, and provide configurable tuning parameters for large clusters.
Docstring Coverage	⚠️ Warning	Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Out of Scope Changes check	✅ Passed	All changes are scoped to the topic lookup optimization: refactoring ScrapedClusterState to use groupByTopic instead of filterTopic, and corresponding test updates. No unrelated changes detected.
Title check	✅ Passed	The title accurately describes the main optimization refactoring: replacing per-topic filtering loops with grouped map lookups for Kafka topic state processing.


github-actions

Hi andpol! 👋

Welcome, and thank you for opening your first PR in the repo!

Please wait for triaging by our maintainers.

Please take a look at our contributing guide.

andpol requested a review from a team as a code owner May 29, 2026 21:24

kapybro Bot added and removed the status/triage/manual (Manual triage in progress) label May 29, 2026

kapybro Bot changed the title ~~Improve perf w/ large Kafka clusters~~ Speed up UI by optimizing topic lookup in large Kafka clusters May 29, 2026

github-actions Bot reviewed May 29, 2026

View reviewed changes

kapybro Bot added and removed the status/triage/manual (Manual triage in progress) label May 31, 2026

kapybro Bot changed the title ~~Speed up UI by optimizing topic lookup in large Kafka clusters~~ Optimize Kafka topic filtering for large clusters (370x+ speedup) May 31, 2026

kapybro Bot added the labels area/topics, impact/changelog (A PR with changes which should be addressed in the changelog explicitly), scope/backend (Related to backend changes), and type/enhancement (An enhancement/improvement to an already existing feature) May 31, 2026

kafbat deleted a comment from kapybro Bot May 31, 2026

Haarolean added this to the 1.6 milestone May 31, 2026

github-project-automation Bot added this to Release 1.6 May 31, 2026

github-project-automation Bot moved this to Todo in Release 1.6 May 31, 2026

Haarolean requested a review from germanosin June 16, 2026 14:04

Haarolean moved this from Todo to In Review in Release 1.6 Jun 16, 2026

andpol wants to merge 1 commit into kafbat:main from andpol:issues/1776
