Optimize Kafka topic filtering for large clusters (370x+ speedup) #1866
andpol wants to merge 1 commit into
📝 Walkthrough
ScrapedClusterState refactors how it builds per-topic state maps by replacing per-topic filtering and Optional-wrapped handling with a new […]
Changes
ScrapedClusterState refactoring with groupByTopic helper
Hi andpol! 👋
Welcome, and thank you for opening your first PR in the repo!
Please wait for triaging by our maintainers.
Please take a look at our contributing guide.
What changes did you make? (Give an overview)
Fixes #1776
topicStateMap called filterTopic once per topic for each offsets/stats map, and each call scanned all P_total partitions in the cluster, for a total of O(T * P_total) work per scrape across T topics. On large clusters this was the CPU hotspot behind the slow UI.
Fix: group each cluster-wide map by topic once (O(P_total) total), then do O(1) lookups in the per-topic loop.
Measured speedup (partitions = 10 * topics, median per call):
1K topics: 373ms -> 1ms (~370x)
3K topics: 3.2s -> 4ms (~800x)
10K topics: 61s -> 14ms (~4400x)
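The group-once, look-up-many idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `TopicPartition` record is a dependency-free stand-in for Kafka's `org.apache.kafka.common.TopicPartition`, and the `groupByTopic` signature, the `GroupByTopicSketch` class, and the sample topic names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByTopicSketch {

  // Hypothetical stand-in for org.apache.kafka.common.TopicPartition,
  // used so that the sketch compiles without the Kafka client library.
  record TopicPartition(String topic, int partition) {}

  // Single pass over the cluster-wide map: O(P_total) work in total.
  // The signature is an assumption; the PR's actual helper may differ.
  static <V> Map<String, Map<Integer, V>> groupByTopic(Map<TopicPartition, V> byPartition) {
    Map<String, Map<Integer, V>> grouped = new HashMap<>();
    byPartition.forEach((tp, value) ->
        grouped.computeIfAbsent(tp.topic(), t -> new HashMap<>())
            .put(tp.partition(), value));
    return grouped;
  }

  public static void main(String[] args) {
    // Illustrative data only, not taken from a real cluster.
    Map<TopicPartition, Long> endOffsets = Map.of(
        new TopicPartition("orders", 0), 120L,
        new TopicPartition("orders", 1), 80L,
        new TopicPartition("payments", 0), 15L);

    // Group once, outside the per-topic loop.
    Map<String, Map<Integer, Long>> byTopic = groupByTopic(endOffsets);

    // Each iteration is now an expected O(1) hash lookup instead of a full scan.
    for (String topic : List.of("orders", "payments", "no-offsets-yet")) {
      Map<Integer, Long> offsets = byTopic.getOrDefault(topic, Map.of());
      System.out.println(topic + " has offsets for " + offsets.size() + " partition(s)");
    }
  }
}
```

Using `getOrDefault` with an empty map means a topic with no entries in the cluster-wide map yields an empty result without any Optional wrapping, which seems to match the direction described in the walkthrough.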
Is there anything you'd like reviewers to focus on?
Review correctness of changes, and effectiveness of tests, as I'm not very familiar with Kafka UI code.
How Has This Been Tested? (put an "x" (case-sensitive!) next to an item)
api/src/test/java/io/kafbat/ui/service/metrics/scrape/ScrapedClusterStatePerfTest.java if you want to run). See results above.
Checklist (put an "x" (case-sensitive!) next to all the items, otherwise the build will fail)
Check out Contributing and Code of Conduct
A picture of a cute animal (not mandatory but encouraged)