fix: gracefully handle ClusterAuthorizationException in quorum info#1808
fix: gracefully handle ClusterAuthorizationException in quorum info#1808wserr wants to merge 20 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughloadQuorumInfo now returns QuorumInfo directly; getStatistics maps quorum results into KRaft, recovers UnsupportedVersionException → ZooKeeperClusterInfo and ClusterAuthorizationException → UnknownClusterInfo; createStats accepts ClusterInfo, sets controller type, and populates quorumInfo only for KRaftClusterInfo. Changes
Sequence Diagram(s)(Skipped — changes are localized service logic and enum update without a multi-component new control-flow requiring visualization.) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hi wserr! 👋
Welcome, and thank you for opening your first PR in the repo!
Please wait for triaging by our maintainers.
Please take a look at our contributing guide.
|
I rebased the original PR #1679 onto the latest FYI - @CoderYellow I cherry-picked your commit |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java`:
- Around line 40-43: The current allowlist QUORUM_INFO_IGNORABLE_EXCEPTIONS in
StatisticsService includes ClusterAuthorizationException which leads the quorum
lookup to return Optional.empty() and then default to ControllerType.ZOOKEEPER
(via the current mapping logic), incorrectly reporting KRaft clusters as
ZOOKEEPER when authorization fails; change the logic so that
ClusterAuthorizationException does not collapse into ZOOKEEPER—either add an
explicit UNKNOWN/UNAVAILABLE enum value to ControllerType and map authorization
failures to that value, or modify the quorum-detection flow to preserve the
lookup failure reason (e.g., return a distinct Optional/Result type or a
separate flag) so QUORUM_INFO_IGNORABLE_EXCEPTIONS (and
ClusterAuthorizationException) are treated as “unavailable/unauthorized” rather
than ZOOKEEPER, and update any callers of the method that assume only
ZOOKEEPER/KRAFT accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d9c0712c-3d75-477a-b774-e09cb91d9fff
📒 Files selected for processing (1)
api/src/main/java/io/kafbat/ui/service/StatisticsService.java
Haarolean
left a comment
There was a problem hiding this comment.
please address coderabbit comments
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java`:
- Line 108: The current mapping always sets .controller(quorumInfo.isPresent() ?
KRAFT : UNAVAILABLE) which conflates UnsupportedVersionException and
ClusterAuthorizationException; update the code that handles quorumInfo
exceptions to branch on the exception type so UnsupportedVersionException maps
to ControllerType.ZOOKEEPER while ClusterAuthorizationException (and other
non-Zookeeper failures) map to ControllerType.UNAVAILABLE, i.e., detect
UnsupportedVersionException and call .controller(ZOOKEEPER) otherwise use
.controller(UNAVAILABLE) so the UI can distinguish KRaft vs ZooKeeper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3a90a3a7-9fae-4069-8056-97e8785d6bef
📒 Files selected for processing (2)
api/src/main/java/io/kafbat/ui/service/StatisticsService.javacontract-typespec/api/clusters.tsp
✅ Files skipped from review due to trivial changes (1)
- contract-typespec/api/clusters.tsp
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
api/src/main/java/io/kafbat/ui/service/StatisticsService.java (3)
45-68: Remove stray;after type declarations.The trailing
;on lines 47, 54, 61, 68 are empty member declarations — legal Java but unusual and inconsistent with the rest of the file. Worth dropping for hygiene.🧹 Proposed cleanup
private sealed interface ClusterInfo { ControllerType getType(); - }; + } private record KRaftClusterInfo(QuorumInfo quorumInfo) implements ClusterInfo { `@Override` public ControllerType getType() { return KRAFT; } - }; + } private record ZooKeeperClusterInfo() implements ClusterInfo { `@Override` public ControllerType getType() { return ZOOKEEPER; } - }; + } private record UnknownClusterInfo() implements ClusterInfo { `@Override` public ControllerType getType() { return UNKNOWN; } - }; + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 45 - 68, The code contains unnecessary trailing semicolons after the sealed interface and record/type declarations (ClusterInfo, KRaftClusterInfo, ZooKeeperClusterInfo, UnknownClusterInfo); remove those stray semicolons to match project style and avoid the odd empty-member declaration (i.e., delete the semicolon after the closing brace of each of the listed declarations).
134-137: Brace style inconsistent with the rest of the file.The rest of
StatisticsService(and the surrounding codebase) uses K&R-style braces; the Allman-style{on its own line here stands out. Move it to the same line as theiffor consistency.🧹 Proposed cleanup
- if (clusterInfo instanceof KRaftClusterInfo(QuorumInfo quorumInfo)) - { + if (clusterInfo instanceof KRaftClusterInfo(QuorumInfo quorumInfo)) { stats.quorumInfo(quorumInfoMapper.toInternalQuorumInfo(quorumInfo)); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 134 - 137, The if-statement in StatisticsService uses Allman-style braces; update the KRaftClusterInfo pattern match block so the opening brace is on the same line as the if (K&R style) to match the file's style—i.e., change the block around the KRaftClusterInfo(QuorumInfo quorumInfo) pattern that calls stats.quorumInfo(quorumInfoMapper.toInternalQuorumInfo(quorumInfo)) so the '{' sits immediately after the closing parenthesis of the if.
108-111:loadQuorumInfois now a trivial passthrough — consider inlining.After this change,
loadQuorumInfo(ac)is equivalent toac.describeMetadataQuorum(). Keeping the indirection adds a hop without any added behavior. Inlining it at the single call site (line 80) would reduce surface area; alternatively, keep it only if you expect to add retry/logging logic here later.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 108 - 111, The private helper loadQuorumInfo(ReactiveAdminClient ac) is a trivial passthrough; replace its single call site that invokes loadQuorumInfo(ac) with ac.describeMetadataQuorum() directly and delete the private static Mono<QuorumInfo> loadQuorumInfo method to remove the unnecessary indirection (or, if you intend to keep it for future behavior, add a comment explaining the planned extension).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java`:
- Line 4: The OpenAPI spec is out of sync with the TypeSpec enum so
ControllerType.UNKNOWN is missing; rebuild the contract artifacts to regenerate
the OpenAPI YAML and Java model classes so the static import
ControllerType.UNKNOWN resolves. Re-run the contract build that generates the
OpenAPI from contract-typespec (ensuring the TypeSpec enum in
contract-typespec/api/clusters.tsp includes UNKNOWN), then regenerate the API
client/models and commit the updated openapi/kafbat-ui-api.yaml and regenerated
model sources so usages of ControllerType.UNKNOWN compile.
---
Nitpick comments:
In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java`:
- Around line 45-68: The code contains unnecessary trailing semicolons after the
sealed interface and record/type declarations (ClusterInfo, KRaftClusterInfo,
ZooKeeperClusterInfo, UnknownClusterInfo); remove those stray semicolons to
match project style and avoid the odd empty-member declaration (i.e., delete the
semicolon after the closing brace of each of the listed declarations).
- Around line 134-137: The if-statement in StatisticsService uses Allman-style
braces; update the KRaftClusterInfo pattern match block so the opening brace is
on the same line as the if (K&R style) to match the file's style—i.e., change
the block around the KRaftClusterInfo(QuorumInfo quorumInfo) pattern that calls
stats.quorumInfo(quorumInfoMapper.toInternalQuorumInfo(quorumInfo)) so the '{'
sits immediately after the closing parenthesis of the if.
- Around line 108-111: The private helper loadQuorumInfo(ReactiveAdminClient ac)
is a trivial passthrough; replace its single call site that invokes
loadQuorumInfo(ac) with ac.describeMetadataQuorum() directly and delete the
private static Mono<QuorumInfo> loadQuorumInfo method to remove the unnecessary
indirection (or, if you intend to keep it for future behavior, add a comment
explaining the planned extension).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 268ba87a-6eb7-4d08-95a4-5001d0b66d09
📒 Files selected for processing (2)
api/src/main/java/io/kafbat/ui/service/StatisticsService.javacontract-typespec/api/clusters.tsp
🚧 Files skipped from review as they are similar to previous changes (1)
- contract-typespec/api/clusters.tsp
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
api/src/main/java/io/kafbat/ui/service/StatisticsService.java (4)
100-109: Simplify with a pattern-matching switch.Since the project already uses record deconstruction patterns (line 137), the
instanceofchain here can be expressed more concisely as a Java 21 switch expression, which also makes the intent (exception-to-ClusterInfo mapping) more self-documenting.♻️ Proposed refactor
`@NotNull` private static Mono<? extends ClusterInfo> handleQuorumInfoErrors(Throwable quorumInfoFetchError) { - if (quorumInfoFetchError instanceof UnsupportedVersionException) { - return Mono.just(new ZooKeeperClusterInfo()); - } - else if (quorumInfoFetchError instanceof ClusterAuthorizationException) { - return Mono.just(new UnknownClusterInfo()); - } - return Mono.error(quorumInfoFetchError); + return switch (quorumInfoFetchError) { + case UnsupportedVersionException ignored -> Mono.just(new ZooKeeperClusterInfo()); + case ClusterAuthorizationException ignored -> Mono.just(new UnknownClusterInfo()); + default -> Mono.error(quorumInfoFetchError); + }; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 100 - 109, Replace the instanceof chain in handleQuorumInfoErrors with a Java 21 pattern-matching switch expression on the Throwable parameter: match UnsupportedVersionException to return Mono.just(new ZooKeeperClusterInfo()), match ClusterAuthorizationException to return Mono.just(new UnknownClusterInfo()), and default to Mono.error(quorumInfoFetchError); this makes the mapping from exception types to ClusterInfo implementations (ZooKeeperClusterInfo, UnknownClusterInfo) more concise and self-documenting.
100-109: Consider logging the authorization failure for observability.When
ClusterAuthorizationExceptionis swallowed intoUnknownClusterInfo, there is no breadcrumb for operators investigating why a cluster's controller type shows asUNKNOWN. Alog.warn(orlog.debugif too noisy) at the recovery site would help diagnose misconfigured ACLs on Confluent Cloud without impacting the graceful-degradation behavior.💡 Example
private static Mono<? extends ClusterInfo> handleQuorumInfoErrors(Throwable quorumInfoFetchError) { if (quorumInfoFetchError instanceof UnsupportedVersionException) { return Mono.just(new ZooKeeperClusterInfo()); } else if (quorumInfoFetchError instanceof ClusterAuthorizationException) { + log.warn("Quorum info lookup not authorized; controller type will be reported as UNKNOWN", quorumInfoFetchError); return Mono.just(new UnknownClusterInfo()); } return Mono.error(quorumInfoFetchError); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 100 - 109, In handleQuorumInfoErrors, add a log statement to record ClusterAuthorizationException before returning UnknownClusterInfo so operators have visibility; specifically, detect when quorumInfoFetchError is an instance of ClusterAuthorizationException (as already checked) and call log.warn (or log.debug if preferred) with a concise message and the exception (e.g., "authorization fetching quorum info, treating as UNKNOWN") before returning Mono.just(new UnknownClusterInfo()); ensure you use the existing logger instance (log) in StatisticsService to avoid introducing new logging utilities.
111-114:loadQuorumInfois now a trivial delegate.After the Optional/exception handling was lifted out, this method is just a rename of
describeMetadataQuorum. It's fine to keep for readability at the call site, but inlining it would also be reasonable now. Not blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 111 - 114, The method loadQuorumInfo in StatisticsService is now a trivial delegate to ReactiveAdminClient.describeMetadataQuorum(); remove this indirection by deleting loadQuorumInfo and updating its callers to call ReactiveAdminClient.describeMetadataQuorum() directly (or, alternatively, keep loadQuorumInfo if you prefer the readability at call sites)—ensure you update all references to loadQuorumInfo to use describeMetadataQuorum() and remove the unused private method.
81-82: Minor: explicit cast can be avoided.The
(ClusterInfo)cast is only needed becausemapinfers the narrowerKRaftClusterInfotype, which then conflicts with theMono<? extends ClusterInfo>returned byonErrorResume. You can drop the cast by supplying the type witness or a typed lambda parameter — purely stylistic.💡 Alternatives
- loadQuorumInfo(ac) - .map(quorumInfo -> (ClusterInfo)new KRaftClusterInfo(quorumInfo)) - .onErrorResume(StatisticsService::handleQuorumInfoErrors))) + loadQuorumInfo(ac) + .<ClusterInfo>map(KRaftClusterInfo::new) + .onErrorResume(StatisticsService::handleQuorumInfoErrors)))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java` around lines 81 - 82, Remove the unnecessary explicit cast by changing the map call to publish the desired target type; for example replace .map(quorumInfo -> (ClusterInfo)new KRaftClusterInfo(quorumInfo)) with .<ClusterInfo>map(quorumInfo -> new KRaftClusterInfo(quorumInfo)) (or use a typed lambda parameter) so the map returns ClusterInfo directly and stays compatible with the subsequent onErrorResume(StatisticsService::handleQuorumInfoErrors); update the invocation around KRaftClusterInfo, map and handleQuorumInfoErrors in StatisticsService accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java`:
- Around line 45-68: Remove the unnecessary trailing semicolons after the type
declarations: delete the extra ";" following the sealed interface ClusterInfo
and after each record declaration KRaftClusterInfo, ZooKeeperClusterInfo, and
UnknownClusterInfo so the declarations end with "}" only; this removes the empty
statements SonarCloud flagged while keeping the interface and record definitions
intact.
---
Nitpick comments:
In `@api/src/main/java/io/kafbat/ui/service/StatisticsService.java`:
- Around line 100-109: Replace the instanceof chain in handleQuorumInfoErrors
with a Java 21 pattern-matching switch expression on the Throwable parameter:
match UnsupportedVersionException to return Mono.just(new
ZooKeeperClusterInfo()), match ClusterAuthorizationException to return
Mono.just(new UnknownClusterInfo()), and default to
Mono.error(quorumInfoFetchError); this makes the mapping from exception types to
ClusterInfo implementations (ZooKeeperClusterInfo, UnknownClusterInfo) more
concise and self-documenting.
- Around line 100-109: In handleQuorumInfoErrors, add a log statement to record
ClusterAuthorizationException before returning UnknownClusterInfo so operators
have visibility; specifically, detect when quorumInfoFetchError is an instance
of ClusterAuthorizationException (as already checked) and call log.warn (or
log.debug if preferred) with a concise message and the exception (e.g.,
"authorization fetching quorum info, treating as UNKNOWN") before returning
Mono.just(new UnknownClusterInfo()); ensure you use the existing logger instance
(log) in StatisticsService to avoid introducing new logging utilities.
- Around line 111-114: The method loadQuorumInfo in StatisticsService is now a
trivial delegate to ReactiveAdminClient.describeMetadataQuorum(); remove this
indirection by deleting loadQuorumInfo and updating its callers to call
ReactiveAdminClient.describeMetadataQuorum() directly (or, alternatively, keep
loadQuorumInfo if you prefer the readability at call sites)—ensure you update
all references to loadQuorumInfo to use describeMetadataQuorum() and remove the
unused private method.
- Around line 81-82: Remove the unnecessary explicit cast by changing the map
call to publish the desired target type; for example replace .map(quorumInfo ->
(ClusterInfo)new KRaftClusterInfo(quorumInfo)) with .<ClusterInfo>map(quorumInfo
-> new KRaftClusterInfo(quorumInfo)) (or use a typed lambda parameter) so the
map returns ClusterInfo directly and stays compatible with the subsequent
onErrorResume(StatisticsService::handleQuorumInfoErrors); update the invocation
around KRaftClusterInfo, map and handleQuorumInfoErrors in StatisticsService
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 77d213df-7bdd-4222-895c-1f05caca631a
📒 Files selected for processing (1)
api/src/main/java/io/kafbat/ui/service/StatisticsService.java
|
@Haarolean the coderabbit remarks have been processed. |
Haarolean
left a comment
There was a problem hiding this comment.
the overall problem with this approach is that you're solving your own problem with confluent clusters w/o considering a broader impact on self-hosted users who would benefit from seeing the actual error (ClusterAuthorizationException due to ACL issues).
I suggest to log a warning on ClusterAuthorizationExceptions
|
@germanosin @Haarolean is there another review required before completing? |
|
Do you have an idea when this could be merged? |
|
@Haarolean @germanosin is there anything still required from my side? |
pending German's review |
|
Hi @germanosin could you please have a look again? |
|
@germanosin can you check this again, please? |
|
Looks like it wont be merge anytime soon |
|
@cemusta Indeed, I would like to know why |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
@Haarolean are you planning on merging this PR soon if I fix the code coverage issues? |
|
@wserr i don't think coverage is a blocker here, it's already in a queue to be merged. Thanks for your contribution! |
closes #1672
What changes did you make? (Give an overview)
Is there anything you'd like reviewers to focus on?
How Has This Been Tested? (put an "x" (case-sensitive!) next to an item)
We spun up this fix locally against a confluent cloud cluster and we verified that we can now connect to the cluster
Checklist (put an "x" (case-sensitive!) next to all the items, otherwise the build will fail)
Check out Contributing and Code of Conduct
A picture of a cute animal (not mandatory but encouraged)
Summary by CodeRabbit
New Features
Bug Fixes
Refactor