[GSOC 2026] Adding logic to validate and generate an error if there are key…#38992
[GSOC 2026] Adding logic to validate and generate an error if there are key…#38992HansMarcus01 wants to merge 3 commits into
Conversation
…s linked to a service account that was not rotated by our system
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the compliance checker script by introducing automated detection of unauthorized service account keys. By validating active IAM keys against the registry of managed secrets, the system can now proactively identify and flag rogue keys, providing a critical security layer for the infrastructure. Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to detect rogue service account keys by comparing user-managed keys in IAM against legal keys stored in Secret Manager. It adds helper methods to retrieve keys from both sources and integrates this check into the compliance workflow. Additionally, exception logging in the main entry point is improved to include stack traces. A critical bug was identified in _get_legal_keys_from_secret_manager where the account_email parameter is unused, and self.secret_id is incorrectly used instead, causing the same secret to be queried for all service accounts. A code suggestion is provided to fix this by using the correct secret name parameter.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def _get_legal_keys_from_secret_manager(self, account_email: str) -> List[str]: | ||
| """ | ||
| Retrieves the list of legal keys for a given service account from Secret Manager. | ||
|
|
||
| Args: | ||
| account_email (str): The email of the service account to retrieve keys for. | ||
|
|
||
| Returns: | ||
| List[str]: A list of key IDs for the legal keys associated with the service account. | ||
| """ | ||
| legal_keys = [] | ||
| parent = self.secret_client.secret_path(self.project_id, self.secret_id) | ||
|
|
||
| try: | ||
| versions = self.secret_client.list_secret_versions(request={"parent": parent}) | ||
| for version in versions: | ||
| if version.state.name == secretmanager.SecretVersion.State.ENABLED: | ||
| response = self.secret_client.access_secret_version(request={"name": version.name}) | ||
| data_str = response.payload.data.decode("UTF-8") | ||
| key_id = data_str.split(":",1)[0] | ||
| legal_keys.append(key_id) | ||
| return legal_keys | ||
| except Exception as e: | ||
| self.logger.error(f"Failed to retrieve legal keys from Secret Manager for account '{account_email}': {e}") | ||
| return [] |
There was a problem hiding this comment.
In _get_legal_keys_from_secret_manager, the parameter account_email is defined but never used. Instead, self.secret_id is used to construct the secret path. This is a critical bug because it will query the same secret (whatever self.secret_id is set to) for every service account, rather than querying the specific secret for each service account (e.g., secret_name which is passed to this method).
To fix this, we should rename the parameter to secret_name to match its usage and use it when constructing the secret path.
def _get_legal_keys_from_secret_manager(self, secret_name: str) -> List[str]:
"""
Retrieves the list of legal keys for a given service account from Secret Manager.
Args:
secret_name (str): The name of the secret to retrieve keys for.
Returns:
List[str]: A list of key IDs for the legal keys associated with the service account.
"""
legal_keys = []
parent = self.secret_client.secret_path(self.project_id, secret_name)
try:
versions = self.secret_client.list_secret_versions(request={"parent": parent})
for version in versions:
if version.state.name == secretmanager.SecretVersion.State.ENABLED:
response = self.secret_client.access_secret_version(request={"name": version.name})
data_str = response.payload.data.decode("UTF-8")
key_id = data_str.split(":",1)[0]
legal_keys.append(key_id)
return legal_keys
except Exception as e:
self.logger.error(f"Failed to retrieve legal keys from Secret Manager for secret '{secret_name}': {e}")
return []|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
pabloem
left a comment
There was a problem hiding this comment.
make sure to reference the Beam issue we are trying to fix (what number is it?)
also, show an example of the output from running this, and what are next steps after this PR.
|
|
||
| def _get_legal_keys_from_secret_manager(self, secret_name: str) -> List[str]: | ||
| """ | ||
| Retrieves the list of legal keys for a given service account from Secret Manager. |
There was a problem hiding this comment.
what does legal mean here? Can you define 'lega' or improve the naming?
| legal_keys = self._get_legal_keys_from_secret_manager(secret_name) | ||
| rogue_keys = set(iam_keys) - set(legal_keys) | ||
| for rogue_key in rogue_keys: | ||
| msg = f"SECURITY ALERT: Rogue key '{rogue_key}' detected on account '{service_account}'. this key was created outside the rotation system." |
There was a problem hiding this comment.
let's not call it 'rogue' - let's call it 'unmanaged'. and instead of 'rotation system', let's call it "Beam's service account management system (in infra/keys/)".
…ed by Beam's rotation system
This Pull Request introduces the core detection logic to identify rogue service account keys in the apache-beam-testing infrastructure.
Issue: GSOC-273
Project Details
Currently, it is difficult to determine if a service account key was created manually outside the automated rotation system. To solve this, I have modify a compliance checker script (account_keys.py) that performs the following actions:
State Generation (--action generate): Connects to the GCP IAM API and Secret Manager to document the current live service accounts and managed secrets, updating the local state file (keys.yaml).
Compliance Detection (--action check): Uses set logic to compare the physical keys existing in IAM against the legal, managed keys registered in Secret Manager.
Security Alerting: Successfully identifies any rogue key created outside the system.
This establishes the foundation for the next phase, which will involve consolidating these alerts into a single actionable GitHub issue to prevent alert fatigue.
Next Steps
Automate execution using GitHub Actions: Set up a scheduled workflow (.github/workflows/service-account-keys.yml) to run this compliance inventory continuously using a cron job. This will generate daily snapshots for historical queries and audits.
expected output
Mentor
@pabloem