Skip to content

[GSOC 2026] Adding logic to validate and generate an error if there are key…#38992

Open
HansMarcus01 wants to merge 3 commits into
apache:masterfrom
HansMarcus01:feat-keys-verification
Open

[GSOC 2026] Adding logic to validate and generate an error if there are key…#38992
HansMarcus01 wants to merge 3 commits into
apache:masterfrom
HansMarcus01:feat-keys-verification

Conversation

@HansMarcus01

@HansMarcus01 HansMarcus01 commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

This Pull Request introduces the core detection logic to identify rogue service account keys in the apache-beam-testing infrastructure.

Issue: GSOC-273

Project Details

Currently, it is difficult to determine if a service account key was created manually outside the automated rotation system. To solve this, I have modify a compliance checker script (account_keys.py) that performs the following actions:

  • State Generation (--action generate): Connects to the GCP IAM API and Secret Manager to document the current live service accounts and managed secrets, updating the local state file (keys.yaml).

  • Compliance Detection (--action check): Uses set logic to compare the physical keys existing in IAM against the legal, managed keys registered in Secret Manager.

  • Security Alerting: Successfully identifies any rogue key created outside the system.

This establishes the foundation for the next phase, which will involve consolidating these alerts into a single actionable GitHub issue to prevent alert fatigue.

Next Steps

  • Automate execution using GitHub Actions: Set up a scheduled workflow (.github/workflows/service-account-keys.yml) to run this compliance inventory continuously using a cron job. This will generate daily snapshots for historical queries and audits.

    • Monitor Rotation Workflows and CI Pipelines: Implement observability for existing GitHub Actions to verify that automated key rotations successfully generated new keys, published them to Secret Manager, and correctly updated the expected environment variables or artifacts.

expected output

image

Mentor

@pabloem

…s linked to a service account that was not rotated by our system
@gemini-code-assist

Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the compliance checker script by introducing automated detection of unauthorized service account keys. By validating active IAM keys against the registry of managed secrets, the system can now proactively identify and flag rogue keys, providing a critical security layer for the infrastructure.

Highlights

  • Rogue Key Detection: Implemented logic to identify and alert on service account keys created outside the automated rotation system by comparing IAM keys against Secret Manager records.
  • IAM and Secret Manager Integration: Added helper methods to fetch user-managed keys from GCP IAM and authorized legal keys from Secret Manager.
  • Error Handling Improvement: Updated the main execution block to use logger.exception for better visibility into failures during script execution.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to detect rogue service account keys by comparing user-managed keys in IAM against legal keys stored in Secret Manager. It adds helper methods to retrieve keys from both sources and integrates this check into the compliance workflow. Additionally, exception logging in the main entry point is improved to include stack traces. A critical bug was identified in _get_legal_keys_from_secret_manager where the account_email parameter is unused, and self.secret_id is incorrectly used instead, causing the same secret to be queried for all service accounts. A code suggestion is provided to fix this by using the correct secret name parameter.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread infra/enforcement/account_keys.py Outdated
Comment on lines +130 to +154
def _get_legal_keys_from_secret_manager(self, account_email: str) -> List[str]:
"""
Retrieves the list of legal keys for a given service account from Secret Manager.

Args:
account_email (str): The email of the service account to retrieve keys for.

Returns:
List[str]: A list of key IDs for the legal keys associated with the service account.
"""
legal_keys = []
parent = self.secret_client.secret_path(self.project_id, self.secret_id)

try:
versions = self.secret_client.list_secret_versions(request={"parent": parent})
for version in versions:
if version.state.name == secretmanager.SecretVersion.State.ENABLED:
response = self.secret_client.access_secret_version(request={"name": version.name})
data_str = response.payload.data.decode("UTF-8")
key_id = data_str.split(":",1)[0]
legal_keys.append(key_id)
return legal_keys
except Exception as e:
self.logger.error(f"Failed to retrieve legal keys from Secret Manager for account '{account_email}': {e}")
return []

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In _get_legal_keys_from_secret_manager, the parameter account_email is defined but never used. Instead, self.secret_id is used to construct the secret path. This is a critical bug because it will query the same secret (whatever self.secret_id is set to) for every service account, rather than querying the specific secret for each service account (e.g., secret_name which is passed to this method).

To fix this, we should rename the parameter to secret_name to match its usage and use it when constructing the secret path.

    def _get_legal_keys_from_secret_manager(self, secret_name: str) -> List[str]:
        """
        Retrieves the list of legal keys for a given service account from Secret Manager.

        Args:
            secret_name (str): The name of the secret to retrieve keys for.

        Returns:
            List[str]: A list of key IDs for the legal keys associated with the service account.
        """
        legal_keys = []
        parent = self.secret_client.secret_path(self.project_id, secret_name)

        try:
            versions = self.secret_client.list_secret_versions(request={"parent": parent})
            for version in versions:
                if version.state.name == secretmanager.SecretVersion.State.ENABLED:
                    response = self.secret_client.access_secret_version(request={"name": version.name})
                    data_str = response.payload.data.decode("UTF-8")
                    key_id = data_str.split(":",1)[0]
                    legal_keys.append(key_id)
            return legal_keys
        except Exception as e:
            self.logger.error(f"Failed to retrieve legal keys from Secret Manager for secret '{secret_name}': {e}")
            return []

@github-actions

Copy link
Copy Markdown
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@pabloem pabloem left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sure to reference the Beam issue we are trying to fix (what number is it?)
also, show an example of the output from running this, and what are next steps after this PR.

Comment thread infra/enforcement/account_keys.py Outdated

def _get_legal_keys_from_secret_manager(self, secret_name: str) -> List[str]:
"""
Retrieves the list of legal keys for a given service account from Secret Manager.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does legal mean here? Can you define 'lega' or improve the naming?

Comment thread infra/enforcement/account_keys.py Outdated
legal_keys = self._get_legal_keys_from_secret_manager(secret_name)
rogue_keys = set(iam_keys) - set(legal_keys)
for rogue_key in rogue_keys:
msg = f"SECURITY ALERT: Rogue key '{rogue_key}' detected on account '{service_account}'. this key was created outside the rotation system."

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not call it 'rogue' - let's call it 'unmanaged'. and instead of 'rotation system', let's call it "Beam's service account management system (in infra/keys/)".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants