chore: automate releases#3833
Conversation
Overhauls the release pipeline into a fully automated, robust, and modular architecture: - Created utils.py, git.py, and gh.py to encapsulate shell executions, git helpers, and GitHub CLI wrappers under clean, prefix-free module namespaces. - Overhauled release.py to use these modules, return clean exit codes, and support dynamic RC tagging and checklist gating. - Overhauled cmd_prepare to make the preparation pipeline fully unconditional (cuts branch, commits, pushes, opens PR, and creates/updates tracking issue). - Integrated news processing directly into cmd_process_backports to merge and amend changelogs per backport. - Created GitHub Actions workflows to automate all release phases (prepare, branch cut, backports, RC tagging, and promotion). - Added comprehensive unit test suites in release_test.py mocking the new module-level namespace helpers.
There was a problem hiding this comment.
Code Review
This pull request introduces a code-centric, reactive release automation architecture for rules_python by splitting the release logic into dedicated helper modules (gh.py, git.py, utils.py) and expanding release.py with several subcommands to handle preparation, branch cutting, backport processing, and release candidate tagging. Feedback on the changes highlights several critical issues, including the need to use git status --porcelain and .splitlines() to properly parse git output line-by-line rather than character-by-character. Additionally, the reviewer noted a Python compatibility issue with Exception.add_note on older Python versions, unsafe dictionary access on PR merge commits, potential failures when staging a non-existent news/ directory, loose regex patterns for parsing versions from issue titles, and potential regex failures due to Windows-style line endings.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def status(): | ||
| """Returns the output of git status.""" | ||
| return run_cmd("git", "status") |
There was a problem hiding this comment.
Running git status returns a verbose, human-readable string even when the workspace is completely clean (e.g., "nothing to commit, working tree clean"). This causes the dirty check in cmd_prepare to always fail and report that local edits were detected. Using git status --porcelain is machine-readable and returns an empty string when the workspace is clean.
| def status(): | |
| """Returns the output of git status.""" | |
| return run_cmd("git", "status") | |
| def status(): | |
| """Returns the output of git status --porcelain.""" | |
| return run_cmd("git", "status", "--porcelain") |
| status = git.status() | ||
| if status: | ||
| print( | ||
| "Error: Local edits detected. Workspace must be completely clean" | ||
| " before running release preparation." | ||
| ) | ||
| for line in status: | ||
| print(f" {line}") | ||
| return 1 |
There was a problem hiding this comment.
Since git.status() returns a single string containing the command output, iterating over status directly will iterate character-by-character rather than line-by-line. This causes the error message to print single characters on separate lines. Use .splitlines() to iterate over the lines of the status output.
| status = git.status() | |
| if status: | |
| print( | |
| "Error: Local edits detected. Workspace must be completely clean" | |
| " before running release preparation." | |
| ) | |
| for line in status: | |
| print(f" {line}") | |
| return 1 | |
| status = git.status() | |
| if status: | |
| print( | |
| "Error: Local edits detected. Workspace must be completely clean" | |
| " before running release preparation." | |
| ) | |
| for line in status.splitlines(): | |
| print(f" {line}") | |
| return 1 |
| for line in modified_files: | ||
| file_path = line.strip().split()[-1] | ||
| git.add(file_path) |
There was a problem hiding this comment.
Since modified_files is a single string containing the command output, iterating over it directly will iterate character-by-character. This causes line.strip().split()[-1] to raise an IndexError or return single characters, resulting in incorrect git.add calls. Use .splitlines() to iterate over the lines of the status output.
| for line in modified_files: | |
| file_path = line.strip().split()[-1] | |
| git.add(file_path) | |
| for line in modified_files.splitlines(): | |
| file_path = line.strip().split()[-1] | |
| git.add(file_path) |
| else: | ||
| item["commit"] = pr_info["mergeCommit"]["oid"] |
There was a problem hiding this comment.
Directly accessing pr_info["mergeCommit"]["oid"] is unsafe because mergeCommit can be null or missing from the JSON response in certain edge cases. This would raise a TypeError and crash the script. Use defensive dictionary access to handle missing or null mergeCommit fields gracefully.
| else: | |
| item["commit"] = pr_info["mergeCommit"]["oid"] | |
| else: | |
| merge_commit = pr_info.get("mergeCommit") | |
| if merge_commit and "oid" in merge_commit: | |
| item["commit"] = merge_commit["oid"] | |
| else: | |
| print(f"PR #{pr_num} has no merge commit SHA. Gating.") | |
| item["status"] = "unmerged-pr" |
| issue_title = gh.get_issue_title(args.issue) | ||
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | ||
| if not version_match: | ||
| print(f"Error: Could not parse version from issue title: {issue_title}") | ||
| return 1 | ||
|
|
||
| version = version_match.group(0).replace("v", "") |
There was a problem hiding this comment.
The current regex [0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)? is loose and can match any random numbers in the issue title. It is safer and more robust to specifically match the Release <version> pattern and extract the version using a capture group, which also avoids the redundant .replace("v", "") call.
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(0).replace("v", "") | |
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"Release\s+v?(\d+\.\d+\.\d+(?:-rc\d+)?)", issue_title, re.IGNORECASE) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(1) |
| issue_title = gh.get_issue_title(args.issue) | ||
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | ||
| if not version_match: | ||
| print(f"Error: Could not parse version from issue title: {issue_title}") | ||
| return 1 | ||
|
|
||
| version = version_match.group(0).replace("v", "") |
There was a problem hiding this comment.
The current regex [0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)? is loose and can match any random numbers in the issue title. It is safer and more robust to specifically match the Release <version> pattern and extract the version using a capture group, which also avoids the redundant .replace("v", "") call.
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(0).replace("v", "") | |
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"Release\s+v?(\d+\.\d+\.\d+(?:-rc\d+)?)", issue_title, re.IGNORECASE) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(1) |
| issue_title = gh.get_issue_title(args.issue) | ||
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | ||
| if not version_match: | ||
| print(f"Error: Could not parse version from issue title: {issue_title}") | ||
| return 1 | ||
|
|
||
| version = version_match.group(0).replace("v", "") |
There was a problem hiding this comment.
The current regex [0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)? is loose and can match any random numbers in the issue title. It is safer and more robust to specifically match the Release <version> pattern and extract the version using a capture group, which also avoids the redundant .replace("v", "") call.
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(0).replace("v", "") | |
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"Release\s+v?(\d+\.\d+\.\d+(?:-rc\d+)?)", issue_title, re.IGNORECASE) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(1) |
This introduces workflows and tools to automate releases. The basic
idea it to have an issue with tasks, and workflows invoke tools that
parse and update the tasks in the issue.
The basic release steps are:
Each step has a workflow. For now, most of the workflows require a
manual trigger, to help prevent workflows from running wild as we
flush out bugs and edge cases.
The workflows almost entirely delegate to the
releasetool. There'sbasically one command per workflow. This makes it easy to encode
specialized logic and run it locally for testing or reproduction.