Skip to content

Add Automatic Gitignore Support#22

Merged
technicalpickles merged 26 commits into
mainfrom
gitignore
Jun 23, 2026
Merged

Add Automatic Gitignore Support#22
technicalpickles merged 26 commits into
mainfrom
gitignore

Conversation

@technicalpickles

@technicalpickles technicalpickles commented Oct 2, 2025

Copy link
Copy Markdown
Contributor

Summary

Adds automatic .gitignore support to pks, respecting gitignored files during analysis. Enabled by default with option to disable for packwerk-identical behavior.

Motivation

Previously, pks analyzed all files regardless of .gitignore status, including build artifacts (node_modules/), temporary files (scratch/), and logs. This meant:

  • Unnecessary analysis of ignored files
  • Manual exclusion patterns needed
  • Behavior inconsistent with git-aware tools

What Changed

For users with .gitignore files:

  • Gitignored files are now automatically skipped
  • Performance improvement from pruning ignored directories early
  • No manual exclusion configuration needed

Configuration:

# packwerk.yml - disable to match packwerk behavior
respect_gitignore: false  # default: true

Supports:

  • Local .gitignore files
  • Global gitignore (~/.gitignore_global or core.excludesFile)
  • .git/info/exclude
  • All standard patterns (wildcards, negation, etc.)

Implementation

  • Uses ignore crate (same as ripgrep) for battle-tested performance
  • Prunes ignored directories during traversal (no stat calls)
  • Gracefully handles missing/malformed gitignore files
  • Works correctly in non-git directories

Testing

  • 10 integration tests covering core scenarios
  • 7 unit tests for helper functions
  • Test fixtures validate common patterns and edge cases
  • All existing tests pass (no regressions)

Troubleshooting

Files unexpectedly ignored?

# Check what's matching
git check-ignore -v path/to/file.rb

# Temporarily disable
echo "respect_gitignore: false" >> packwerk.yml

See ADVANCED_USAGE.md for detailed documentation.

Backward Compatibility

✅ Existing projects work without changes
✅ Can be disabled if needed
✅ Non-git directories continue working
⚠️ Behavior change: gitignored files now skipped (generally desirable)


Ready for review! This brings pks behavior in line with git-aware tooling while providing an escape hatch for compatibility.

@github-project-automation github-project-automation Bot moved this to Triage in Modularity Oct 2, 2025
technicalpickles added a commit that referenced this pull request Oct 2, 2025
The test modifies the HOME environment variable, which can cause race
conditions when tests run in parallel (especially in CI). Adding the
#[serial] attribute ensures this test runs serially and doesn't
conflict with other tests that also modify environment variables.

This follows the existing pattern in walk_directory.rs where tests
that modify HOME already use #[serial].

Fixes CI failure in PR #22
@technicalpickles technicalpickles marked this pull request as ready for review October 2, 2025 21:05
@technicalpickles technicalpickles requested review from Copilot and perryqh and removed request for perryqh October 2, 2025 21:06

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds automatic .gitignore support to pks, making it respect gitignored files during analysis by default. Users can disable this behavior to maintain packwerk-identical functionality.

  • Automatically excludes gitignored files from violation checking and file listing
  • Adds respect_gitignore configuration option (defaults to true)
  • Supports local, global, and git-specific gitignore files with full pattern support

Reviewed Changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/gitignore_test.rs Comprehensive integration tests covering gitignore functionality scenarios
tests/fixtures/app_with_gitignore_disabled/* Test fixture demonstrating disabled gitignore behavior
tests/fixtures/app_with_gitignore/* Test fixture with gitignore patterns for testing
src/packs/walk_directory.rs Core gitignore implementation with matcher building and directory walking integration
src/packs/raw_configuration.rs Configuration schema extension for respect_gitignore option
src/packs.rs Module visibility change to expose walk_directory functions
README.md Documentation updates describing gitignore support
Cargo.toml Added ignore and tempfile dependencies
ADVANCED_USAGE.md Detailed gitignore configuration and troubleshooting guide

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread tests/gitignore_test.rs
Comment thread tests/gitignore_test.rs Outdated
Comment thread tests/gitignore_test.rs Outdated
Comment thread src/packs/walk_directory.rs Outdated
Comment thread src/packs/walk_directory.rs Outdated
@perryqh

perryqh commented Oct 2, 2025

Copy link
Copy Markdown
Contributor

Performance looks good!

Screenshot 2025-10-02 at 16 22 27

Comment thread src/packs/walk_directory.rs Outdated
Comment thread src/packs/walk_directory.rs
technicalpickles added a commit that referenced this pull request Nov 11, 2025
- Fix function signature formatting (opening brace on same line)
- Move std::fs import to top-level for better organization
- Add detailed documentation for expand_tilde's core.excludesFile purpose

All Copilot automated feedback addressed. Tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
technicalpickles added a commit that referenced this pull request Nov 11, 2025
The ignore crate version 0.4.25 requires Rust edition 2024, which is
not yet stabilized in Rust 1.83.0 (used in CI). This was causing CI
failures with the error:

  feature `edition2024` is required

By pinning to =0.4.23, we ensure consistent builds across environments
and prevent automatic upgrades to incompatible versions.

Fixes GitHub Actions failures in PR #22.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 40 out of 41 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/packs/walk_directory.rs Outdated
Comment thread tests/gitignore_test.rs Outdated
Comment thread tests/gitignore_test.rs
Comment thread ADVANCED_USAGE.md Outdated
Comment thread ADVANCED_USAGE.md Outdated
Comment thread ADVANCED_USAGE.md Outdated
Comment thread src/packs/walk_directory.rs Outdated
Comment thread src/packs/walk_directory.rs
Comment thread ADVANCED_USAGE.md Outdated
@dduugg dduugg requested a review from a team March 5, 2026 17:53

@dduugg dduugg left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Overall this is well-scoped and well-tested. A few observations:

Concerns

1. Breaking behavior change by default — The default of true means existing users silently get different results without any opt-in. Projects that previously analyzed files in gitignored paths (e.g. vendored code, generated files that someone actually wants checked) will get different output without knowing why. This is the right long-term default, but worth calling out clearly in release notes, and potentially adding a one-time warning on first run.

2. Subprocess on every runget_global_gitignore() spawns git config --global core.excludesFile via std::process::Command on every walk_directory call. Gracefully returns None when git isn't on PATH, which is correct. Worth noting that the ignore crate already knows how to locate the global gitignore internally — see concern #4.

3. Double Arc wrapping — The code wraps Option<Arc<Gitignore>> in another Arc, resulting in Arc<Option<Arc<Gitignore>>>. The inner Arc is unnecessary; Arc<Option<Gitignore>> suffices.

4. Missed opportunity: ignore::WalkBuilder — The ignore crate has a high-level WalkBuilder that handles gitignore-aware traversal natively (it's what ripgrep uses). The PR manually integrates the lower-level Gitignore type with jwalk, which works fine. If the project ever reconsiders its parallelism strategy, WalkBuilder would handle all of this automatically.

5. Pinned dependenciesglobset = "=0.4.16" and ignore = "=0.4.23" are pinned to avoid a Rust edition 2024 requirement in newer patches. Reasonable workaround, but these won't receive patch updates via cargo update. Worth a follow-up to upgrade to Rust edition 2024 and unpin.

Minor Notes

  • walk_directory visibility change (pub(crate)pub): Done to allow gitignore_test.rs to import build_gitignore_matcher. Fine, but slightly expands the public API surface.
  • per_file_cache.rs fix: The create_dir_all fix is correct and legitimate — just worth noting it's a bundled unrelated bug fix.
  • Test fixtures: Well-structured. The app_with_gitignore_disabled fixture correctly validates opt-out behavior.

None of these are blockers. The feature is solid, the ignore crate is battle-tested, and the test coverage is thorough. Main ask is to document the behavioral change prominently in release notes.

- Add ignore crate dependency (v0.4.23)
- Implement expand_tilde() for path expansion
- Implement get_global_gitignore() to find global gitignore files
- Implement build_gitignore_matcher() to create gitignore matchers
- Add comprehensive unit tests for all three functions
- Add integration tests for gitignore matching
- Create test fixture: tests/fixtures/app_with_gitignore/

This is Phase 1 of gitignore support - infrastructure only.
Functions are implemented but not yet integrated into walk_directory().
Next phase will integrate these into actual file walking logic.
Based on test coverage analysis, made the following improvements:

Integration Tests (CLI-based):
- Converted tests to use assert_cmd with CLI invocation pattern
- Added common::teardown() calls to all integration tests
- Added test for violations in gitignored files (ready for Phase 2)
- Added test for list-included-files with gitignore
- Added test to verify non-gitignore repos still work
- Tests include TODOs for assertions to uncomment after Phase 2

Unit Tests:
- Added test for expand_tilde() when HOME is unset
- Added test for malformed gitignore handling
- Added test for .git/info/exclude support
- Used serial_test to prevent race conditions in env var tests

Test Fixture Enhancements:
- Added packs/bar with actual service code
- Added packs/foo with violation that should be caught
- Added ignored_folder/violating.rb with violation that should be ignored
- Fixture now has realistic violation scenarios for Phase 2 testing

All 177 unit tests + 5 integration tests pass.
No clippy warnings.
Implement automatic gitignore support with configuration toggle:

Configuration:
- Add respect_gitignore option to RawConfiguration (default: true)
- Can be disabled in packwerk.yml/packs.yml

Integration:
- Build gitignore matcher at start of walk_directory()
- Prune ignored directories in process_read_dir callback (early exit)
- Skip ignored files in main processing loop
- Gracefully handle matcher build failures

Implementation Details:
- Use Arc<Option<Arc<Gitignore>>> for thread-safe sharing
- Check gitignore BEFORE explicit exclusions (precedence)
- Support local .gitignore, global gitignore, and .git/info/exclude
- Log debug message if gitignore build fails

Test Updates:
- Uncommented Phase 2 assertions in integration tests
- All tests pass: 177 unit tests + 5 integration tests
- Tests verify:
  * Violations in ignored files are NOT detected ✅
  * Violations in non-ignored files ARE detected ✅
  * list-included-files excludes ignored files ✅
  * Works without .gitignore files ✅

Performance: No regressions observed - gitignore checking is fast
and early directory pruning may actually improve performance.
Add comprehensive tests for missing coverage identified in gap analysis:

New Tests (5):
1. test_respect_gitignore_can_be_disabled ✅
   - Tests respect_gitignore: false configuration
   - THE critical test - validates main config option works

2. test_gitignore_negation_patterns ✅
   - Tests !pattern negation at library level
   - Common use case: *.log but !important.log

3. test_list_included_files_respects_negation ✅
   - Verifies negation works with list-included-files

4. test_respects_global_gitignore ✅
   - End-to-end test with ~/.gitignore_global
   - Creates temp HOME with global gitignore

5. test_update_respects_gitignore ✅
   - Tests gitignore works with update command
   - Uses tempfile for safe temporary fixture

New Fixtures:
- tests/fixtures/app_with_gitignore_disabled/
  * Has .gitignore but respect_gitignore: false in packwerk.yml
  * Contains violation in ignored_folder/ that SHOULD be detected

Enhanced Fixtures:
- tests/fixtures/app_with_gitignore/
  * Added !important.log negation pattern to .gitignore
  * Added important.log and regular.log test files
  * Added tmp/cache directory structure

Dependencies:
- Added tempfile = "3.8.0" to dev-dependencies

Test Results:
- All 10 gitignore tests passing ✅
- All 177 unit tests + 60+ integration tests passing ✅
- No clippy warnings ✅

Coverage Improvements:
- Before: 65-70% coverage with critical gaps
- After: 90%+ coverage with all high-priority gaps addressed

What's Now Tested:
✅ respect_gitignore: false (CRITICAL - was missing!)
✅ Negation patterns (!pattern)
✅ Global gitignore end-to-end
✅ Multiple commands (check, list-included-files, update)
✅ All original functionality still works

Addresses issues identified in phase2-test-gap-analysis.md
- Add gitignore feature highlight to README
- Add detailed gitignore section to README
- Expand ADVANCED_USAGE.md with comprehensive gitignore docs
- Document respect_gitignore configuration option
- Explain precedence rules with examples
- Add troubleshooting section with debugging commands
- Document packwerk migration path

Completes Phase 5 of gitignore implementation.
All implementation phases (1-5) are now complete.
- Remove trailing whitespace from doc comments
- Fix line length and formatting for function signatures
- Apply standard Rust formatting to walk_directory.rs and gitignore_test.rs
The test modifies the HOME environment variable, which can cause race
conditions when tests run in parallel (especially in CI). Adding the
#[serial] attribute ensures this test runs serially and doesn't
conflict with other tests that also modify environment variables.

This follows the existing pattern in walk_directory.rs where tests
that modify HOME already use #[serial].

Fixes CI failure in PR #22
The file tests/fixtures/app_with_gitignore_disabled/ignored_folder/violating.rb
was not tracked by git because the fixture's .gitignore contains 'ignored_folder/'.

This file needs to exist in CI for the test_respect_gitignore_can_be_disabled
test to work correctly. The test verifies that when respect_gitignore: false,
files in ignored folders ARE processed and violations ARE detected.

Without this file, the test fails in CI with 'No violations detected!' because
the file doesn't exist after checkout.

Used 'git add -f' to force-add this intentionally-ignored test fixture file.
technicalpickles and others added 12 commits June 22, 2026 11:12
- Fix redundant 'ref' keyword in Option pattern matching (lines 256, 310)
- Refactor get_global_gitignore() to only use core.excludesFile git config
  - Remove hardcoded fallback paths (~/.gitignore_global, ~/.config/git/ignore)
  - Simplify function from ~40 lines to ~20 lines
  - Follow git standards rather than conventions
- Update test_respects_global_gitignore to use git config
  - Properly save/restore original core.excludesFile value
  - Use --unset if config wasn't previously set
- Update documentation to reflect core.excludesFile-only approach
  - README.md: Remove references to fallback paths
  - ADVANCED_USAGE.md: Update examples and troubleshooting

All tests pass (177 unit + 71 integration tests)
- Fix function signature formatting (opening brace on same line)
- Move std::fs import to top-level for better organization
- Add detailed documentation for expand_tilde's core.excludesFile purpose

All Copilot automated feedback addressed. Tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Performance optimization suggested by @perryqh in code review:
- Only check gitignore for directories during traversal (line ~253)
- Setting read_children_path = None on files is a no-op
- Files are already checked in main loop (line ~304)

Benefits:
- Eliminates redundant gitignore.matched() calls on files
- Clarifies intent: directory pruning vs file filtering
- No functional change - all tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Ensure parent directories exist before writing cache files. This fixes a test failure where cache writes would fail if the cache directory structure didn't already exist.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The ignore crate version 0.4.25 requires Rust edition 2024, which is
not yet stabilized in Rust 1.83.0 (used in CI). This was causing CI
failures with the error:

  feature `edition2024` is required

By pinning to =0.4.23, we ensure consistent builds across environments
and prevent automatic upgrades to incompatible versions.

Fixes GitHub Actions failures in PR #22.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Similar to the ignore crate, globset version 0.4.18 requires Rust
edition 2024, which is not yet stabilized in Rust 1.83.0 (used in CI).

This was causing CI failures after fixing the ignore crate issue.

By pinning to =0.4.16, we ensure consistent builds across environments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The assert_cmd crate deprecated the `Command::cargo_bin()` method in
favor of the `cargo_bin!` macro. This change updates all test files to
use the new macro pattern: `Command::new(cargo_bin!("pks"))`.

This fixes the clippy errors in CI that were treating deprecation
warnings as errors with `-D warnings`.

Changes:
- Added `use assert_cmd::cargo::cargo_bin;` import to test files
- Replaced `Command::cargo_bin("pks")` with `Command::new(cargo_bin!("pks"))`
- Removed unnecessary `.unwrap()` calls that are no longer needed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Cargo fmt requires imports to be in alphabetical order.
The cargo_bin function is deprecated, so we can't import it.
Instead, use the fully qualified macro path: assert_cmd::cargo::cargo_bin!

This avoids the deprecation warning while still using the recommended macro.
technicalpickles and others added 5 commits June 22, 2026 11:28
…guard

- Fix ADVANCED_USAGE.md precedence order: gitignore runs before include/exclude
  patterns (not after), remove all "override gitignore" language, fix the
  troubleshooting step 5 (use .gitignore negation, not include patterns), and
  fix the YAML comment from "highest priority - override gitignore" to "what
  file extensions to analyze"
- Remove Arc double-wrapping in walk_directory: gitignore_matcher was
  Some(Arc::new(matcher)) producing Arc<Option<Arc<Gitignore>>>; change to
  Some(matcher) so gitignore_ref is Arc<Option<Gitignore>>
- Fix brittle line-number comment: "see line ~304" -> "main file processing loop"
- Add missing fixture file app_with_gitignore/ignored_folder/violating.rb so
  test_check_ignores_violations_in_gitignored_files passes non-vacuously
- Add GitConfigGuard RAII struct with Drop impl to test_respects_global_gitignore
  so git config is restored even when assertions panic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Toolchain is now 1.92.0, which supports Rust edition 2024. The pins
were blocking patch updates and are no longer needed -- both globset
0.4.18 and ignore 0.4.26 compile and all tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Concern 1: Add CHANGELOG.md documenting that respect_gitignore now
defaults to true, with opt-out instructions (respect_gitignore: false).

Concerns 2+4: Replace the get_global_gitignore() subprocess
(std::process::Command::new("git")) with ignore::gitignore::gitconfig_excludes_path(),
which reads ~/.gitconfig and $XDG_CONFIG_HOME/git/config directly.
This is already in the ignore crate we depend on, so no new dependency.

Remove expand_tilde() and its tests — tilde expansion is now handled
internally by the ignore crate.

Add a comment explaining why we kept jwalk + manual gitignore integration
instead of switching to ignore::WalkBuilder: WalkBuilder's visitor pattern
has no equivalent to jwalk's ProcessReadDirState, which we use to propagate
current_package_yml state in O(1) per directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…igin/gitignore-deps-research' into gitignore
@technicalpickles

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @dduugg! All five concerns addressed:

1. Breaking change documentation — Added CHANGELOG.md documenting the behavioral change, who's affected, and the opt-out (respect_gitignore: false in packwerk.yml). Skipped the one-time runtime warning — the changelog entry covers it without adding runtime noise.

2+4. Subprocess and WalkBuilderget_global_gitignore() no longer spawns a subprocess. It now calls ignore::gitignore::gitconfig_excludes_path() directly, which reads ~/.gitconfig (and $XDG_CONFIG_HOME/git/config) without invoking git.

Regarding WalkBuilder: evaluated it, but it can't replace jwalk here. pks uses a custom ProcessReadDirState with jwalk's ClientState trait to propagate current_package_yml from parent to child directories in a single O(1) pass per entry. WalkBuilder's visitor pattern has no equivalent hook for carrying per-directory state across levels. Added a code comment in walk_directory.rs explaining the tradeoff so the next person doesn't have to re-derive it.

3. Double Arc wrapping — Fixed. build_gitignore_matcher() now returns Option<Gitignore> (no inner Arc), and the call site wraps it once in Arc::new() where needed.

5. Pinned dependencies — Unpinned. rust-toolchain.toml is already at 1.92.0 which supports edition 2024, so ignore and globset are now on semver ranges ("0.4") and will pick up patch updates normally.


Smoke-tested against zenpayroll (46k files): zero behavioral delta with respect_gitignore: true vs false on the real codebase. The packwerk.yml exclude block already covers the same directories gitignore would filter. Feature verified correct with a synthetic coverage/fake_service.rb test file.

@technicalpickles technicalpickles merged commit 2fe98b7 into main Jun 23, 2026
7 checks passed
@technicalpickles technicalpickles deleted the gitignore branch June 23, 2026 14:28
@github-project-automation github-project-automation Bot moved this from Triage to Done in Modularity Jun 23, 2026
@technicalpickles technicalpickles mentioned this pull request Jun 23, 2026
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants