Skip to content

Add Trusted Server audit command#800

Open
ChristianPavilonis wants to merge 15 commits into
feature/ts-cli-basefrom
feature/ts-cli-audit
Open

Add Trusted Server audit command#800
ChristianPavilonis wants to merge 15 commits into
feature/ts-cli-basefrom
feature/ts-cli-audit

Conversation

@ChristianPavilonis

Copy link
Copy Markdown
Collaborator

Summary

  • Adds ts audit as a Trusted Server-specific page audit command on top of the base CLI.
  • Collects rendered page/script evidence with Chrome/Chromium and writes draft audit/config artifacts.
  • Adds audit-specific docs, specs, tests, and host dependencies.

Changes

File Change
crates/trusted-server-cli/src/audit.rs Adds audit orchestration, output planning, artifact writing, and draft config generation.
crates/trusted-server-cli/src/audit/analyzer.rs Detects JS assets, first/third-party classification, redirects, titles, and known integrations.
crates/trusted-server-cli/src/audit/browser_collector.rs Adds headless Chrome/Chromium collection for DOM scripts and network requests.
crates/trusted-server-cli/src/audit/collector.rs Defines audit collection traits/data shapes for testability.
crates/trusted-server-cli/src/args.rs, src/run.rs, src/lib.rs Wires ts audit into CLI parsing and dispatch.
Cargo.toml, crates/trusted-server-cli/Cargo.toml, Cargo.lock Adds audit-only dependencies.
README.md, docs/guide/cli.md, docs/guide/getting-started.md Documents audit workflow and Chrome/Chromium prerequisite.
docs/superpowers/...audit... Adds audit design and implementation plan artifacts.

Closes

No issue provided; this PR is split out from the combined feature/ts-cli-next branch.

Test plan

  • cargo test --workspace
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • JS tests: cd crates/js/lib && npx vitest run
  • JS format: cd crates/js/lib && npm run format
  • Docs format: cd docs && npm run format
  • WASM build: cargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1
  • Manual testing via fastly compute serve
  • Other: cargo test --package trusted-server-cli --target aarch64-apple-darwin — 42 passed

Checklist

  • Changes follow CLAUDE.md conventions
  • No unwrap() in production code — use expect("should ...")
  • Uses tracing/log macros (not println!)
  • New code has tests
  • No secrets or credentials committed

Replace the custom trusted-server CLI lifecycle and config payload plumbing with a thin EdgeZero delegation layer using typed config push/validate flows.

Add TrustedServerAppConfig wrapper in core with deploy-time validation and move blob reconstruction into runtime helpers.

Drop flattened config entry publishing and route app-config through EdgeZero blob envelope handling while keeping edgezero flag reads in trusted_server_config.

Update CLI and architecture docs for the new model and adjust fastly adapter store selection.

@prk-Jr prk-Jr left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Adds ts audit: loads a public page in a fresh headless Chrome/Chromium session, inventories rendered JS assets, detects known integrations, and writes a JS-asset report plus a draft trusted-server.toml. Clean AuditCollector trait makes the analysis browser-free and well-tested. One blocking item: the new scraper dependency breaks the integration dependency-parity CI gate.

Blocking

🔧 wrench

  • Dependency-parity CI failurescraper = "0.24.0" added to the workspace conflicts with integration-tests' pinned scraper = "0.21"; the prepare integration artifacts job fails on markup5ever / match_token / scraper / selectors (plus uuid, web-sys, wasm-bindgen-futures). Align the versions or extend the allowlist — see inline on Cargo.toml.

Non-blocking

🤔 thinking

  • Inline substring integration detection is false-positive proneanalyzer.rs:217; auto-enables gpt/didomi/datadome in the draft.
  • First-party classifier over-matches parent/eTLD domainsanalyzer.rs:169.
  • No overall navigation timeoutbrowser_collector.rs:98; a hanging origin stalls ts audit.

♻️ refactor

  • report_error is a misleading no-op wrappererror.rs:7; overlaps cli_error.

⛏ nitpick

  • Asymmetric URL resolutionanalyzer.rs:68; relative src from collector script tags is silently dropped.

👍 praise

  • AuditCollector trait → browser-free unit tests via FakeCollector; strong testability.
  • Browser hygiene: fresh temp user-data-dir per run, close() always called, handler task aborted on close failure, no forced --no-sandbox.
  • parse_audit_url restricts to http/https (blocks file://, data:, chrome://), with a test.
  • Overwrite protection + a pre-collection conflict check, so a refusal doesn't even launch Chrome.
  • GTM container id constrained by GTM-[A-Z0-9]+ → no TOML injection from page content into the draft.
  • audit module and all host-only deps correctly gated behind cfg(not(target_arch = "wasm32")).

CI Status

  • prepare integration artifacts (dependency parity): FAIL (caused by this PR)
  • integration / edgezero / browser tests: SKIPPED (blocked by the failed prepare job)
  • fmt / clippy / cargo test / vitest: not run — those workflows trigger only on PRs targeting main, and this PR targets feature/ts-cli-base. Author reports cargo test --package trusted-server-cli passing (42 tests) locally.

Comment thread Cargo.toml
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs Outdated
Comment thread crates/trusted-server-cli/src/error.rs
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs Outdated

@aram356 aram356 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

ts audit is a well-structured, genuinely well-tested addition: a headless-Chrome page auditor that emits a js-assets.toml inventory and a draft trusted-server.toml. The AuditCollector trait makes the orchestration unit-testable without a browser, WASM gating is clean, and overwrite/--force safety is careful. One blocking issue: the new dependencies break CI. The remaining notes are heuristic-quality and design observations.

Local verification on aarch64-apple-darwin: cargo fmt --check, cargo clippy --all-targets -- -D warnings, and cargo test (37 passed) all pass for trusted-server-cli.

Blocking

🔧 wrench

  • PR breaks CI — integration-tests dependency drift: new scraper/chromiumoxide deps bumped the root Cargo.lock but the separate crates/trusted-server-integration-tests lockfile was not updated, failing the parity check. See inline comment on Cargo.toml:34.

Non-blocking

🤔 thinking

  • Public-suffix-unaware party classification (analyzer.rs:169-177): suffix matching treats example as first-party to publisher.example.
  • Loose substring integration detection (analyzer.rs:216-224audit.rs:282-291): incidental substrings like gpt can auto-enable a module in the draft config.
  • Bounded network capture (browser_collector.rs:147-165): resource-timing buffer cap (~250) and no HTTP status; large pages may drop assets.
  • Hard-fail on 4xx/5xx (browser_collector.rs:247-255): bot-protected pages (e.g. DataDome 403 to headless Chrome) yield no artifacts.

♻️ refactor

  • Dead method/status fields (collector.rs:27-33): always "GET"/None, never read.
  • Redundant script collection (analyzer.rs:49-95): parsed HTML and document.scripts are the same set, deduped by URL.

🌱 seedling

  • No timeout on browser.close() (browser_collector.rs:75-78): could hang on an unresponsive browser.
  • Heavy dependency tree: chromiumoxide pulls async-tungstenite/rustls/reqwest (+458 lockfile lines). Acceptable for a host-only operator CLI (no WASM/runtime impact); good that the fetcher feature is off so there's no Chromium auto-download.

⛏ nitpick

  • report_error/cli_error overlap (error.rs:7-9): both just .into(); report_error doesn't actually report.

📝 note

  • ~1.9k lines of design/plan docs added under docs/superpowers/. Scanned both files — only example/known-vendor domains, no secrets. Flagging the volume only.

👍 praise

  • The AuditCollector trait + FakeCollector seam and the thorough analyzer/orchestration tests (redirect warnings, dedup, relative-URL resolution, no-output guard, and collect-only-after-overwrite-check ordering) are excellent. Clean #[cfg(not(target_arch = "wasm32"))] gating throughout.

CI Status

  • prepare integration artifacts: FAIL (dependency drift — see blocking)
  • fmt (trusted-server-cli): PASS (local)
  • clippy (trusted-server-cli): PASS (local)
  • rust tests (trusted-server-cli): PASS (local, 37 passed)

Comment thread Cargo.toml
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs
Comment thread crates/trusted-server-cli/src/audit/analyzer.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs
Comment thread crates/trusted-server-cli/src/audit/browser_collector.rs Outdated
Comment thread crates/trusted-server-cli/src/audit/collector.rs
Comment thread crates/trusted-server-cli/src/error.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants