
docs: MkDocs Material site — review-ready v1 (Diátaxis, tutorial, how-tos, CI gates)#36

Merged
rustyconover merged 13 commits into main from docs-mkdocs-site on Jun 15, 2026
@rustyconover rustyconover commented Jun 15, 2026

Contributor

Reworks the vgi-python documentation into a review-ready v1 built around one job: get a developer building and running a real worker fast. Structure follows Diátaxis (Tutorial / How-to / Concepts / API Reference). Acceptance criteria, a reviewer rubric, and a usability-test protocol are included for sign-off.

Highlights

  • Tutorial (the headline path): a short overview → step 1 scalar (double) → step 2 table (series). Leads with trivial arithmetic; the table step is stateless, with streaming-via-state introduced afterward. Haybarn-primary, stock-DuckDB in a tab. Realistically ≤20 min to a scalar + table function callable from SQL.
  • How-to guides: function patterns (all four), expose a catalog, persist state across workers, serve over HTTP with auth, integrate with the optimizer (pushdown + stats). Each follows a 4-point per-page standard (what+who · prerequisites · runnable example · next steps).
  • Concepts: process model, transports, Arrow data model, call lifecycle, parallel workers.
  • API Reference: auto-generated via mkdocstrings (unchanged from the scaffolding).
  • Verified examples: examples/*.py workers (scalar, table, stateful-streaming table, table-in-out, aggregate, string scalar) are CI-tested — scalar/table/table-in-out exercised over the wire; aggregate phases exercised directly.

CI gates (new docs job)

  • mkdocs build --strict — zero warnings.
  • pytest tests/test_documentation_examples.py tests/test_examples_workers.py — 65 doc-example assertions + worker e2e (snippet-embedded files tested directly).
  • lychee offline link-check (blocking).
  • Vale prose lint (non-blocking until the vocabulary is tuned against a real run — one-line flip in ci.yml).

Copy

  • Hero rewritten benefit-first.
  • Dated "Python, TypeScript, Go" language list → "Any language with an Apache Arrow library" (homepage + README).

Sign-off artifacts (repo root)

  • DOCS_ACCEPTANCE_CRITERIA.md, DOCS_REVIEW_RUBRIC.md, DOCS_USABILITY_TEST.md.

Remaining before launch (human gates)

  • Fresh-dev usability test (DOCS_USABILITY_TEST.md) — ≤20 min, unaided.
  • Senior DX reviewer rubric (DOCS_REVIEW_RUBRIC.md) signed off.
  • Tune Vale vocab on first CI run, then flip Vale to blocking.

Deploy prerequisites (one-time, outside this repo)

  • GitHub secrets CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID.
  • Cloudflare Pages project vgi-python-docs + custom domain vgi-python.query.farm.

🤖 Generated with Claude Code

rustyconover and others added 8 commits June 15, 2026 14:36
Port the docs pipeline from vgi-rpc: MkDocs + Material theme, mkdocstrings
auto-generated API reference (Griffe), D2/Mermaid diagrams, Query.Farm
palette, SEO/OpenGraph overrides, and a GitHub Actions workflow that builds
(`mkdocs build --strict`) and deploys to Cloudflare Pages.

- pyproject: add `docs` dependency group; point Documentation URL at
  https://vgi-python.query.farm/
- mkdocs.yml: VGI site metadata, full theme/extensions/plugins, nav over a
  new homepage, 14 API pages, and the 11 existing guides
- docs/api/*.md: module-level `:::` directives across functions, arguments,
  worker, client, catalogs, storage, metadata, filters, auth, observability,
  http, transactor, exceptions
- docs/index.md: homepage distilled from the README
- theme assets: overrides/main.html (VGI meta), stylesheets/extra.css,
  robots.txt, and logo/favicon/social-card derived from docs/vgi-logo.png
- .github/workflows/docs.yml: build + deploy to Pages project vgi-python-docs
- fix two docstring rendering bugs in vgi/metadata.py where interval bounds
  `[1,2][2,3]` parsed as Markdown reference links (broke --strict)
- gitignore: /site/ and /.cache/

Strict build passes with zero warnings.
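
For reviewers unfamiliar with the docstring bug above: two bracket groups written back to back, such as `[1,2][2,3]`, have the same shape as a Markdown reference link `[text][label]`. The sketch below shows one way such spots could be found and neutralised by wrapping them in an inline code span. The helper names are hypothetical, the fix assumes the text is not already inside a code span, and this commit does not say which fix was actually applied in vgi/metadata.py.

```python
import re

# Two bracketed groups with nothing between them, e.g. "[1,2][2,3]".
# A Markdown parser may read this as a reference link "[text][label]".
ADJACENT_BRACKETS = re.compile(r"\[[^\[\]\n]+\]\[[^\[\]\n]+\]")


def find_reference_link_lookalikes(docstring: str) -> list[str]:
    """Return every substring shaped like a Markdown reference link."""
    return ADJACENT_BRACKETS.findall(docstring)


def wrap_in_code_span(docstring: str) -> str:
    """Wrap each lookalike in backticks so it renders as literal text."""
    return ADJACENT_BRACKETS.sub(lambda m: "`" + m.group(0) + "`", docstring)


doc = "Example bounds: [1,2][2,3]"
print(find_reference_link_lookalikes(doc))  # ['[1,2][2,3]']
print(wrap_in_code_span(doc))               # Example bounds: `[1,2][2,3]`
```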

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…kers

Restructure docs into Diátaxis (Tutorial / How-to / Concepts / API Reference)
and deliver the headline tutorial path plus the function-patterns how-to.

- Split the long tutorial into a short overview + step 1 (scalar) + step 2
  (table), each with its own runnable worker.
- examples/: greeting_scalar_worker.py, greeting_worker.py, filter_worker.py,
  sum_worker.py — all four function patterns. scalar/table/table-in-out
  verified end-to-end over the wire via the Python Client; aggregate verified
  to import, register, and accumulate correctly.
- docs/how-to/function-patterns.md: one recipe covering all four patterns,
  embedding the verified workers via pymdownx snippets.
- docs/contributing-docs.md: the per-page orientation standard + page template.
- mkdocs.yml: four-section Diátaxis nav; how-to/concepts landing pages.
- Note: the README/tutorial table-function pattern was under-specified
  (missing @bind_fixed_schema/@init_single_worker, typed args dataclass, and
  ArrowSerializableDataclass state); the new examples use the served-correct
  form.

Strict build passes; examples lint clean and serve.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, prose, links)

- tests/test_examples_workers.py: imports every examples/*.py and serves the
  scalar / table / table-in-out workers over the wire, plus exercises the
  aggregate phases. This is the source of truth for snippet-embedded examples
  (find_examples can't run them).
- tests/test_documentation_examples.py: add the Diátaxis trees to DOC_FILES and
  skip pymdownx snippet directives (--8<--) so embedded files aren't double-run.
- .github/workflows/ci.yml: new `docs` job — mkdocs build --strict, run the doc
  example tests, lychee offline link-check (blocking), and Vale prose lint
  (non-blocking until the vocab is tuned against a real run).
- .vale.ini + .github/styles/config/vocabularies/VGI/accept.txt: Vale config.

60 doc-example assertions pass; 8 worker e2e tests pass.
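
To illustrate the snippet-skipping rule described above, here is a minimal sketch of a harness step that collects fenced Python blocks from a Markdown page and leaves out any block that only embeds a file through a `--8<--` directive. The function and constant names are hypothetical. This is not the code in tests/test_documentation_examples.py, only an approximation of the idea.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block
SNIPPET_DIRECTIVE = "--8<--"  # pymdownx snippet marker named in the commit

# A block opened with a python fence (optionally followed by an info string),
# with its body captured up to the closing fence.
PYTHON_BLOCK = re.compile(
    rf"^{FENCE}python[^\n]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def runnable_blocks(markdown: str) -> list[str]:
    """Return Python block bodies, skipping blocks that embed a file."""
    blocks = []
    for match in PYTHON_BLOCK.finditer(markdown):
        body = match.group(1)
        if SNIPPET_DIRECTIVE in body:
            continue  # the embedded file is assumed to be tested elsewhere
        blocks.append(body)
    return blocks
```

Skipping on the directive rather than on the file name keeps the rule independent of where the examples live.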

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ne notes

- DOCS_REVIEW_RUBRIC.md: senior DX sign-off checklist (orientation, fast path,
  completeness, correctness, scannability, navigation, automated gates).
- DOCS_USABILITY_TEST.md: scripted ≤20-min fresh-dev test (scalar + table from
  DuckDB) with milestone timing and a stumble log.
- api/transactor.md, api/observability.md: "Advanced — reference only" notes so
  out-of-scope topics don't read as v1 guides.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Concise catalogs/ATTACH recipe anchored on the verified greeting_worker
catalog (functions-in-a-catalog, the common case), with View/Table explained
and linked to the Catalog Interface reference for full options.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…age list

Tutorial & examples (per review feedback — strings were a confusing first
example):
- New examples/calc_scalar_worker.py (double) and examples/calc_worker.py
  (double + series), verified over the wire (double([21,5])->[42,10];
  series(3)->[0,1,2]). Replaces greeting_worker.py as the tutorial worker.
- Tutorial steps 1/2, function-patterns, catalogs how-to, and the homepage
  "See it in action" now lead with the trivial numeric `double` scalar; the
  greeting/string scalar is retained as a secondary example in
  function-patterns (greeting_scalar_worker.py).
- test_examples_workers.py: calc scalar+table e2e tests + string-scalar test.

Homepage copy:
- Hero rewritten benefit-first: "Extend DuckDB in pure Python…".
- "Any language; Python, TypeScript, Go today" -> "Any language with an Apache
  Arrow library" (homepage + README) so the list doesn't go stale.

Strict build green; 63 doc-example assertions + worker e2e pass; lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ate after

Per review feedback — the tutorial's table step jumped into serializable state,
decorators, and args all at once. Now:

- examples/calc_worker.py `series` is stateless: emits all rows in one process()
  call, then finishes. No state class, no ArrowSerializableDataclass.
- examples/series_streaming_worker.py: the stateful, chunked variant that keeps a
  cursor in state — introduced as a "Streaming with state" section in the
  function-patterns how-to, linked from the tutorial.
- tutorial step 2 simplified; a callout points to streaming-with-state for large
  outputs.
- test_examples_workers.py: stateless series still returns [0,1,2]; added an e2e
  test for the streaming variant.

Both series workers verified over the wire; strict build green; 65 doc-example
assertions pass; lint clean.
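
The difference between the two `series` variants can be sketched in plain Python. This is a simplified model with hypothetical names. It does not use the vgi decorators or ArrowSerializableDataclass, and the real workers in examples/ may be structured differently.

```python
from dataclasses import dataclass


def series_stateless(n: int) -> list[int]:
    """Emit every row in one call, then finish. No state is kept."""
    return list(range(n))


@dataclass
class SeriesState:
    """Hypothetical cursor state; a real worker would need it serializable."""
    n: int
    cursor: int = 0


def series_next_chunk(state: SeriesState, chunk_size: int) -> list[int]:
    """Emit the next chunk and advance the cursor; an empty list means done."""
    end = min(state.cursor + chunk_size, state.n)
    chunk = list(range(state.cursor, end))
    state.cursor = end
    return chunk


def series_streaming(n: int, chunk_size: int = 2) -> list[int]:
    """Drive the chunked variant to completion and collect its rows."""
    state = SeriesState(n=n)
    rows: list[int] = []
    while chunk := series_next_chunk(state, chunk_size):
        rows.extend(chunk)
    return rows


print(series_stateless(3))  # [0, 1, 2]
print(series_streaming(3))  # [0, 1, 2]
```

Both variants produce the same rows. The streaming one bounds the size of each emitted chunk, which is the case the callout for large outputs points to.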

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- how-to/state-storage.md: shared/persistent storage for distributed aggregates
  (default SQLite + cloud backends); distinguishes generator-cursor state from
  shared storage.
- how-to/http-auth.md: serve over HTTP and gate with bearer/JWT auth.
- how-to/pushdown-and-statistics.md: accept pushed-down filters and report column
  statistics for optimizer integration.
- concepts/index.md: rewritten from a stub into a real explanation — process
  model, transports, Arrow data model, call lifecycle, parallel workers.
- nav + how-to landing: surface the new recipes; demote the deep guides to
  "Reference:" entries. Removed all "under construction" notes.

Non-runnable illustrative Python is marked test="skip" so the example harness
stays green; strict build passes; lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@rustyconover rustyconover changed the title from "docs: MkDocs Material site with auto-generated API reference" to "docs: MkDocs Material site — review-ready v1 (Diátaxis, tutorial, how-tos, CI gates)" on Jun 15, 2026
rustyconover and others added 5 commits June 15, 2026 16:49
…pets

The homepage showcased the full 75-line combined worker, which is heavy for a
teaser. Now it shows just the scalar function and just the table function as two
focused snippets, extracted from examples/calc_worker.py via pymdownx snippet
sections (markers are stripped, so the full-file embeds in the tutorial/catalogs
stay clean). Links to the tutorial for the complete worker.

Verified: section markers don't leak into full-file embeds; calc_worker still
serves; doc tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Inspired by the VGI Java docs' KindBanner/KindIcon. Each function-pattern
section now opens with a tinted card showing the shape's cardinality glyph, a
formula chip, and a one-line gloss — so the shape sticks before the prose.

- docs/assets/kinds/{scalar,table,table-in-out,aggregate}.svg: static
  cardinality glyphs (rows → ƒ → value/stack), one brand colour per kind.
- docs/stylesheets/extra.css: .kind-banner styles (theme-aware, color-mix tint),
  rendered via md_in_html + attr_list — no Vue/JS needed.
- docs/how-to/function-patterns.md: a banner under each of the four sections.

Strict build green; doc-example tests pass; banners verified rendering in light
mode via screenshot.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Shapes gallery: a 5-card strip at the top of function-patterns (scalar,
  table, table-in-out, aggregate, buffering) — each a linked card with the
  cardinality glyph, name, and formula. Ported from the VGI Java KindGallery,
  rendered with md_in_html + CSS grid (no JS).
- Buffering coverage: examples/row_count_worker.py — a minimal buffering
  function (sink process -> combine -> source finalize) using params.storage
  for cross-process state. Verified over the wire (3+2 rows -> count 5) via
  client.table_buffering_function.
- New "Buffering" section + banner + buffering glyph (docs/assets/kinds/
  buffering.svg); buffering row added to the pattern table; contrast callout vs
  table-in-out.
- tests/test_examples_workers.py: buffering e2e test.

Strict build green; 67 doc/e2e assertions pass; gallery + banners verified via
screenshot.
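
The three buffering phases named above (sink process, combine, source finalize) can be modelled in a few lines of plain Python. The function signatures and the list used as storage are assumptions made for illustration. They are not the vgi API, and examples/row_count_worker.py may differ.

```python
def process(batch: list[dict]) -> int:
    """Sink phase: reduce one incoming batch to a partial row count."""
    return len(batch)


def combine(partials: list[int]) -> int:
    """Combine phase: merge the partial counts from every batch."""
    return sum(partials)


def finalize(total: int) -> list[dict]:
    """Source phase: emit the single result row."""
    return [{"count": total}]


# Two input batches of 3 and 2 rows, matching the case verified in the commit.
batches = [[{"x": 1}, {"x": 2}, {"x": 3}], [{"x": 4}, {"x": 5}]]

storage: list[int] = []  # stand-in for shared storage such as params.storage
for batch in batches:
    storage.append(process(batch))

result = finalize(combine(storage))
print(result)  # [{'count': 5}]
```

Unlike table-in-out, no output is produced until every batch has been consumed, which is why the partial results have to live in storage shared across processes.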

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the senior-DX review findings:

- Pattern count: reconcile four-vs-five — function-patterns intro now says five,
  tutorial says "three more patterns", home table adds the Buffering row.
- Home "Why VGI?" table: drop the false "single-threaded" claim; reframe the
  in-process vs. worker-process axis honestly and add an Arrow-IPC cost note.
- Taxonomy: use "pattern" in prose throughout (kept "shape" only for the visual
  glyphs); "kind" stays an internal CSS/asset name.
- Accuracy: Cloudflare DO backend uses httpx (not stdlib) — corrected. Pushdown
  how-to now uses params.current_pushdown_filters (already a decoded
  PushdownFilters), not a nonexistent params.pushdown_filters + deserialize.
- Funnel: how-to "Next steps" now link sibling how-to/concept before raw
  reference; reconcile state-storage vs shared-storage and concepts vs lifecycle.
- On-ramps: gloss vgi-serve on first use; explain the Python Client vs SQL
  ATTACH and the injected ctx; link execution_id to state storage.
- Add a troubleshooting callout to tutorial step 2; realistic DuckDB result
  header; how-to index "Reference & tooling" split; contributing-docs carve-out
  for advanced illustrative pages + tighter Next-steps rule.

Strict build green; 67 doc/e2e assertions pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ence fix

- Header logo: stop clipping to a circle — show the full wordmark at its natural
  aspect ratio (remove border-radius/object-fit:cover; height 1.8rem, width auto).
- Home: center "Built by 🚜 Query.Farm"; add the SVG cardinality glyphs as a
  "Shape" column in the Function patterns table.
- Lead lines: put "What this is" and "Who it's for" on separate lines (trailing
  <br>) across every tutorial/how-to/concept page; update the contributing-docs
  standard + template to match.
- Fix broken filter-pushdown/http-auth rendering: pymdownx.superfences mishandles
  the `test="skip"` info string and leaked the fence delimiters into the page.
  Switch illustrative blocks to plain ```python with a render-safe `# illustrative`
  sentinel; the doc-example harness skips on that marker instead. Documented the
  gotcha in contributing-docs.

Strict build green; fence leaks 0 across pages; 67 doc/e2e assertions pass.
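
A sentinel check of the kind described above could look like the sketch below. It assumes the marker must be the first non-blank line of the block. The commit does not state where the harness looks for it, and the function name is hypothetical.

```python
SENTINEL = "# illustrative"


def is_illustrative(block: str) -> bool:
    """True when the first non-blank line of a code block is the sentinel."""
    for line in block.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped == SENTINEL
    return False  # an empty block carries no marker


print(is_illustrative("# illustrative\nfilters = ...\n"))  # True
print(is_illustrative("print(1)\n"))                       # False
```

Because the marker is an ordinary Python comment, it renders as part of the code and does not depend on how the Markdown extension parses the fence's info string.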

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@rustyconover rustyconover merged commit d9553d6 into main Jun 15, 2026
8 of 9 checks passed