
QeaToXmi: build Xmi::Sparx::Root directly, drop custom XmlBuilder/Emitters #2

Open
ronaldtse wants to merge 12 commits into main from feat/qeatoxmi-via-xmi-gem


What & why

Closes TODO.next/21. Replaces the custom XML construction layer in Ea::Transformers::QeaToXmi with xmi gem model construction. The Transformer now walks the database and instantiates Xmi::Sparx::Root / Xmi::Uml::UmlModel / Xmi::Uml::PackagedElement etc., then asks the xmi gem to serialize via to_xml(use_prefix: true). The Sparx mixed-prefix style is produced natively by the xmi gem (PR lutaml/xmi#87 landed in v0.5.11).

Changes

Deleted (~1000 lines):

  • lib/ea/transformers/qea_to_xmi/xml_builder.rb — low-level Nokogiri wrapper
  • lib/ea/transformers/qea_to_xmi/writer.rb — XML shape primitives
  • lib/ea/transformers/qea_to_xmi/emitter_registry.rb — OCP dispatch registry
  • lib/ea/transformers/qea_to_xmi/sparx_namespaces.rb — namespace constants
  • lib/ea/transformers/qea_to_xmi/emitters/*.rb — 14 element-specific emitters
  • matching spec files

Rewritten:

  • transformer.rb — single class walking the database and constructing xmi gem models. Element-kind dispatch (Class vs Enumeration vs DataType vs Instance) is a single case statement in #build_classifier. Adding a new kind = one new branch, not a new file plus registry entry. Polymorphism for XMI element shape lives in the xmi gem's xmi:type discriminator on PackagedElement, not in our code.
  • context.rb — slimmed (dropped writer dependency; kept database + IdAllocator).
  • qea_to_xmi.rb — autoload list pruned.
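For readers who want to picture the dispatch shape described for transformer.rb, here is a minimal sketch of a classifier builder driven by one case statement. `QeaObject`, `PackagedElement`, and the `object_type` strings are hypothetical stand-ins for illustration, not the ea gem's or the xmi gem's real classes.

```ruby
# Hypothetical stand-in for a QEA object row; the real model lives in Ea::Qea.
QeaObject = Struct.new(:name, :object_type, keyword_init: true)

# Hypothetical stand-in for Xmi::Uml::PackagedElement: the element kind is
# carried by a type discriminator string, not by a Ruby subclass.
PackagedElement = Struct.new(:name, :type, keyword_init: true)

# Single dispatch point: supporting a new kind means adding one `when` branch.
# Unknown kinds fall through and return nil.
def build_classifier(object)
  case object.object_type
  when "Class"       then PackagedElement.new(name: object.name, type: "uml:Class")
  when "Enumeration" then PackagedElement.new(name: object.name, type: "uml:Enumeration")
  when "DataType"    then PackagedElement.new(name: object.name, type: "uml:DataType")
  when "Object"      then PackagedElement.new(name: object.name, type: "uml:InstanceSpecification")
  end
end
```

The point of the sketch is that the output model is the same Ruby type in every branch; only the discriminator value changes.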

Spec coverage added:

  • Round-trip via xmi gem parser (parse output back through Xmi::Sparx::Root.parse_xml, verify structure).
  • API stability (idempotent serialize, no database mutation).
  • Plus all pre-existing parity, mixed-prefix, GUID, well-formedness specs continue to pass.

Verification

| Check | Result |
|---|---|
| Full ea gem suite | 1931 examples, 0 failures, 37 pending |
| QeaToXmi specs | 39 examples, 0 failures |
| Plateau smoke (20251010_current_plateau_v5.1.qea) | 58 packages, 581 classes, 11 enumerations, 431 associations, 420 generalizations, 0 XML errors, 1.29 MB output, serializes in 1.9s, parses back via xmi gem in 1.7s |
| Plateau numbers vs previous impl | exact match |

Phase 2 deferred items (tracked in TODO.next/21)

These fell out of the rewrite but are intentionally not in this PR:

  1. xmi gem empty-element rendering: the xmi gem's VALUE_MAP is round-trip-oriented and forces empty-element emission (<generalization/>, <ownedEnd/>, etc.) on every collection mapping. This rewrite works around it by post-processing the output (Transformer#strip_empty_elements). The clean fix lives in the xmi gem: introduce a generation-friendly value_map. Small focused PR.
  2. xmi gem attribute gaps: visibility on Property/Operation/Parameter, isAbstract on Class, classifier on InstanceSpecification, aggregation on OwnedEnd. Each is a small PR to the xmi gem.
  3. xmi gem missing models: Slot, OpaqueExpression, InterfaceRealization. Currently dropped because the xmi gem has no models for them.
  4. File-size refactor: transformer.rb is 503 lines (369 LOC), over the ~300 guideline. Could split into Transformer + ElementFactory + RelationshipFactory. Deferred because the class is conceptually cohesive (one orchestrator with private walk methods) and splitting now would risk regressions for cosmetic gain.
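To make the empty-element workaround in item 1 concrete, here is a rough sketch of pruning empty elements from a tree in one post-order pass. It uses a hypothetical `Node` struct rather than a real XML library, and it is not the body of Transformer#strip_empty_elements; the definition of "empty" (no attributes, no text, no children) is an assumption.

```ruby
# Hypothetical in-memory element; a real pass would operate on parsed XML.
Node = Struct.new(:name, :attributes, :text, :children, keyword_init: true) do
  # ASSUMPTION: "empty" means no attributes, no non-blank text, no children.
  def empty?
    attributes.empty? && text.to_s.strip.empty? && children.empty?
  end
end

# Post-order walk: children are pruned before their parent is inspected, so a
# wrapper that only held empty children becomes empty and is dropped by its own
# parent in the same pass. The node passed in (the root) is never removed.
def strip_empty(node)
  node.children.each { |child| strip_empty(child) }
  node.children.reject!(&:empty?)
  node
end
```

Because each node is visited once, the cost is linear in the number of elements.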

Consumer impact

The serialized output is structurally identical to the previous XmlBuilder-based output (same EAIDs, same hierarchy, same element counts, same xmi:type discriminators). The bytes differ slightly because of the Phase 2 gaps above. Any consumer that round-trips through Xmi::Sparx::Root.parse_xml will see no difference.

ronaldtse added 12 commits June 27, 2026 22:54
Standalone Ruby gem for parsing Sparx Enterprise Architect data files
(QEA SQLite database and Sparx-flavored XMI). Namespace: `Ea::*`. The
gem is usable standalone; `lutaml-uml` is an optional dependency for
the EA-to-UML bridge.

Subsystems:
- `Ea::Qea` — SQLite-based EA database parser. Models, infrastructure,
  services, immutable Database container with hash indexes, repositories,
  factory (EA→UML via TransformerRegistry), validation, verification.
- `Ea::Xmi` — Sparx-only XMI parser (uses `::Xmi::Sparx::Root`).
  Cannot parse MagicDraw or Papyrus XMI; `.xmi` registration uses
  content detection to avoid claiming generic XMI.
- `Ea::Diagram` — Style resolution, layout, path building, element
  renderers, SVG extractor. StyleResolver is the single entry point;
  StyleParser holds only the BGR→hex color utility (MECE: orchestration
  vs parsing).
- `Ea::Transformations` — FormatRegistry-based parser dispatch with
  BaseParser template method. QEA and XMI parsers both produce
  `Lutaml::Uml::Document`.
- `Ea::Transformers` — QEA→XMI and UML→XMI emitters.
- `Ea::Cli` — Thor-based CLI with JSON/YAML/table output formatters.

Bridge to lutaml-uml uses composition (`Repository.from_document`) with
lazy requires inside method bodies — zero load-time cross-requires.

Config files in `config/`:
- `qea_schema.yml` — EA database table/column definitions
- `diagram_styles.yml` — default diagram styling
- `model_transformations.yml` — parser configurations

Specs: 1953 examples, 0 failures, 37 pending. No doubles, no
`send` on private methods, no `instance_variable_get/set`, no
`respond_to?`/`method_defined?`, no `require_relative` in lib/
(autoload only, declared in immediate parent namespace file).

- `examples/` — QEA standalone queries, QEA→Repository bridge, LUR
  workflows, and a real-QEA smoke test. Demonstrates both standalone
  usage (no lutaml-uml) and the optional UML bridge.
- `exe/ea` — CLI entry point.
- `docs/` — EA→UML type mapping reference and XMI↔QEA conversion
  capability notes.
- `.github/workflows/` — `rake` (test), `release`, `docs`, `link-check`.
  Replaces the generic `main.yml`.

Migrates all planning items to the numbered TODO.next/ format. Each
file carries a status header (DONE / PARTIALLY DONE / DESIGN-CORRECT)
and an explicit "what was applied" or "why deferred" section.

Items closed this session:
- 03 slim lutaml-uml — cross-requires eliminated (composition-based
  Repository.from_document API; zero load-time ea/lutaml-uml requires).
- 11 style MECE — stripped dead StyleParser API (6 unused methods,
  3 unused constants); StyleResolver owns EA-data-driven resolution,
  Configuration owns YAML-driven resolution.
- 15 exception narrowing — diagram Configuration now rescues only
  Psych::SyntaxError/Errno::ENOENT/EACCES/IOError; documented the
  intentionally-broad rescues at trust boundaries (DatabaseLoader
  callback isolation, per-record resilience in parsers).
- 16 repository indexes — TransformationEngine switched from
  unshift+pop to push+shift (O(1) append, amortized O(1) overflow);
  BaseRepository find already O(1) via lazy PK index; remaining
  O(n) scans audited and justified for EA repository sizes.
- 17 spec quality — stdlib method shadowing on BaseRepository audited
  as intentional ActiveRecord-like API design.
- 18 XMI architecture — Sparx-only Ea::Xmi::Parser is design-correct;
  each tool gets its own parser gem built on the xmi gem schemas.
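As an illustration of the push+shift change noted under item 16, here is a minimal sketch of a bounded history that appends at the tail and evicts from the head. `BoundedHistory` is a hypothetical class for illustration; it is not the gem's TransformationEngine.

```ruby
# Hypothetical bounded history buffer, oldest entry first.
class BoundedHistory
  def initialize(limit)
    @limit = limit
    @entries = []
  end

  # Append at the tail, then evict the oldest entry only when over the limit.
  def record(entry)
    @entries.push(entry)
    @entries.shift if @entries.size > @limit
    self
  end

  # Return a copy so callers cannot mutate the internal buffer.
  def to_a
    @entries.dup
  end
end
```

Usage: `BoundedHistory.new(100).record(event)` keeps at most the 100 most recent events.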

Items already DONE in prior sessions: 00, 01, 02, 04 to 10, 12 to 14.

Two bugs:

1. `lutaml-uml` was declared as a hard runtime dependency, defeating
   the documented standalone design. The UML bridge already lazy-
   requires `lutaml/uml` inside `Ea::Qea.require_uml!` with a clear
   LoadError rescue, so users who only want QEA/XMI parsing shouldn't
   be forced to install `lutaml-uml` and its dep tree.

2. `lutaml-model` and `lutaml-path` were missing from the gemspec,
   relying on transitive pull via `lutaml-uml` -> lutaml-path /
   lutaml-model. Once `lutaml-uml` moved to dev-only, 11 specs broke
   with `LoadError: lutaml/path`. Both are load-time requires in
   `lib/ea/qea/models.rb`, `lib/ea/qea/models/base_model.rb`,
   `lib/ea/qea/services/configuration.rb`,
   `lib/ea/transformations/configuration.rb`, and `lib/ea/xmi/parser.rb`.
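The lazy-require pattern behind bug 1 can be sketched roughly as follows. `LazyBridge`, `require_optional!`, and `MissingDependency` are hypothetical names for illustration; the gem's own entry point is `Ea::Qea.require_uml!`, whose body is not shown here.

```ruby
module LazyBridge
  # Raised when an optional dependency is not installed.
  class MissingDependency < StandardError; end

  # Require the optional library only when the feature is actually used, and
  # turn a bare LoadError into an actionable message.
  def self.require_optional!(feature, gem_name)
    require feature
    true
  rescue LoadError
    raise MissingDependency,
          "#{gem_name} is required for this feature; add it to your Gemfile"
  end
end
```

Calling the guard inside method bodies, rather than at file load, is what lets the library load without the optional gem installed.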

Fix:
- Remove runtime `lutaml-uml` dep
- Add dev `lutaml-uml` dep (for spec suite)
- Add runtime `lutaml-model` and `lutaml-path` deps

Suite: 1953 examples, 0 failures, 37 pending (unchanged).

Documented in TODO.next/19.

Two CI fixes for the ea gem, which has no frontend/ directory:

1. rake.yml and release.yml referenced 'cd frontend && npm install &&
   npm run build' inherited from a Cimas template that assumed a JS
   frontend. ea is pure-Ruby, so the steps failed with 'cd: no such
   file or directory: frontend'. Removed from both workflows.

2. Gemfile.lock only declared the arm64-darwin-23 platform (the
   developer's macOS). CI runs on x86_64-linux and ruby, causing
   'Could not find compatible gem' for native extensions (ffi,
   nokogiri, sqlite3). Added x86_64-linux, x86_64-linux-gnu, and
   ruby platforms via 'bundle lock --add-platform'.

Three coupled changes so CI fails loudly with a useful error instead of
the opaque 'path does not exist' message:

1. Gemfile: lutaml-uml and canon are sibling-repo path dependencies
   in local dev (../lutaml-uml, ../canon). When the sibling checkout
   doesn't exist (CI, gem install), fall back to the published
   rubygems versions. EA_FORCE_RUBYGEMS=1 forces rubygems mode locally
   to reproduce the CI-resolved dependency set.

2. ea.gemspec: pin the dev dep to '~> 0.2.0'. The bridge code and
   spec suite target the pre-1.0 API (Lutaml::Uml::UmlClass,
   Lutaml::UmlRepository). 1.x renamed these constants; bridge work
   is needed before unpinning.

3. Gemfile.lock regenerated against rubygems so CI installs published
   versions rather than failing on missing sibling paths.
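A rough sketch of the sibling-path fallback from item 1, written as a plain helper so it can be exercised outside Bundler. `gem_args` is a hypothetical name and the real Gemfile may be structured differently; only the `EA_FORCE_RUBYGEMS=1` switch and the `../<name>` layout come from the description above.

```ruby
# Build the argument list for a Gemfile `gem` call.
# Uses the sibling checkout when it exists, unless EA_FORCE_RUBYGEMS=1 is set;
# otherwise falls back to the published version requirement.
def gem_args(name, requirement, root:, env: ENV)
  sibling = File.expand_path(File.join("..", name), root)
  if env["EA_FORCE_RUBYGEMS"] != "1" && File.directory?(sibling)
    [name, { path: sibling }]
  else
    [name, requirement]
  end
end
```

In a Gemfile this could be used as `gem(*gem_args("lutaml-uml", "~> 0.2.0", root: __dir__))`, so local development and CI share one declaration.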

CI will still fail at spec_helper load: 'cannot load such file --
lutaml/uml_repository'. No published lutaml-uml version ships that
file (audited 0.2.0, 0.2.12, 0.3.0, 0.4.3, 1.0.0). Unblock requires
a new lutaml-uml release containing uml_repository. Details and
follow-up architecture in TODO.next/20.

Documents the rationale, scope, and Phase 2 deferred work for
replacing the custom XmlBuilder/Writer/Emitters layer with xmi gem
model construction. The companion implementation commit follows.

…/Emitters

Replace the custom XML construction layer with xmi gem model
construction. Transformer walks Ea::Qea::Database and instantiates
Xmi::Uml::UmlModel / PackagedElement / OwnedAttribute / OwnedEnd /
MemberEnd / AssociationGeneralization / OwnedLiteral / OwnedComment /
OwnedOperation / OwnedParameter / UpperValue / LowerValue, then asks
the xmi gem to serialize via to_xml(use_prefix: true). The Sparx
mixed-prefix style (root and Documentation prefixed, all other UML
children unprefixed) is produced natively by the xmi gem.

Deleted (~1000 lines):
- xml_builder.rb, writer.rb, emitter_registry.rb, sparx_namespaces.rb
- emitters/{base, package, class, enumeration, data_type, instance,
  attribute, operation, association, generalization, realization,
  dependency, comment, slot}_emitter.rb
- matching spec files for deleted modules

Element-kind dispatch (Class vs Enumeration vs DataType vs Instance)
lives in a single case statement in Transformer#build_classifier —
adding a new kind is one new branch, not a new file plus registry
entry. Polymorphism for XMI element shape lives in the xmi gem's
models (xmi:type discriminator on PackagedElement), not in our code.

Context slimmed: drops Writer dependency, keeps Database + IdAllocator.

Plateau smoke (20251010_current_plateau_v5.1.qea) matches the previous
implementation exactly: 58 packages, 581 classes, 11 enumerations,
431 associations, 420 generalizations, 0 XML errors, ~1.3 MB output,
serializes in 1.9s, parses back via xmi gem in 1.7s.

Full ea gem suite: 1931 examples, 0 failures, 37 pending.

Phase 2 deferred items (tracked in TODO.next/21):
- xmi gem empty-element rendering (currently post-processed via
  Transformer#strip_empty_elements)
- xmi gem attribute gaps (visibility, isAbstract, classifier,
  aggregation, direction)
- xmi gem missing models (Slot, OpaqueExpression, InterfaceRealization)
- File-size refactor of transformer.rb (503 lines; could split into
  Transformer + ElementFactory + RelationshipFactory)

Audit-driven refactor of Ea::Transformers::QeaToXmi addressing all
findings flagged in code review. Each TODO is fully implemented and
verified; full suite passes 1995 examples, 0 failures, 37 pending
(up from 1931 — added 64 new specs).

Critical fixes
-------------
22  Strip respond_to? from transformer_spec.rb — replaced with
    explicit is_a?(::Xmi::Uml::PackagedElement) check per project rule.
23  Clean IdAllocator: drop dead LITERAL_UNLIMITED constant, drop
    for_multiplicity (ignored its first arg), DRY-merge into allocate.
24  Tighten parity specs: assert exact class/enum/data_type/instance
    counts via transformer_type filter instead of loose range check.
25  Sparx-conformant EAID format for synthesised IDs:
    EAID_LI000001__<guid_tail> matches real Sparx byte-for-byte
    (was bare LI000009). Parent GUID now passed to IdAllocator.
26  Always emit upperValue/lowerValue on Property (defaults lower=0,
    upper=-1 for blank QEA fields). Association-end path is blocked
    on the xmi gem's OwnedEnd schema gap (tracked separately).
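To illustrate the synthesised ID shape from item 25, here is a small sketch of a counter-based allocator. `IdAllocatorSketch` is hypothetical, and the definition of "guid tail" used here (the parent GUID without braces and without its first group, hyphens turned into underscores) is an assumption, not something confirmed by the real IdAllocator.

```ruby
# Hypothetical allocator producing EAID_<prefix><6-digit counter>__<guid tail>.
class IdAllocatorSketch
  def initialize
    # One independent counter per prefix (e.g. "LI").
    @counters = Hash.new(0)
  end

  # ASSUMPTION: the tail is the parent GUID minus braces and its first group,
  # with hyphens replaced by underscores.
  def allocate(prefix, parent_guid)
    @counters[prefix] += 1
    tail = parent_guid.delete("{}").split("-").drop(1).join("_")
    format("EAID_%s%06d__%s", prefix, @counters[prefix], tail)
  end
end
```

Passing the parent GUID in is what ties a synthesised ID to its owner instead of leaving a bare counter value.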

Architecture improvements
-------------------------
27  Extract Cardinality module: pure-function bound parsing lives in
    its own file. Includes normalize_upper/lower + parse + UNLIMITED_TOKENS.
28  Extract XmlSanitizer: single-pass depth-first post-order removal
    of empty elements (was O(N^2) while-loop). Never strips root.
29  OCP registry for classifier builders: CLASSIFIER_BUILDERS hash of
    lambdas dispatched via instance_exec. Adding a new kind = one
    entry, no method change. No send/public_send used.
30  AssociationEnd Struct replaces ad-hoc {xmi_id:, model:} Hash.
    Typos now raise NoMethodError instead of silently returning nil.
33  normalize_lower was identity; now normalises empty -> "0" matching
    UML unspecified-lower-bound convention.
34  Document Sparx member-end ordering (destination first, source
    second — round-trip depends on it). RETURN_PARAMETER = "RT"
    constant added to IdAllocator's documented Sparx prefixes.
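Items 29 and 30 can be pictured with the following sketch: a frozen hash of lambdas dispatched through `instance_exec`, plus a Struct for association ends. `TransformerSketch`, `Element`, and the row keys are hypothetical; only the names `CLASSIFIER_BUILDERS` and `AssociationEnd` and the fields `xmi_id` / `model` come from the notes above.

```ruby
# Struct-based value object: a misspelled accessor raises NoMethodError,
# whereas a misspelled Hash key would quietly return nil.
AssociationEnd = Struct.new(:xmi_id, :model, keyword_init: true)

class TransformerSketch
  # Hypothetical output model.
  Element = Struct.new(:name, :type, keyword_init: true)

  # Registry keyed by EA object type. The lambdas are run with instance_exec,
  # so `self` inside them is the transformer and private helpers are reachable
  # without send/public_send.
  CLASSIFIER_BUILDERS = {
    "Class"       => ->(row) { element(row, "uml:Class") },
    "Enumeration" => ->(row) { element(row, "uml:Enumeration") }
  }.freeze

  # Returns nil for kinds with no registered builder.
  def build_classifier(row)
    builder = CLASSIFIER_BUILDERS[row.fetch(:object_type)]
    builder && instance_exec(row, &builder)
  end

  private

  def element(row, type)
    Element.new(name: row.fetch(:name), type: type)
  end
end
```

Adding a kind is then one new hash entry; `build_classifier` itself stays unchanged.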

Spec coverage expansion
-----------------------
31  New id_allocator_spec.rb (157 lines): allocate counter, prefix,
    seed memoisation, parent_guid incorporation, well-known constants.
32  Phase 2 sentinel specs assert visibility / isAbstract / aggregation
    / classifier / upperValue-on-ownedEnd are absent today. When the
    xmi gem adds support, these flip to positive assertions.
+   New cardinality_spec.rb and xml_sanitizer_spec.rb for the two
    extracted modules (118 and 95 lines respectively).
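For context on what cardinality_spec.rb exercises, here is a minimal sketch of pure-function bound parsing in the spirit of items 26, 27, and 33. `CardinalitySketch` is hypothetical: the contents of `UNLIMITED_TOKENS` and the single-token rule ("3" means 3..3) are assumptions, while the blank defaults (lower 0, upper -1) follow the notes above.

```ruby
module CardinalitySketch
  # ASSUMPTION: tokens read as "unbounded"; the real list may differ.
  UNLIMITED_TOKENS = ["*", "-1", "n"].freeze

  module_function

  # Blank lower bound becomes "0".
  def normalize_lower(value)
    text = value.to_s.strip
    text.empty? ? "0" : text
  end

  # Blank or unlimited upper bound becomes "-1".
  def normalize_upper(value)
    text = value.to_s.strip
    (text.empty? || UNLIMITED_TOKENS.include?(text)) ? "-1" : text
  end

  # Accepts nil, "", "1", "*", "0..1", "1..*"; returns [lower, upper] strings.
  def parse(cardinality)
    lower, upper = cardinality.to_s.strip.split("..", 2)
    if upper.nil?
      # Single token: blank or "*" means 0..unbounded, "3" means 3..3.
      return ["0", "-1"] if lower.nil? || UNLIMITED_TOKENS.include?(lower)
      return [lower, lower]
    end
    [normalize_lower(lower), normalize_upper(upper)]
  end
end
```

Keeping these as side-effect-free functions is what makes them straightforward to cover exhaustively in a standalone spec file.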

File sizes in lines (transformer.rb was 503 lines):
  lib/ea/transformers/qea_to_xmi/transformer.rb    469 (-34)
  lib/ea/transformers/qea_to_xmi/cardinality.rb     96 (new)
  lib/ea/transformers/qea_to_xmi/xml_sanitizer.rb   68 (new)
  lib/ea/transformers/qea_to_xmi/association_end.rb 19 (new)
  lib/ea/transformers/qea_to_xmi/id_allocator.rb    92 (+16)

Rule compliance verified:
  - No send / __send__ / public_send in lib/
  - No instance_variable_get/set
  - No respond_to? (the one match is in a comment explaining why)
  - No require_relative in lib/
  - No doubles in specs
  - Autoload-only structure preserved in qea_to_xmi.rb

Depends on the xmi gem refactor branch `refactor/owned-end-schema-gap`
which unified OwnedEnd's schema with OwnedAttribute and added the
missing UML models (Slot, OpaqueExpression, InterfaceRealization).
The Gemfile now treats xmi as a sibling path dep alongside lutaml-uml
and canon.

Closes TODO 26 fully (was PARTIAL — association-end path was blocked
on the xmi gem gap). Implements Phase 2 items from TODO 21 §2 and §3.

Visibility module (lib/ea/transformers/qea_to_xmi/visibility.rb)
----------------------------------------------------------------
Pure-function mapper from EA's integer scope/containment codes to
UML visibility / aggregation wire strings.

- Visibility.from_scope(int) — Public/Private/Protected/Package
- Visibility.aggregation_from_containment(int) — nil/shared/composite
- Visibility.boolean_from_flag("1"/"0") — "true"/"false"
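A rough sketch of what such a pure-function mapper could look like. `VisibilitySketch` is hypothetical: the integer codes in both lookup tables are placeholders chosen for illustration, the lowercase output strings are an assumption about the wire format, and treating any flag other than "1" as false is also an assumption.

```ruby
module VisibilitySketch
  # ASSUMPTION: placeholder scope codes; the real EA codes are not shown here.
  SCOPES = { 0 => "public", 1 => "private", 2 => "protected", 3 => "package" }.freeze

  # ASSUMPTION: 1 = shared, 2 = composite; anything else means no aggregation.
  AGGREGATIONS = { 1 => "shared", 2 => "composite" }.freeze

  module_function

  # Returns the visibility string, or nil for an unknown code.
  def from_scope(code)
    SCOPES[code]
  end

  # Returns "shared", "composite", or nil when there is nothing to emit.
  def aggregation_from_containment(code)
    AGGREGATIONS[code]
  end

  # Maps a "1"/"0" flag column to an XML boolean string.
  def boolean_from_flag(flag)
    flag.to_s == "1" ? "true" : "false"
  end
end
```

Returning nil for "nothing to emit" lets the caller skip the attribute entirely instead of writing an empty value.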

Transformer wiring
------------------
- build_attribute: emit visibility, is_static, is_ordered, is_derived.
- build_operation: emit visibility, is_static, is_abstract, is_query,
  concurrency.
- build_class: emit visibility, is_abstract.
- build_enumeration / build_data_type: emit visibility.
- build_instance: emit visibility, classifier (from pdata1), slot
  (Phase 1 emits empty array — Phase 2 will walk RunState).
- build_association_end: emit aggregation (from source/dest
  containment). upperValue/lowerValue now actually serialize
  correctly thanks to the xmi gem schema migration.

Spec changes
------------
- visibility_spec.rb (new, 21 examples): full coverage of the three
  Visibility mapper methods.
- transformer_spec.rb: split the old "Phase 2 gaps" sentinel block
  into two:
    * "Phase 2 wiring (xmi gem schema migration landed)" — positive
      assertions for visibility on Property/Operation, isAbstract on
      packagedElement, upperValue/lowerValue on ownedEnd.
    * "Phase 2 gaps still deferred" — aggregation and classifier
      remain negative because basic.qea doesn't carry the relevant
      data; flip when a fixture exposes them.

TODOs
-----
- 26: marked fully DONE; documents both the ea-side fix and the
  xmi gem schema migration.
- 32: marked DONE; documents the sentinel-flipping pattern.

Verification
------------
- qea_to_xmi specs: 124 examples, 0 failures
- Full ea suite: 2016 examples, 0 failures, 37 pending
- Output now emits visibility on 102 attributes + 15 operations,
  isAbstract on 65 classes, upperValue/lowerValue on 102 attributes
  + 80 association ends (was 102 before xmi gem schema migration).

The xmi gem's generation-friendly VALUE_MAP (commit on
refactor/owned-end-schema-gap) means the serializer skips
empty/nil elements at the source. The XmlSanitizer post-processing
pass is no longer necessary.

Changes:
- lib/ea/transformers/qea_to_xmi/transformer.rb: `serialize` now
  returns `build_root.to_xml(use_prefix: true)` directly. No
  re-parse, no element mutation, no second serialization.
- lib/ea/transformers/qea_to_xmi.rb: drop XmlSanitizer autoload.
- lib/ea/transformers/qea_to_xmi/xml_sanitizer.rb: deleted.
- spec/ea/transformers/qea_to_xmi/xml_sanitizer_spec.rb: deleted.
- TODO.next/28 updated to document the full lifecycle
  (extracted → superseded → deleted).

Pipeline before:
  build model → to_xml → Nokogiri::XML → mutate → to_xml

Pipeline now:
  build model → to_xml

Closes TODO 21 §1 — xmi gem empty-element rendering (architectural
debt). The 3-line post-processing pass the ea gem used to do
(re-parse → remove empties → re-serialize) is gone.

Verification:
- Full ea suite: 2005 examples, 0 failures, 37 pending.
- Output contains zero truly-empty elements (verified with
  `xml.scan(/<generalization\s*\/>/).size == 0` and same for
  ownedEnd / other collections).
- Round-trip via Xmi::Sparx::Root.parse_xml still succeeds.
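The empty-element check in the verification list can be generalised along these lines. `empty_element_counts` is a hypothetical helper built around the same `scan` regex quoted above; it only counts self-closing tags with no attributes.

```ruby
# Count self-closing, attribute-free occurrences of each element name.
# An element such as <ownedEnd xmi:id="x"/> is not counted because it
# carries an attribute and is therefore not "truly empty".
def empty_element_counts(xml, names)
  names.to_h do |name|
    [name, xml.scan(%r{<#{Regexp.escape(name)}\s*/>}).size]
  end
end
```

A spec could then assert that `empty_element_counts(xml, %w[generalization ownedEnd]).values.all?(&:zero?)` holds for the serialized output.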