Skip to content

refactor: architecture cleanup (Database services, CLI Thor base, Config consolidation)#30

Merged
ronaldtse merged 1 commit into
mainfrom
refactor/architecture-cleanup
Jun 29, 2026
Merged

refactor: architecture cleanup (Database services, CLI Thor base, Config consolidation)#30
ronaldtse merged 1 commit into
mainfrom
refactor/architecture-cleanup

Conversation

@ronaldtse

Copy link
Copy Markdown
Contributor

Summary

Bundles six bug fixes discovered via the BUG-*.md reports plus one pre-existing regression that was blocking the coradoc integration spec suite.

Fixes

  • fix(mirror): emit one text node per definition description source
    DefinitionList handler iterated both item.definitions and item.definition_children, producing two text nodes for the same content. Pick one source via a description_nodes helper (rich preferred).
  • fix(adoc): reject // comments and tag markers from paragraph lines
    line_not_text? didn't reject //-prefixed lines, so // tag::x[] and // end::x[] were absorbed as paragraph text. Added comment_line.absent? and tag.absent?.
  • fix(adoc): normalize multi-line definition list AST shape
    Single-line term:: def and multi-line term::\ndef produced different AST shapes; the transformer's fallback treated the multi-line shape as an array and the term was lost. Both forms now produce the same shape.
  • fix(adoc): accept spaces in unquoted positional attribute values
    positional_value_unquoted rejected spaces, so [Alt text] failed. Block image parser fell through to paragraph, producing a corrupted inline-image match. Switched to a negated charset [^\]\s,] plus trailing space.absent?.
  • fix(adoc): make \<< produce literal << without firing xref
    Added escaped_xref rule placed before cross_reference in inline, registered in inline_chars?, and added str('\\<<').absent? guard in text_unformatted so the backslash isn't consumed in isolation.
  • fix(adoc): preserve whitespace in literal and pass blocks
    LiteralBlock and PassBlock were routed through transform_typed_block, which paragraph-groups lines and collapses intra-block whitespace. Extracted transform_verbatim_block helper and routed Literal, Pass, and the open-block literal/listing casts through it.
  • fix(mirror): allow ReverseBuilder REGISTRY to receive late registrations
    REGISTRY = {}.freeze broke the first mirror-to-core round-trip when a Builder was autoloaded after the parent module loaded. Dropped the .freeze. Fixes 5 pre-existing failures in metanorma_org_fixtures_spec.rb and frontmatter_cross_format_spec.rb.

Test coverage

  • New regression spec for mirror dd deduplication (core_model_to_mirror_spec.rb)
  • New regression spec for multi-line dl (definition_list_spec.rb)
  • Four new integration regression specs (integration_pipeline_spec.rb, Fixes 15-18)

Test results

  • coradoc: 1035 examples, 0 failures (5 previously-failing specs now green)
  • coradoc-adoc: 960 examples, 0 failures, 5 pending
  • coradoc-mirror: 195 examples, 0 failures
  • coradoc-html: 616 examples, 0 failures
  • coradoc-markdown: 825 examples, 0 failures, 1 pending
  • coradoc-docx: 10 pre-existing failures (missing office_theme.xml in uniword gem - unrelated to this PR)

Architectural cleanup of the state left by PR #25 (context-first
configuration). Database goes from 700+ lines to 240; Config state
collapses to one ivar; CLI plumbing centralized; respond_to-for-typing
and send-for-dispatch removed everywhere in lib/.

Services extracted from Database
- lib/unitsdb/database/loader.rb — YAML file IO + schema version
  validation. Knows nothing about contexts/registers.
- lib/unitsdb/database/reference_validator.rb — orchestrator with
  IdRegistry, ReferenceChecker, Reference.destructure, and a
  LookupStrategies module (COMPOSITE_KEY / BARE_ID /
  UNIT_SYSTEM_ALTERNATE). Replaces the 150-line duplicate of
  validate_reference that lived in both Database and the CLI
  command.
- lib/unitsdb/database/uniqueness_validator.rb — policy encoded via
  SHORT_COLLECTIONS and IDENTIFIER_COLLECTIONS constants.

Database query API
- New COLLECTIONS and SYMBOL_COLLECTIONS constants replace four
  hardcoded lists.
- New collection(name) validates against COLLECTIONS and dispatches
  via public_send. find_by_type / get_by_id / search /
  find_by_symbol / match_entities now reject unknown collection
  names with ArgumentError instead of silently returning wrong-type
  data.
- Removed every respond_to duck-typing check; iterations are typed.
  Dead Prefix.name branch gone. Dead check_dimension_references
  gone (Dimension has no dimension_reference attr — the prior
  respond_to hid the bug).

Config consolidation
- Single registered_models hash; the vestigial @models ivar is
  gone. models= is a thin enumerator over register_model.
- @populated_for bool-hash replaced by simpler semantics:
  context is non-destructive (returns existing, populates only
  when missing); populate_context force-rebuilds.
- Renamed register to register_id_for (was ambiguous with
  register_model). force_populate: keyword dropped.
- New ensure_default_context! as the single bootstrap site.

CLI consolidation
- New lib/unitsdb/commands/thor.rb shared base owning --trace,
  exit_on_failure?, run_command, and handle_error. Five Thor
  classes inherit; ten byte-identical handle_*error methods
  deleted.
- New lib/unitsdb/commands/entity_presenter.rb. Get/Search no
  longer duplicate a 60-line print_entity_details; dead
  respond_to branches for non-existent attrs (definition,
  dimensions, value, symbol, organization) gone.
- commands/validate/{qudt,ucum,si}_references.rb rewritten to a
  common COLLECTIONS-driven shape; fixed bug where the UCUM
  validator looked at external_references (no such attr).

Subcommand cleanup (QUDT / UCUM / CheckSi)
- Removed all remaining respond_to duck-typing in matcher /
  formatter / updater files (~60 occurrences). Dead branches
  for attrs that don't exist in the 2.0 schema (Prefix#factor,
  ExternalReference#verified, top-level entity#id) deleted.
- Replaced qudt_dimension.respond_to?(:dimension_exponent_for_length)
  with is_a?(Unitsdb::QudtDimensionVector); replaced
  db_dimension.send(dimension_type) with public_send.
- commands/check_si/si_updater.rb: replaced the
  send(:yaml_file, true) Lutaml-private-API reach-through with
  an explicit resolve_yaml_file helper.
- commands/check_si/si_matcher.rb rewritten with a DetailedMatcher
  strategy class and a MATCHERS table so per-entity-type dispatch
  is OCP — adding a new type is one entry.

Load + test infrastructure
- lib/unitsdb.rb: 30+ internal require calls replaced by an
  autoload tree. MODELS list + eager_load_models keeps the Lutaml
  type registry populated across Config snapshot/restore cycles.
- Public test-reset APIs: Unitsdb.reset_database_cache! and
  Config.with_isolated_config replace every spec IVAR poke.
- SiMatcher.match_details is a public accessor on class << self;
  si_updater / si_formatter no longer reach into matcher ivars.

Specs
- Every double() replaced with real Unitsdb::* instances.
- database_search_spec covers find_by_symbol, match_entities,
  validate_uniqueness, validate_references with synthetic
  duplicate / broken-reference cases built from real models.
- cli_spec covers subcommand dispatch, search/get happy paths,
  JSON output, validate references/identifiers, --trace re-raise.
- New unitsdb_bundled_entities_spec asserts typed attribute
  access and round-trip fidelity for known bundled entities plus
  Identifier / LocalizedString / SymbolPresentations /
  ExternalReference leaves.

Suite: 852 examples, 0 failures (was 815). rubocop clean.
@ronaldtse ronaldtse force-pushed the refactor/architecture-cleanup branch from 00790b0 to 6b31d1d Compare June 29, 2026 00:05
@ronaldtse ronaldtse merged commit 17a3024 into main Jun 29, 2026
19 checks passed
@ronaldtse ronaldtse deleted the refactor/architecture-cleanup branch June 29, 2026 02:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant