Round-trip fidelity
Per-tag fidelity audit + three-way harness vs LibreOffice and OnlyOffice.
How we make sure a .docx opened and saved by Casual Editor stays faithful to Microsoft Word’s representation.
For deeper internals, see internal/01-fidelity-gaps.md and internal/03-gap-matrix.md.
The pipeline
file.docx
   │ unzip
   ▼
parts/document.xml  parts/styles.xml  parts/numbering.xml ...
   │ parse (packages/core/src/docx/)
   ▼
Document model ──→ toProseDoc ──→ ProseMirror state ──→ layout-painter (visible)
                                          │
                                          │ edits
                                          ▼
                 fromProseDoc ←── ProseMirror state
                      │
                      ▼
               Document model
                      │ serialize
                      ▼
        document.xml + styles.xml + …
                      │ rezip
                      ▼
                  file.docx
Round-trip succeeds when the re-serialized bytes are semantically equivalent to the original. The audit script in docx-editor/scripts/roundtrip-audit.mjs parses every fixture, re-serializes, and tag-diffs document.xml to flag silently-dropped elements.
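The tag-diff idea can be sketched in a few lines of TypeScript. This is a minimal illustration, not the code in roundtrip-audit.mjs: the names `countTags`, `diffTags`, and `TagDrop` are hypothetical, and the regex-based tag scan is an assumption made to keep the example self-contained (it does not special-case comments or CDATA).

```typescript
// Hypothetical shape for one reported drop: a tag whose count went down.
interface TagDrop {
  tag: string;
  before: number;
  after: number;
}

// Count opening and self-closing tags by qualified name (e.g. "w:p").
// Closing tags, comments, and the XML declaration do not match the pattern.
function countTags(xml: string): Map<string, number> {
  const counts = new Map<string, number>();
  const openTag = /<([A-Za-z_][\w.-]*(?::[\w.-]+)?)(?=[\s/>])/g;
  let match: RegExpExecArray | null;
  while ((match = openTag.exec(xml)) !== null) {
    const tag = match[1];
    counts.set(tag, (counts.get(tag) ?? 0) + 1);
  }
  return counts;
}

// Report every tag that occurs fewer times after the round-trip than before.
function diffTags(originalXml: string, roundTrippedXml: string): TagDrop[] {
  const before = countTags(originalXml);
  const after = countTags(roundTrippedXml);
  const drops: TagDrop[] = [];
  before.forEach((count, tag) => {
    const remaining = after.get(tag) ?? 0;
    if (remaining < count) {
      drops.push({ tag, before: count, after: remaining });
    }
  });
  return drops;
}

// Toy input: a bold run, and the same run with its run properties lost.
const boldRun = "<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Hi</w:t></w:r></w:p>";
const boldRunLossy = "<w:p><w:r><w:t>Hi</w:t></w:r></w:p>";

console.log(diffTags(boldRun, boldRun)); // [] (nothing dropped)
console.log(diffTags(boldRun, boldRunLossy)); // reports w:rPr and w:b
```

Counting only decreases keeps the report focused on silent drops; a serializer that adds elements would need a separate check.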
What’s covered
- WordprocessingML core — paragraphs, runs, tables, lists, sections, hyperlinks, footnotes/endnotes, custom XML, math equations
- DrawingML — pictures, shapes, textboxes (modern + VML fallback),
wpg:wgp groups with per-child positioning and rotation/flip, decorative shapes, connector lines, image hyperlinks
- Comments and tracked changes
- Styles — paragraph + character + theme colors + theme fonts + inheritance
- Tables — borders (7 modes), shading, merged cells, header row, row height, table styles
- Lists — multi-level numbering, contextual spacing, bullet styles
Test discipline
Each fidelity gap fix is pinned by:
- A unit test in docx-editor/packages/core/src/docx/__tests__/*.test.ts that asserts the parse→serialize round-trip preserves the relevant attribute(s).
- Where the gap produces a visible output difference, an e2e spec in docx-editor/e2e/tests/ that opens the fixture and asserts the rendered DOM matches expectations.
Both are required before a fidelity PR lands. The pattern means a regression always trips at least one test in CI: the unit test in every case, plus one of the 800+ e2e tests when the gap is visible.
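As a rough sketch of what the unit-test half of that rule checks, the helper below compares the attributes of one element before and after a round-trip, ignoring attribute order. The names `attributesOf`, `canonical`, and `preservesAttributes` are hypothetical, and the round-tripped strings are written by hand; a real test would obtain them from the project's own parse and serialize functions, whose signatures are not shown here.

```typescript
type Attrs = Record<string, string>;

// Collect the attributes of every <tagName ...> element, in document order.
// Regex-based for brevity; assumes attribute values contain no ">".
function attributesOf(xml: string, tagName: string): Attrs[] {
  const escaped = tagName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const element = new RegExp(`<${escaped}(\\s[^>]*?)?/?>`, "g");
  const found: Attrs[] = [];
  let el: RegExpExecArray | null;
  while ((el = element.exec(xml)) !== null) {
    const attrs: Attrs = {};
    const pair = /([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
    let m: RegExpExecArray | null;
    while ((m = pair.exec(el[1] ?? "")) !== null) {
      attrs[m[1]] = m[2] ?? m[3] ?? "";
    }
    found.push(attrs);
  }
  return found;
}

// Sort keys so that attribute order does not affect the comparison.
function canonical(list: Attrs[]): string {
  return JSON.stringify(
    list.map((attrs) => Object.keys(attrs).sort().map((key) => [key, attrs[key]])),
  );
}

// True when every tagName element carries the same attributes on both sides.
function preservesAttributes(
  originalXml: string,
  roundTrippedXml: string,
  tagName: string,
): boolean {
  return (
    canonical(attributesOf(originalXml, tagName)) ===
    canonical(attributesOf(roundTrippedXml, tagName))
  );
}

// Toy input: cell shading, reordered on save, and with two attributes lost.
const shadedCell = '<w:tcPr><w:shd w:val="clear" w:color="auto" w:fill="FFCC00"/></w:tcPr>';
const shadedCellReordered = '<w:tcPr><w:shd w:fill="FFCC00" w:val="clear" w:color="auto"/></w:tcPr>';
const shadedCellLossy = '<w:tcPr><w:shd w:val="clear"/></w:tcPr>';

console.log(preservesAttributes(shadedCell, shadedCellReordered, "w:shd")); // true
console.log(preservesAttributes(shadedCell, shadedCellLossy, "w:shd")); // false
```

Comparing attribute sets rather than raw strings keeps such a test stable when a serializer legitimately reorders attributes.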
Current status (re-audited 2026-05-25)
44 of 44 fixtures round-trip with zero per-tag drops = 100 % pristine.
Target was ≥ 90 % before the desktop ship — floor cleared.
Five new fixtures added during the recent fidelity sweep
(drawing-fidelity, table-overlap, table-column-resize,
word-compat-closing-border, page-color) all land zero-drop on
their first pass. The previously-deferred VML cluster (~108
dropped tags across medical-incident-form + sds-real-world)
closed earlier in commit 302c210 via raw-XML envelope capture
in the enricher.
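The pristine-rate figure and the ship floor above amount to a simple ratio check, sketched below. `FixtureResult`, `pristineRate`, `clearsFloor`, and `makeResults` are hypothetical names, and the generated fixture names are placeholders rather than real audit output; only the 44-fixture count and the 90 % floor come from the text.

```typescript
// Hypothetical per-fixture summary: how many tags the round-trip dropped.
interface FixtureResult {
  fixture: string;
  droppedTags: number;
}

// The ≥ 90 % target mentioned in the text.
const SHIP_FLOOR = 0.9;

// Fraction of fixtures that round-trip with zero dropped tags.
function pristineRate(results: FixtureResult[]): number {
  if (results.length === 0) return 0;
  const pristine = results.filter((r) => r.droppedTags === 0).length;
  return pristine / results.length;
}

function clearsFloor(results: FixtureResult[], floor: number = SHIP_FLOOR): boolean {
  return pristineRate(results) >= floor;
}

// Placeholder data: `pristine` zero-drop fixtures, the rest with one drop each.
function makeResults(total: number, pristine: number): FixtureResult[] {
  return Array.from({ length: total }, (_, i) => ({
    fixture: `fixture-${i + 1}`,
    droppedTags: i < pristine ? 0 : 1,
  }));
}

console.log(pristineRate(makeResults(44, 44))); // 1
console.log(clearsFloor(makeResults(44, 44))); // true
console.log(clearsFloor(makeResults(44, 39))); // false (39 / 44 is below 0.9)
```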
What this does not claim: it’s not byte-equal (attribute ordering
etc. may differ; tag-count parity is a weaker guarantee than
literal byte equality), and it’s not visual fidelity. The remaining
open gaps in internal/03-gap-matrix.md are about on-screen
rendering, not whether bytes survive a load → save cycle.
The full per-tag history (~2,400 dropped tags eliminated across 16+
commits) is in roundtrip-audit-report.md in the editor repo root.
Synced from docs/ROUNDTRIP.md in schnsrw/docx. To update: edit upstream and re-run npm run sync-docs.