Proposal: Script for detecting structural diffs in translated docs #1216

Open
toririm opened this issue Sep 25, 2024 · 2 comments

@toririm
Contributor

toririm commented Sep 25, 2024

Description

The biomejs/website organizes documentation in language-specific folders.
However, there is no mechanism to detect changes in the original English documentation that are not reflected in translations.

I propose creating a script to compare document structures across languages to maintain consistency.

Proof of Concept

I created a PoC script that parses MDX files into Abstract Syntax Trees (ASTs):

import fs from 'node:fs';
import { unified } from 'unified';
import remarkMdx from 'remark-mdx';
import remarkParse from 'remark-parse';
import { argv } from './utils/argv.js';

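// Build a parser that understands Markdown plus MDX syntax.
const processor = unified().use(remarkParse).use(remarkMdx);
// Read the MDX file passed via --path.
const value = fs.readFileSync(argv('path'), 'utf-8');
// parse() returns the syntax tree (mdast), not a file object.
const tree = processor.parse(value);

// Dump the whole tree as JSON so it can be inspected or grepped.
console.log(JSON.stringify(tree, null, 2));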

For example:

$ node ./scripts/check-paragraph.js --path="./src/content/docs/guides/integrate-in-vcs.mdx" | grep heading
      "type": "heading",
      "type": "heading",
      "type": "heading",
      "type": "heading",

$ node ./scripts/check-paragraph.js --path="./src/content/docs/ja/guides/integrate-in-vcs.mdx" | grep heading
      "type": "heading",
      "type": "heading",
      "type": "heading",

The Japanese version has one heading fewer: it is missing the "Process only staged files" section.
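
The same comparison can be made on the tree itself rather than through grep. The following is only a minimal sketch: `headingDepths` and `compareOutlines` are hypothetical names, and the two trees are hand-written stand-ins (with made-up heading depths) for what `processor.parse` would return. Heading text is ignored because it differs between languages; only the sequence of heading depths is compared.

```javascript
// Collect the depth of every heading in an mdast-style tree, in document order.
function headingDepths(node, out = []) {
  if (node.type === 'heading') out.push(node.depth);
  for (const child of node.children ?? []) headingDepths(child, out);
  return out;
}

// Compare two outlines by heading depth only (heading text is translated).
// Returns the index of the first heading that differs, or -1 if they match.
function compareOutlines(sourceTree, translatedTree) {
  const source = headingDepths(sourceTree);
  const translated = headingDepths(translatedTree);
  const length = Math.max(source.length, translated.length);
  for (let i = 0; i < length; i++) {
    if (source[i] !== translated[i]) {
      return { match: false, firstDifference: i, source, translated };
    }
  }
  return { match: true, firstDifference: -1, source, translated };
}

// Hypothetical stand-ins for parsed trees: four headings vs. three.
const en = {
  type: 'root',
  children: [
    { type: 'heading', depth: 2, children: [] },
    { type: 'heading', depth: 3, children: [] },
    { type: 'heading', depth: 3, children: [] },
    { type: 'heading', depth: 3, children: [] },
  ],
};
const ja = {
  type: 'root',
  children: [
    { type: 'heading', depth: 2, children: [] },
    { type: 'heading', depth: 3, children: [] },
    { type: 'heading', depth: 3, children: [] },
  ],
};

const result = compareOutlines(en, ja);
console.log(result.match, result.firstDifference); // prints "false 3"
```

Comparing depths alone will not catch a section that was replaced by another at the same level, so a real script would probably also look at code blocks, links, or component names.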

Proposal

I propose creating a script that checks for structural consistency across languages.
This could also be integrated into the CI pipeline.
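
To run such a check over every page, the script would first need to pair each English file with its translation. A rough sketch of that mapping, assuming translations mirror the English layout under a language folder as in the two paths used above; `DOCS_ROOT` and `translatedPath` are hypothetical names, not existing helpers in the repository:

```javascript
// Docs root, taken from the paths in the example commands above.
const DOCS_ROOT = 'src/content/docs/';

// Map an English doc path to its counterpart in a language folder.
// Assumes translations live at <root>/<lang>/<same relative path>.
function translatedPath(englishPath, lang) {
  const normalized = englishPath.replace(/^\.\//, '');
  if (!normalized.startsWith(DOCS_ROOT)) {
    throw new Error(`Not under ${DOCS_ROOT}: ${englishPath}`);
  }
  return DOCS_ROOT + lang + '/' + normalized.slice(DOCS_ROOT.length);
}

console.log(translatedPath('./src/content/docs/guides/integrate-in-vcs.mdx', 'ja'));
// prints "src/content/docs/ja/guides/integrate-in-vcs.mdx"
```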

Next Steps

If this proposal is accepted, I am happy to implement this script.

@dyc3
Contributor

dyc3 commented Sep 25, 2024

I think this is reasonable. However, I don't think we should block PRs from merging if the script detects differences.

It would be nice to have issues filed automatically for diffs detected on main. That could get complicated, though, so let's stick to a script that can be run manually for now.

@ematipico
Member

Astro is working on this system. It's called Lunaria and they use it in their docs. Once the solution is ready, we can use it on our website.

However, we don't have people who consistently work on translations, so I don't see enough value in enforcing or reporting those differences if no one is going to do the work.
