Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] Ensure i18n up to date #397

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

sneakers-the-rat
Copy link
Contributor

@sneakers-the-rat sneakers-the-rat commented Aug 26, 2024

This PR adds a CI action to ensure that the generated translation .po files reflect any changes to the source text.

It also includes a run of the update-translations nox session which both pulled in changes from the english source documents as well as did some cosmetic reformatting

The action just runs the update-translations session and checks if any files were changed. When there are no changes to the source documents, the files remain unchanged, so those generation dates shouldn't be a problem. It seems like that update-translations command also generally acts like a linter, breaking the translated strings across lines and putting them in a standard format, so this action enforces that. If we wanted a cleverer solution to this, i am sure there is some way of declaring the actual contents of .po files rather than this naive approach of just looking for any changes, but it should be alright for now :)

Requesting review from the folks that have been doing the translation to make sure i didn't break anything here!
edit: i guess i don't have permissions for this repo to add requested reviewers or labels, but anyway ya @tkoyama010 and @flpm

edit2: oh and PS i also mildly updated the main build action to use the setup-python pip cache instead of the other cache (which was incorrectly caching against a nonexistent requirements.txt file), but that should be function-neutral

@flpm flpm requested review from flpm and tkoyama010 August 26, 2024 19:26
@flpm
Copy link
Member

flpm commented Aug 26, 2024

I added us as reviewers for you :)

On one side it's nice to not risk anyone starting work on outdated files, but there is one edge case I worry about.

If there are lots of changes to the original text and the PO files change a lot, it may cause annoying conflicts when merging current open translation PRs, because the translation-update merge (sphinx-intl) is PO-format aware and the github one is not, so it would be easier to merge the PR into an outdated PO file before updating it.

It can be sorted out by updating the PR branch and then updating the PO files locally, but that puts more pressure on the contributor. If they are more on the beginner side, it would require help from a maintainer to sort it out.

@sneakers-the-rat
Copy link
Contributor Author

I don't think I quite get what you mean! If one is working on a translation PR, then the update shouldn't do anything except reformat strings to be below the max line length, right? i guess it's something to be tested, but the remedy in that case would just be to run nox -s update-translations and commit the result.

Maybe it's my sleep-deprived brain being bad at reading atm, could you help walk me through an example that would be challenging to contributors and i can see how i can make that easier?

@flpm
Copy link
Member

flpm commented Aug 26, 2024

My comment was also a bit confusing 😆 sorry.

The main goal of the update translation is to keep the PO file in sync with the original text without discarding current translations as much as possible. It will extract the new strings from the original and put them in the PO file but it's smart enough to detect smaller changes (eg typo was fixed or small rephrasing inside a larger paragraph) are still the "samish" string and that it should keep the existing translations. When this happens it marks the entries in the PO file as fuzzy so a translator knows that they have to check the older translation still applies and make any changes as needed.

@sneakers-the-rat
Copy link
Contributor Author

sneakers-the-rat commented Aug 26, 2024

Yes! That much I gathered, im curious what your thinking this action would complicate so I can make it not do that - really all its doing is making sure that if the English source changes that the nox -s update-translations command was run afterwards. It doesnt change the code or anything

@tkoyama010
Copy link
Member

Using translation memory can greatly reduce the effort required to perform translation that keeps up with changes. It might be a good idea to add an explanation of translation memory to the translation section in a separate PR.

@flpm
Copy link
Member

flpm commented Aug 26, 2024

So I was thinking of the case when translation PRs are open pending review for a while (like right now, my fault!) and then have to be merged against PO files that where updated since they where branched by the CI of other PRs (for example a PR unrelated to translation)

On the other hand, if the lead of a translation decides when the PO files update, they can account for any pending translation PRs and changes to the guide and choose the best time to update the PO files to minimize merging conflicts.

@flpm
Copy link
Member

flpm commented Aug 26, 2024

Using translation memory can greatly reduce the effort required to perform translation that keeps up with changes. It might be a good idea to add an explanation of translation memory to the translation section in a separate PR.

We have issue #341 to work on a translation guide and memory of translation.

But my concern here is more on the race condition between an old PR and the original guide evolving.

@tkoyama010
Copy link
Member

tkoyama010 commented Aug 27, 2024

Yes, I agree about the conflict issue. The Japanese translation is currently being translated for the first time, so it will be easier to understand this update after that is completed. However, I did not want to hinder the momentum of this change. My intention was to let know that I can also handle if we merge this first.

@sneakers-the-rat
Copy link
Contributor Author

I dont have strong feelings about this PR, so if its no good it's no problem to close it. I hear the concerns re racing translation, to some degree thats true of all PRs, but if it is a barrier to people not used to git things then yes thats a problem.

The reason for having it ( I think we all get this, but for the sake of saying aloud) is to prevent a situation where the English text wanders off for awhile and the translations become slowly and invisibly out of date. The CI action forces the versions to stay in sync, and makes the labor incurred by frequent changes to the English on translated copies more visible. Ideally I would hope that we can accommodate translations in smaller pieces, rather than asking someone to sit down and translate the entire guide in one PR, so the race is less of a problem, but yes it is always there.

So im happy to do whatever with this - if it's something we just want to run as a cron task that generates PRs rather than a CI test that blocks merges, or drop it altogether, fine by me.

@tkoyama010
Copy link
Member

This PR will definitely be needed in the future. Therefore, there is no problem with merging if the merge timing is scheduled correctly. I will comment here when we get to the stage of merging this. Thanks for the great contribution!

@flpm
Copy link
Member

flpm commented Aug 27, 2024

Yeah, I absolutely agree! We should not abandoned this PR, sorry if my comments felt that way.

I think it is more of an issue of the lifecycle of the language translation.

Here is an idea: we leave this PR as is, but modify the update-translations session to only act on released languages. That way we avoid the conflicts during the early stages where they are more likely to happen but get the full benefit of the updates later on.

The discussion also made me think that maybe there should be a way (nox session) to update languages individually, because the same could happen outside CI. Maybe something like update-language, that you can call with a language code as a parameter. That could be used until the languages are released when the CI would start keeping it up to date.

Whatever the final solution we make for timing, we should account for future new languages. Hopefully we will expand the list to cover many!

@sneakers-the-rat
Copy link
Contributor Author

modify the update-translations session to only act on released languages.

I like this, lets do this :)

@flpm
Copy link
Member

flpm commented Aug 27, 2024

I will make a quick PR for that then!

@sneakers-the-rat
Copy link
Contributor Author

Invited you to my fork if that would be easier, or if you want to make a new PR off this we can close this one. Either way ♥

@flpm
Copy link
Member

flpm commented Aug 27, 2024

Oops, missed that, made a separate PR #398

@sneakers-the-rat
Copy link
Contributor Author

no worries, once that goes in i can update this

@sneakers-the-rat
Copy link
Contributor Author

OK since #398 got merged, i think this was the next step on my todos?

So previously we were talking about how there might be some race between changes in the english text and a translation in progress, which would be a pain, if CI forces you to update the .po file frequently.

@flpm added a mode to just enforce the release languages staying up to date, which i think is a good step, to allow translations-in-progress to be unbothered.

One problem i think might be that say a bunch of work is done translating for a new language and it comes time to enable that language as a release language - the PR that flips that switch might be then confronted with a bunch of out of date .po files, so then things might need to be translated twice, where someone was working on .po files that were out of date with the english source.

What would be good here? my initial default was just to ensure that all .po files are always up to date with the english source, but that might be too much of a pain. On the other side, only enforcing the .po files being up to date at the end might allow too much drift to happen so work gets done twice. You tell me what would be good and i'll do it!


Separately, there are a number of changes to the .po files in this PR from when it was enabled for all languages, i am assuming that i should revert those before we merge, but i am leaving them in there for now as an example of the kind of changes that this CI action would ask for before a merge.

@flpm
Copy link
Member

flpm commented Oct 1, 2024

I think while the language is not released, it's the responsibility of the language coordinator to keep it updated (choosing the best moments to do so), including making sure it is up to date at the moment of the release.

I don't we should have issues that would cause the need to translate twice. Unless there was a big reorganization of the original text, but in that case it would require special attention from the language team anyway.

It's more likely that there would be lots of small changes that would mark some entries as fuzzy but that can be quickly resolved in a subsequent PR.

@flpm
Copy link
Member

flpm commented Oct 1, 2024

Yeah, I think you can revert them. After I merge the last 2 PRs pending review I will plan to update the Spanish ones.

This reverts commit 0463f00.
@sneakers-the-rat
Copy link
Contributor Author

Alright, reverted :)

I think while the language is not released, it's the responsibility of the language coordinator to keep it updated (choosing the best moments to do so), including making sure it is up to date at the moment of the release.

Great! works for me. so this will apply only when someone updates the english source (they will have to update any release languages) or when a translation is flipped to release, which seems right.

#394 has the single-sourcing of the release languages, build, live-preview, and selector for multi-languages, so then I think that's the last thing (aside from all the work of finishing the translations, which i wish i could help with) before it's possible to deploy them once they're finished?

@willingc willingc added the translations Translation and localization of content label Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance translations Translation and localization of content
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants