Markdown input/export loss of line breaks

hamishwillee · March 26, 2024, 9:16pm

I have a source markdown vitepress aside like this:

:::tip
This guide contains everything you need to assemble, blah blah.
:::

It gets split into

:::tip
This guide contains everything you need to assemble, blah blah.

:::

And after coming out of crowdin it looks like this:

:::tip
This guide contains everything you need to assemble, blah blah. :::

However, having the ::: on the same line breaks the rendering. Do I have any options other than adding extra blank lines?
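As a stopgap, the exported files could be repaired after download. The sketch below is not a Crowdin feature; it assumes the only damage is a closing ::: pulled onto the end of the previous line, and the function name `split_trailing_fence` is made up for illustration. It does not try to skip fenced code blocks.

```python
import re

# Matches a line that does not itself start with ":::" but ends with a
# trailing ":::" that was pulled up from the following line.
_TRAILING_FENCE = re.compile(r"^(?!:::)(.*\S)[ \t]+:::[ \t]*$", re.MULTILINE)


def split_trailing_fence(markdown_text):
    """Move a trailing ':::' back onto its own line."""
    return _TRAILING_FENCE.sub(r"\1\n:::", markdown_text)
```

Running it twice is harmless: a ::: that already sits on its own line starts with ::: and is left alone.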

Dima · March 26, 2024, 9:49pm

Hello,

Can you share the file sample (the whole file) with me so that I can use it for test purposes?

Also, can you provide some info about file origin, is the file self-created or imported by some system?

The solution may differ depending on the file origin, so it would be great to know your full workflow.

hamishwillee · March 26, 2024, 10:25pm

Hi Dima,

Thank you. The files are here: PX4-user_guide/en at main · PX4/PX4-user_guide · GitHub and you can see the exports from Crowdin in the language-specific folders.

The page has a :::warning near the top that shows this problematic rendering (note it also has a :::tip where I forced a blank line - which I would prefer not to do).

The file is created manually and imported into Crowdin using the github integration.

Tetiana · March 26, 2024, 11:23pm

Hi!

Thanks for the details, checking it on our side

hamishwillee · April 3, 2024, 9:20am

Thanks @Tetiana - any progress? I’m also sometimes getting no breaks on the start - so

::: note 
Fred

is being turned into

::: note Fred

That’s a real problem. I wonder if a splitting rule might help.

hamishwillee · April 11, 2024, 3:36am

I’ve asked about splitting rules here: Markdown segmentation - #3 by hamishwillee

So far they split nicely, but that does not help with the fact that Crowdin is eating the line break and not including it in the output.

@Tetiana Did you have a chance to look at this? Is there a workaround other than changing my docs from

::: note 
Fred
:::

TO

::: note 

Fred

:::

Tetiana · April 11, 2024, 4:11am

Hi!
We have created a task for this on our side, and it is in progress right now.
Once we have any news, we’ll contact you with updates.

Olena · November 18, 2024, 7:28pm

Hi @hamishwillee ,

I hope you are doing well,

Our developers have implemented a fix for how we parse MD files. These changes are included in the latest version of the MD file parser. It will be necessary to re-upload the file to Crowdin.

You can easily migrate all the translations from the old file to the newly uploaded one while preserving translations, approvals, and authority by following these steps:

1. Rename your existing file in Crowdin (e.g., from example.md to example_old.md or any name you prefer).
2. Upload the example.md file again to the project, so you will have both the old and new files in Crowdin.
3. Set the duplicates option to “Hide (strict detection)” to ensure that translations from the old file migrate as hidden duplicates to the new example.md.
4. Once the migration is complete, you can delete example_old.md and continue working with the new example.md file.

Let me know if you need any further assistance

hamishwillee · November 20, 2024, 12:25am

Thanks @Olena. I’ll give it a try. I do dispute the “easily” in “You can easily migrate all the translations”. We have a lot of translated files - maybe thousands.

Natalia · November 20, 2024, 2:05am

Hi @hamishwillee, you can use the API for some of these steps:
https://support.crowdin.com/developer/api/v2/

hamishwillee · November 20, 2024, 4:39am

Thanks - can I just do a sync so that it will find the original version and re-upload?

@Natalia Thanks. Also not “easy” though I’ll give it a go. If I can rename everything and just sync then that might work.

Natalia · November 20, 2024, 4:52am

Dear @hamishwillee , yes, you’re welcome

hamishwillee · November 20, 2024, 6:54am

Thank you. I ran a test with just one file renamed and the duplicates setting changed to “Hide”.
Then I re-ran my GitHub integration sync.

The result is that:

- The file I renamed was not reimported.
- Many, many strings in the translations have been destroyed - they contain gibberish etc.
- Some strings got a new translation, which is nice.

Not sure how to revert yet, but I don’t think this approach is all that robust. I suspect you do have to reupload the clean version of the original for it all to work.

TaniaM · November 20, 2024, 7:24am

Hi @hamishwillee,

Could you please share some more details on what file you have renamed that was not imported as well as some examples of translations that have been destroyed?

To make everything work, first rename the source file that is already in Crowdin. Then upload the new file; because it is uploaded as a duplicate, the translations will migrate from the old file to the new one.

hamishwillee · November 21, 2024, 3:26am

Hi @TaniaM

Sorry, I wasn’t very clear. The problem came from setting “Hide (strict detection)” and not uploading the replacement files.

This matched some strings in a way that left some translations wrong, because of the parts that were not matched. For example, if the delimiter for notes is :::, some of these effectively got added twice by the new string matching.

This is a problem because there are many hundreds of files, and I don’t want to check all of them.

I’m going to try to fix it by running the process with renamed files etc., as originally suggested. The reason I didn’t do that was that it was indicated above that a GitHub sync might do the same job as an upload for a file that had been renamed in Crowdin. It doesn’t.

hamishwillee · November 21, 2024, 4:10am

Another thing broken by this is that in the English markdown I have &check; but this is appearing in the editor and being exported as &check; - which is not the same thing. This has only happened since enabling “Hide (strict detection)” and re-syncing. Is there a setting for preserving HTML entities in markdown?

Tetiana · November 21, 2024, 4:52am

Hi!
If the HTML entity is translated correctly, it should be exported the same as it is in your source strings.
Are you following the steps that my colleague shared with you? It is important to re-upload your sources in order to apply the newest version of the MD parser to your files. So if you’re using the GitHub integration, you need to re-upload the files via the integration.

If you don’t want to change the option for duplicates and rename files, you can run pre-translation via TM after you add new files to the project. All translations are saved in the Translation Memory, so they can be applied automatically to the new content.

Still, we recommend changing the option for duplicates to “Hide” and waiting until all translations are migrated.

hamishwillee · November 21, 2024, 10:34pm

Hi @Tetiana

Firstly, thank you to all of you in this thread for your help. I’m sure this is mostly on me.

If the HTML entity is translated correctly, it should be exported the same as it is in your source strings.

In source I have (literally) &check; which normally renders as ✓. This is appearing in crowdin as &check; for translation and also in the output. The field has no translation, so the evidence is you are not correct.

Are you following the steps that my colleague shared with you? It is important to re-upload your sources in order to apply the newest version of the MD parser to your files.

I am now. I am renaming files to old_<original name>, uploading the source file, then deleting the old_<original name>. This appears to be working - though I won’t know until I’ve finished.

So if you’re using the GitHub integration, it is required to re-upload files via the integration

Doesn’t work. After renaming the files and re-running the integration/sync the file is not re-uploaded - tested this on two files and syncs. I suspect the integration works on storage ids, so it knows that old_<original name> is actually the same file and doesn’t bother uploading.

hamishwillee · November 21, 2024, 10:34pm

I am trying to automate this file rename but the API gives me errors. If you can advise that would be great. Specifically I am trying to edit the file name using a patch as per Crowdin API Reference (File-based)

My Python code is like this:

# Function to rename a file
def rename_file(file_id, new_name):
    url = f"{BASE_URL}projects/{PROJECT_ID}/files/{file_id}"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    data = {
      "op": "replace",
      "path": "/name",
      "value": new_name
    }
    response = requests.patch(url, headers=headers, json=data)
    print(f"Rename: {file_id} to {new_name}: {response.status_code}")
    pprint.pprint(response.json())
    return response.json()

# Rename a file
rename_file(3754, 'old_index.md')

But if I do a patch I get the error

400
{'errors': [{'error': {'errors': [{'code': 'jsonPatchInvalid',
                                   'message': 'Invalid operation given'}],
                       'key': 'op'}}, ...

And if I change the patch to a put I get the error

400
{'errors': [{'error': {'errors': [{'code': 'unexpected',
                                   'message': "Field 'op' is unexpected"}],
                       'key': 'op'}},
            {'error': {'errors': [{'code': 'unexpected',
                                   'message': "Field 'path' is unexpected"}],
                       'key': 'path'}},
            {'error': {'errors': [{'code': 'unexpected',
                                   'message': "Field 'value' is unexpected"}],
                       'key': 'value'}}]}

The format of the operation with op seems to match the docs, so I’m confused.
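One likely cause, offered as a sketch rather than a confirmed diagnosis: JSON Patch request bodies are a list of operation objects, while the function above sends a single object. The version below wraps the operation in a list. It uses the standard library's urllib rather than requests so that it has no third-party dependency; `build_rename_request` and its parameters are illustrative names, and `base_url` stands for whatever BASE_URL holds in the original script.

```python
import json
import urllib.request


def build_rename_request(base_url, project_id, file_id, new_name, api_key):
    """Build (but do not send) a PATCH request that renames a file."""
    # JSON Patch bodies are a list of operations, even for a single change.
    patch = [{"op": "replace", "path": "/name", "value": new_name}]
    return urllib.request.Request(
        url=f"{base_url}projects/{project_id}/files/{file_id}",
        data=json.dumps(patch).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


def rename_file(base_url, project_id, file_id, new_name, api_key):
    """Send the rename request and return the decoded JSON response."""
    request = build_rename_request(base_url, project_id, file_id, new_name, api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With requests, the equivalent change to the original function would be passing json=[data] instead of json=data.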

Dima · November 22, 2024, 3:46am

Hello @hamishwillee

Correct. If you’re using the same branch and just renaming files, the integration will update the old file with a new version (maybe a couple of new strings, a new file name, etc.). The file needs to be re-uploaded, i.e., either upload a copy of the file or delete the old file and upload it once more.

If it’s a GitHub integration, I’d suggest just copying all files into a new branch and editing the integration (i.e. connecting the new branch, so you have 2 in total, old and new). The new files will be uploaded with the new branch, and the old ones can be safely deleted.

Please activate Duplicates before that, so translation (when possible) can migrate from old to new strings.

In source I have (literally) &check; which normally renders as ✓. This is appearing in crowdin as &check; for translation and also in the output. The field has no translation, so the evidence is you are not correct.

During the export we rebuild the file, so technically this can happen. I assume &check is not the only way to “show” a check mark (at least, depending on the tool/system), and this can be solved with a few lines of code (this app can replace content after file export but before translation download).
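The kind of post-export replacement described here might look like the sketch below. The thread does not show the exported form of the entity distinctly, so the `&amp;check;` key in the mapping is purely an assumption for illustration, as are the names `restore_entities` and `REPLACEMENTS`; swap in whatever actually appears in the exported files.

```python
# Hypothetical mapping from the exported form to the form used in the source.
# "&amp;check;" is an assumed example, not something confirmed in the thread.
REPLACEMENTS = {"&amp;check;": "&check;"}


def restore_entities(text, replacements):
    """Return text with each exported form swapped back to its source form."""
    for exported, source in replacements.items():
        text = text.replace(exported, source)
    return text
```

Text that does not contain any of the mapped forms passes through unchanged, so the function can be applied to every exported file.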