Slow GitHub synchronisation: request to use --depth 1

paloeka · September 6, 2023, 12:24pm

In large GitHub repositories the synchronisation becomes much slower.

It seems that Crowdin’s GitHub integration simply git clones the entire repository.

As most devs who have read up on Git know, a git clone can minimize its download size with the --depth 1 flag, which fetches only the latest commit instead of the full history.
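As a rough illustration of the --depth 1 point, here is a minimal sketch that builds a throwaway local repository and compares a full clone with a shallow one. The temporary directory, the three demo commits, and the demo user name and email are assumptions made up for this example; they have nothing to do with Crowdin or any real project.

```shell
#!/bin/sh
# Sketch: compare a full clone with a shallow (--depth 1) clone.
set -e
work=$(mktemp -d)

# Build a small source repository with three commits
# (a stand-in for a large remote repository).
git init -q "$work/origin"
for i in 1 2 3; do
  echo "$i" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# Full clone: downloads the whole history.
git clone -q "file://$work/origin" "$work/full"

# Shallow clone: downloads only the latest commit.
# A file:// URL is used because Git ignores --depth for plain local paths.
git clone -q --depth 1 "file://$work/origin" "$work/shallow"

full_count=$(git -C "$work/full" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "full clone commits:    $full_count"
echo "shallow clone commits: $shallow_count"
```

On a repository with a long history, the shallow clone skips every commit except the newest one, which is where the bandwidth saving would come from.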

I tried asking support, but they just accept it as a fact, not an issue.

IMO the current integration wastes resources (server, time, bandwidth), which negatively impacts both Crowdin and its projects.

Possible counter-arguments

The argument could be made that Crowdin needs to compare the current state with the previous state: but even then, Git allows fetching individual commits.
Putting developer time into this is a waste for such a small edge case: not only would this lower server costs, users would also appreciate a much faster response to git pushes, improving otherwise unseen quality-of-life metrics.
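To make the "individual commits" point concrete, here is a hedged sketch that fetches one specific commit, without its history, into an empty repository. It runs entirely against a throwaway local repository. The uploadpack.allowReachableSHA1InWant setting is enabled on that demo remote as an assumption, since whether a host lets clients request arbitrary commits by SHA depends on its server configuration.

```shell
#!/bin/sh
# Sketch: fetch a single commit by SHA with no history.
set -e
work=$(mktemp -d)

# Throwaway "remote" with three commits.
git init -q "$work/origin"
for i in 1 2 3; do
  echo "$i" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# Assumption for this demo: allow clients to request any reachable commit by SHA.
git -C "$work/origin" config uploadpack.allowReachableSHA1InWant true

# The commit we want: the second one, not the branch tip.
wanted_sha=$(git -C "$work/origin" rev-parse HEAD~1)

# Empty client repository that fetches only that commit.
git init -q "$work/client"
git -C "$work/client" remote add origin "file://$work/origin"
git -C "$work/client" fetch -q --depth 1 origin "$wanted_sha"

fetched_sha=$(git -C "$work/client" rev-parse FETCH_HEAD)
fetched_count=$(git -C "$work/client" rev-list --count FETCH_HEAD)
fetched_file=$(git -C "$work/client" show FETCH_HEAD:file.txt)
echo "fetched $fetched_sha ($fetched_count commit in local history)"
```

Comparing two states this way would take two such fetches (the previous commit and the current one) rather than a full clone.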

Olena · September 6, 2023, 1:06pm

Hi paloeka,

It seems there was a misunderstanding.

The integration uses the GitHub API to work with commits; it does not clone the repo.

If you have a large project, then it could be that the export of the translations takes the most time.

You can use our GitHub Action instead: crowdin/github-action (a GitHub action to manage and synchronize localization resources with your Crowdin project).

name: Crowdin Action

on:
  push:
    branches: [ main ]

jobs:
  synchronize-with-crowdin:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: crowdin action
        uses: crowdin/github-action@v1
        with:
          upload_sources: true
          upload_translations: true
          download_translations: true
          localization_branch_name: l10n_crowdin_translations
          create_pull_request: true
          pull_request_title: 'New Crowdin Translations'
          pull_request_body: 'New Crowdin translations by [Crowdin GH Action](https://github.com/crowdin/github-action)'
          pull_request_base_branch_name: 'main'
        env:
          GITHUB_TOKEN: ${{ secrets.GH_TOKEN }}
          CROWDIN_PROJECT_ID: ${{ secrets.CROWDIN_PROJECT_ID }}
          CROWDIN_PERSONAL_TOKEN: ${{ secrets.CROWDIN_PERSONAL_TOKEN }}

uses: actions/checkout@v3 just does --depth 1 under the hood, so only the latest commit is fetched.

Hope this will shed light on the matter!

paloeka · September 6, 2023, 8:25pm

Thanks, I’ll test out using GitHub Actions.

If you could send this along to your devs:

As you said that Crowdin’s “GitHub integration” uses the GitHub API (not git) to retrieve the newest translation file, I tried it out myself and did not see the same slowdowns as with Crowdin.

The only slowdown I perceived was in getting to a file multiple directories down, but that does not depend on the repository’s total size.

To get the branch’s (e.g. main) latest commit: docs.github.com/en/rest/git/refs?apiVersion=2022-11-28#get-a-reference
From there you get the root tree from that commit: docs.github.com/en/rest/git/commits?apiVersion=2022-11-28#get-a-commit-object
Follow the tree down to the file you want to download: Git trees - GitHub Docs
From the tree you can specifically download files: Git blobs - GitHub Docs

example.sh

#!/bin/sh

# Retrieve the latest commit
################################################################

OWNER="the-clothing-loop"
REPO="website"
REF="heads/main"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/ref/$REF

# Get the root tree from that commit
################################################################

COMMIT_SHA="10ea0f1b54477d863a37c2f26f2a491787891858"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/commits/$COMMIT_SHA

# Go down the tree
################################################################

# /
TREE_SHA="ae93d5b2372fcf748e55c2d5b92ba45871c3dbeb"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/trees/$TREE_SHA

# /frontend
TREE_SHA="780c350b08fec33e8203b657880ac70756e78190"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/trees/$TREE_SHA

# /frontend/public
TREE_SHA="5cfb5635e0cc5d166b981d0ad85097031968096a"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/trees/$TREE_SHA

# /frontend/public/locales
TREE_SHA="7c241904b014cb5990300df49e8d14ffdee9173e"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/trees/$TREE_SHA

# /frontend/public/locales/en
TREE_SHA="c112e94b4c84edd987ddad55474cd3b69fd120bf"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/trees/$TREE_SHA

# Download the file
################################################################

# /frontend/public/locales/en/translation.json
BLOB_SHA="3ac5e822ec6ba84e80dc9d5a961a15e96085fbe5"

gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/$OWNER/$REPO/git/blobs/$BLOB_SHA
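
The script above pastes each SHA in by hand. As a hedged sketch, the same walk could be chained so that every request feeds the next one, using the --jq option of gh api to pick the SHA out of each response. The helper names api and resolve_blob are made up for this example, and the jq filters assume the response fields object.sha, tree.sha and tree[].path/sha of the ref, commit and tree endpoints.

```shell
#!/bin/sh
# Sketch: resolve the blob SHA of a file by walking Git trees via the GitHub REST API.
# Assumes the GitHub CLI (gh) is installed and authenticated.

# Hypothetical helper: wraps gh api so the headers are written only once.
api() {
  gh api \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "$@"
}

# Hypothetical helper: resolve_blob OWNER REPO REF FILE_PATH
# Prints the blob SHA of FILE_PATH as of the latest commit on REF.
resolve_blob() {
  owner=$1 repo=$2 ref=$3 file_path=$4

  # 1. Branch reference -> latest commit SHA
  commit_sha=$(api "/repos/$owner/$repo/git/ref/$ref" --jq '.object.sha') || return 1

  # 2. Commit -> root tree SHA
  sha=$(api "/repos/$owner/$repo/git/commits/$commit_sha" --jq '.tree.sha') || return 1

  # 3. Walk down the path one segment at a time; each tree lists its
  #    direct children, so one request is needed per directory level.
  rest=$file_path
  while [ -n "$rest" ]; do
    segment=${rest%%/*}
    case $rest in
      */*) rest=${rest#*/} ;;
      *)   rest= ;;
    esac
    # Note: the segment is interpolated into the jq filter, so this sketch
    # only handles names without double quotes or backslashes.
    sha=$(api "/repos/$owner/$repo/git/trees/$sha" \
      --jq ".tree[] | select(.path == \"$segment\") | .sha") || return 1
    [ -n "$sha" ] || { echo "not found: $segment" >&2; return 1; }
  done

  # After the last segment, sha is the SHA of the file's blob.
  printf '%s\n' "$sha"
}
```

For example, resolve_blob the-clothing-loop website heads/main frontend/public/locales/en/translation.json would print a blob SHA that can be passed to the git/blobs request shown at the end of the script above. The number of requests grows with the depth of the path, not with the size of the repository.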

Olena · September 6, 2023, 8:49pm

Hi paloeka,

Thank you! I will pass it on to our developers as a suggestion.

Best,