GitHub REST API SHA for files containing special characters

2023-01-12

Question

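This code works, but areReallyEqual should not be necessary. In fact, it is not needed for files that contain no special characters ("ñ", "é", "à", etc.), which means it is not needed for any of my files except 2 of them.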

if (internalFileTextSha !== repositoryFile.sha) {
  // The sha might not be equal due to special characters in the text ("ñ", "é", "à", etc)... doing a second check...
  const remoteFileText = this._gitHubApi.fetchGitHubGetUrl(repositoryFile.download_url).getContentText(); 
  const remoteFileTextSha = GitHubCrypto.getSha(remoteFileText);
  const areReallyEqual = remoteFileTextSha === internalFileTextSha;
  if(!areReallyEqual) {
    // The file has been modified, creating a new commit with the updated file...
    this._gitHubApi.commitUpdatedFile(repositoryName, repositoryFile, internalFile.source);
  } else {
    // The second check determined that both files were equal
  }         
}

More details

internalFile comes from here:

  /**
  * @typedef {Object} InternalFile
  * @property {string} id - The ID of the internal file.
  * @property {string} name - The name of the internal file.
  * @property {string} type - The type of the internal file.
  * @property {string} source - The source code of the internal file.
  */

  /**
   * Gets all the internal files of a Google Apps Script file.
   *
   * @returns {InternalFile[]} An array of objects.
   */
  static getScriptInternalFiles(file) {
    // Check that the file is a Google Apps Script file
    if (file.getMimeType() == 'application/vnd.google-apps.script') {
      // Get the script content as a string
      const fileId = file.getId();
      const params = {
        headers: { 
          Authorization: 'Bearer ' + ScriptApp.getOAuthToken(),
          'Accept-Charset': 'utf-8'
        },
        followRedirects: true,
        muteHttpExceptions: true,
      };
      const url =
        'https://script.google.com/feeds/download/export?id='
        + fileId
        + '&format=json';
      const response = UrlFetchApp.fetch(url, params);
      const json = JSON.parse(response);
      return json.files;  
    } else {
      throw new Error("The file is not a Google Apps Script file.");
    }
  }

...and remoteFile comes from here:

  /**
  * @typedef {Object} RepositoryFile
  * @property {string} name - The name of the file.
  * @property {string} path - The file path in the repository.
  * @property {string} sha - The SHA hash of the file.
  * @property {Number} size - The size of the file in bytes.
  * @property {string} url - The URL of the file's contents.
  * @property {string} html_url - The URL to the file's page on the repository website.
  * @property {string} git_url - The URL to the file's git contents.
  * @property {string} download_url - The URL to download the file.
  * @property {string} type - The type of the file, usually "file".
  */

  /**
   * Lists the contents of the root directory of a GitHub repository.
   *
   * @returns {RepositoryFile[]} An array of objects.
   */
  listFilesInRepository(repositoryName) {
    let repositoryFiles = [];
    try {
      const options = {
        headers: {
          ...this._authorizationHeader,
        },
      };      
      const response = UrlFetchApp.fetch(`${this._buildRepositoryUrl(repositoryName)}/contents`, options);
      repositoryFiles = JSON.parse(response);
    } catch(e) {
      const errorMessage = GitHubApi._getErrorMessage(e);
      if(errorMessage === 'This repository is empty.') {
        // do nothing
      } else {
        // unknown error
        throw e;
      }
    }
    return repositoryFiles;
  }

...and the SHA calculation:

class GitHubCrypto {
  /**
   * @param {string} fileContent
   * @returns {string} SHA1 hash string
   */
  static getSha(fileContent) {
    // GitHub is computing the sum of `blob <length>\x00<contents>`, where `length` is the length in bytes of the content string and `\x00` is a single null byte.
    // For the Sha1 implementation, see: www.movable-type.co.uk/scripts/sha1.html
    const sha = Sha1.hash('blob ' + fileContent.length + '\x00' + fileContent);
    return sha;
  }
}

2 Answers

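I believe your goal is as follows.

  • You want to use Google Apps Script to obtain the hash value 430f370909821443112064cd149a4bebd271bfc4 from the value ñèàü.

In that case, how about the following sample script?

Sample script: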

function myFunction() {
  const fileContent = 'ñèàü'; // This is from your comment.

  const value = 'blob ' + fileContent.length + '\x00' + fileContent;
  const bytes = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, value, Utilities.Charset.UTF_8);
  const res = bytes.map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2)).join('');
  console.log(res); // 430f370909821443112064cd149a4bebd271bfc4
}

  • When this script is run, the hash value 430f370909821443112064cd149a4bebd271bfc4 is obtained.
  • When aaaa is hashed directly, as in const res = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, "aaaa", Utilities.Charset.UTF_8).map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2)).join('') , the result is 70c881d4a26984ddce795f6f71817c9cf4480e79.
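
The hex conversion in the sample script is needed because Utilities.computeDigest returns signed bytes (-128 to 127), so each byte is masked with 0xFF before being formatted. A minimal sketch of just that step, in plain JavaScript; the helper name signedBytesToHex is hypothetical and is not part of any Apps Script API:

```javascript
// Hypothetical helper: converts an array of signed bytes (-128..127),
// such as a digest returned as signed bytes, into a lowercase hex string.
function signedBytesToHex(bytes) {
  return bytes
    // `& 0xff` maps negative values into 0..255 before formatting.
    .map(byte => (byte & 0xff).toString(16).padStart(2, '0'))
    .join('');
}

console.log(signedBytesToHex([-1, 0, 15, -128, 127])); // ff000f807f
```

Without the mask, a negative byte such as -1 would be formatted as "-1" instead of "ff".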

Tanaike
2023-01-12

I finally found the problem... it was the "length in bytes":

GitHub computes the SHA-1 sum of blob <length>\x00<contents>, where length is the length in bytes of the content string and \x00 is a single null byte.
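
As an illustration of that rule outside Apps Script, here is a minimal Node.js sketch. It assumes Node's built-in crypto module and Buffer, neither of which exists in Apps Script, and the function name gitBlobSha is hypothetical. The header uses the length of the encoded bytes, not the string's length property:

```javascript
const crypto = require('crypto');

// Hypothetical helper: Git blob hash of a text, treating it as UTF-8.
// SHA-1 is taken over "blob <byte length>\x00" followed by the content bytes.
function gitBlobSha(text) {
  const content = Buffer.from(text, 'utf8');
  const header = Buffer.from('blob ' + content.length + '\x00', 'utf8');
  return crypto.createHash('sha1').update(header).update(content).digest('hex');
}

const sample = '\u00f1\u00e8\u00e0\u00fc'; // 'ñèàü'
console.log(sample.length);                      // 4 UTF-16 code units
console.log(Buffer.byteLength(sample, 'utf8'));  // 8 UTF-8 bytes
console.log(gitBlobSha(''));                     // e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's empty blob)
```

In Apps Script, a comparable approach would probably obtain the bytes with Utilities.newBlob(text).getBytes() and pass a byte array to Utilities.computeDigest; that variant is untested here.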

When there are no special characters, lengthInBytes == inputStr.length, but when there are special characters this no longer holds:

// For the Sha1 implementation, see: www.movable-type.co.uk/scripts/sha1.html

function simpleShaTest1() {
  const fileContent = 'ñèàü';
  const expectedSha = '56c9357fcf2589619880e1978deb8365454ece11';
  const sha = Sha1.hash('blob ' + lengthInUtf8Bytes(fileContent) + '\x00' + fileContent);
  const areEqual = (sha === expectedSha); // true
}

function simpleShaTest2() {
  const fileContent = 'aaa';
  const expectedSha = '7c4a013e52c76442ab80ee5572399a30373600a2';
  const sha = Sha1.hash('blob ' + lengthInUtf8Bytes(fileContent) + '\x00' + fileContent);
  const areEqual = (sha === expectedSha); // true
}

function lengthInUtf8Bytes(str) {
  // Matches only the 10.. bytes that are non-initial characters in a multi-byte sequence.
  var m = encodeURIComponent(str).match(/%[89ABab]/g);
  // `m` is `null` when there are no multi-byte characters, hence the `m ? m.length : 0` guard.
  return str.length + (m ? m.length : 0);
}
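
One caveat about the regex-based helper: for characters outside the Basic Multilingual Plane (most emoji, for example), str.length counts two UTF-16 code units while UTF-8 uses four bytes, so the helper reports 5 instead of 4. The sketch below is an alternative under that assumption, not the method used above; the name utf8ByteLength is hypothetical. It sums the UTF-8 width of each code point and runs in plain JavaScript:

```javascript
// Regex-based helper from the answer, repeated so the sketch is self-contained.
function lengthInUtf8Bytes(str) {
  const m = encodeURIComponent(str).match(/%[89ABab]/g);
  return str.length + (m ? m.length : 0);
}

// Hypothetical alternative: add up the UTF-8 width of every code point.
function utf8ByteLength(str) {
  let total = 0;
  for (const ch of str) { // for...of iterates by code point, not by UTF-16 unit
    const codePoint = ch.codePointAt(0);
    if (codePoint <= 0x7f) total += 1;
    else if (codePoint <= 0x7ff) total += 2;
    else if (codePoint <= 0xffff) total += 3;
    else total += 4;
  }
  return total;
}

const accented = '\u00f1\u00e8\u00e0\u00fc'; // 'ñèàü'
console.log(lengthInUtf8Bytes(accented), utf8ByteLength(accented)); // 8 8
const emoji = '\u{1F600}';
console.log(lengthInUtf8Bytes(emoji), utf8ByteLength(emoji));       // 5 4
```

For text limited to accented Latin letters, as in the question, both functions agree, so the original helper is sufficient there.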
Xavier Peña
2023-01-12