
zipfile.BadZipFile: Bad CRC-32 when extracting a password-protected .zip, and the .zip gets corrupted on extraction

2019-02-05

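I am trying to extract a password-protected .zip that contains a .txt document (Congrats.txt in this case). Congrats.txt has text in it, so its size is not 0 KB. It is placed in a .zip (for the sake of this thread, let's call it zipv1.zip) with the password dominique. That password is stored, among other words and names, in another .txt file (which we'll call file.txt for the sake of this question).

Now, if I run the code below with python Program.py -z zipv1.zip -f file.txt (assuming all these files are in the same folder as Program.py), my program reports dominique as the correct password for zipv1.zip out of the other words/passwords in file.txt and extracts zipv1.zip, but Congrats.txt is empty and has a size of 0 KB.

My code is as follows: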

import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of file.txt.")
args = parser.parse_args()


def extract_zip(zip_filename, password):
    try:
        zip_file = zipfile.ZipFile(zip_filename)
        zip_file.extractall(pwd=password)
        print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass


def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Allows 8 instances of Python to be ran simultaneously.
    with multiprocessing.Pool(8) as pool:
        # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])


if __name__ == '__main__':
    main(args.zip, args.file)

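However, if I use the same method on zipv2.zip, whose only difference from zipv1.zip is that Congrats.txt sits inside a folder that was zipped along with it, I get results similar to zipv1.zip, but this time Congrats.txt is extracted along with the folder it was in, and Congrats.txt is intact: both its text and its size are intact.

So, to solve this, I tried reading zipfile's documentation, where I found that if a password doesn't match the .zip, it throws a RuntimeError. So I changed except: in the code to except RuntimeError: and got this error when trying to unzip zipv1.zip: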

(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv1.zip -f file.txt
[+] Password for the .zip: dominique

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
  File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
  File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'

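The result was the same, though: the password was found in file.txt and zipv1.zip was extracted, but Congrats.txt was empty and 0 KB in size. So I ran the program again, this time against zipv2.zip, and got this as a result: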

(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv2.zip -f file.txt
[+] Password for the .zip: dominique

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
  File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
  File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'

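Again, the same results as before: the folder was extracted successfully, and Congrats.txt was extracted too, with its text inside and its size intact.

I did take a look at this similar thread, as well as this thread, but they were no help. I also checked zipfile's documentation, but it didn't help with this issue.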

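Edit

Now, after implementing with zipfile.ZipFile(zip_filename, 'r') as zip_file:, for some unknown and odd reason the program can read/process a small word list/password list/dictionary, but not a large one(?).

What I mean is this: say zipv1.zip contains one .txt document, named Congrats.txt, with the text You have cracked the .zip!. The same .txt is also in zipv2.zip, but this time placed in a folder named ZIP Contents, which was then zipped/password protected. The password is dominique for both zips.

Note that each .zip was generated using the Deflate compression method and ZipCrypto encryption in 7zip.

Now, that password is on line 35 (of 52 lines) in John The Ripper Jr.txt and on line 1968 (of 3106 lines) in John The Ripper.txt.

Now, if you run python Program.py -z zipv1 -f "John The Ripper Jr.txt" in your CMD (or IDE of your choice), it creates a folder named Extracted and places Congrats.txt in it, containing the sentence we set earlier. The same goes for zipv2, but Congrats.txt ends up in the ZIP Contents folder, which is inside the Extracted folder. There is no trouble extracting the .zips in this instance.

But if you try the same thing with John The Ripper.txt, i.e. python Program.py -z zipv1 -f "John The Ripper.txt" in your CMD (or IDE of your choice), it creates the Extracted folder for both zips, just as with John The Ripper Jr.txt, but this time Congrats.txt is empty for both of them, for some unknown reason.

My code and all necessary files are as follows: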

import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack.", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()


def extract_zip(zip_filename, password):
    try:
        with zipfile.ZipFile(zip_filename, 'r') as zip_file:
            zip_file.extractall('Extracted', pwd=password)
            print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass


def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Allows 8 instances of Python to be ran simultaneously.
    with multiprocessing.Pool(8) as pool:
        # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])


if __name__ == '__main__':
    # Program.py - z zipname.zip -f filename.txt
    main(args.zip, args.file)

Program.py

zipv1.zip

zipv2.zip

John The Ripper Jr.txt

John The Ripper.txt

John The Ripper v2.txt

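I am not sure why this happens and cannot find an answer for this issue anywhere. As far as I can tell it is completely unknown, and I cannot find a way to debug or solve it.

This keeps happening regardless of the word/password list used. I tried generating more .zips with the same Congrats.txt but with different passwords from different word lists/password lists/dictionaries. Same method: I used a larger and a smaller version of the .txt and got the same results as above.

I did find that if I cut the first 2k words out of John The Ripper.txt into a new .txt, say John The Ripper v2.txt, the .zip is extracted successfully, the Extracted folder appears, and Congrats.txt is present with its text inside. So I believe it has to do with the lines after the one the password is on, Line 1968 in this case: perhaps the script doesn't stop after Line 1968? I'm not sure why this works, though. It is not a solution, I guess, but a step towards one...

Edit 2

So I tried using a "pool terminating" code: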

import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack using", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()


def extract_zip(zip_filename, password, queue):
    try:
        with zipfile.ZipFile(zip_filename, "r") as zip_file:
            zip_file.extractall('Extracted', pwd=password)
            print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
            queue.put("Done")  # Signal success
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass


def main(zip, file):
    if (zip == None) | (file == None):
        print(parser.usage)  # If the args are not used, it displays how to use them to the user.
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")

    # Create a Queue
    manager = multiprocessing.Manager()
    queue = manager.Queue()

    with multiprocessing.Pool(8) as pool:  # Allows 8 instances of Python to be ran simultaneously.
        pool.starmap_async(extract_zip, [(zip, line.strip(), queue) for line in txt_file])  # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.close()
        queue.get(True)  # Wait for a process to signal success
        pool.terminate()  # Terminate the pool
        pool.join()


if __name__ == '__main__':
    main(args.zip, args.file)  # Program.py -z zip.zip -f file.txt.

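Now if I use this, both zips are extracted successfully, just like in the previous instances. But this time zipv1.zip's Congrats.txt is intact and has the message inside it. The same cannot be said for zipv2.zip, though, as its Congrats.txt is still empty.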

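2 Answers

Sorry for the long pause... It seems you've gotten yourself into a bit of a pickle.

Recap

  • Working with a password-protected .zip file

  • Attempting to brute-force it using passwords from a file (ciobaneste)

  • The correct password is in the file (from the previous step), yet some files are still not extracted properly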

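1. Investigation

The scenario is complex (pretty far from an MCVE, I'd say), and there are many things that could be blamed for the behavior.

Starting with the zipv1.zip / zipv2.zip mismatch: on a closer look, zipv2 seems to be messed up as well. While that is easy to spot for zipv1 (Congrats.txt is the only file), for zipv2 it is "ZIP Contents/Black-Large.png" that has size 0.
It is reproducible with any file, and more: it applies to the first entry (that is not a dir) returned by zf.namelist.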

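So things start to become a bit clearer:

  • The file contents are unpacked, since dominique is present in the password file (not sure what happens up to that point)

  • Later on, the .zip's first entry is truncated to 0 bytes

Looking at the exceptions thrown when attempting to extract the file with a wrong password, there are 3 types (the last 2 of which can be grouped together):

  1. RuntimeError: Bad password for file ...

  2. Others:

    • zlib.error: Error -3 while decompressing data ...

    • zipfile.BadZipFile: Bad CRC-32 for file ...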

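I created an archive file of my own. For consistency I'll use it from now on, but everything applies to any other file as well.

  • Content:

    • DummyFile0.txt (10 bytes) - containing: 0123456789

    • DummyFile1.txt (10 bytes) - containing: 0000000000

    • DummyFile2.txt (10 bytes) - containing: AAAAAAAAAA

  • Archived the 3 files with the Total Commander (v9.21a) internal Zip packer, protecting the archive with the dominique password (zip2.0 encryption). The resulting archive (named arc0.zip, though the name is not relevant) is 392 bytes long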

code00.py

#!/usr/bin/env python

import os
import sys
import zipfile


def main(*argv):
    arc_name = argv[0] if argv else "./arc0.zip"
    pwds = (
        #b"dominique",
        #b"dickhead",
        b"coco",
    )
    #pwds = [item.strip() for item in open("orig/John The Ripper.txt.orig", "rb").readlines()]
    print("Unpacking (password protected: dominique) {:s},"
          " using a list of predefined passwords ...".format(arc_name))
    if not os.path.isfile(arc_name):
        raise SystemExit("Archive file must exist!\nExiting.")
    faulty_pwds = list()
    good_pwds = list()
    with zipfile.ZipFile(arc_name, "r") as zip_file:
        print("Zip names: {:}\n".format(zip_file.namelist()))
        for idx, pwd in enumerate(pwds):
            try:
                zip_file.extractall("Extracted", pwd=pwd)
            except:
                exc_cls, exc_inst, exc_tb = sys.exc_info()
                if exc_cls != RuntimeError:
                    print("Exception caught when using password ({:d}): [{:}] ".format(idx, pwd))
                    print("    {:}: {:}".format(exc_cls, exc_inst))
                    faulty_pwds.append(pwd)
            else:
                print("Success using password ({:d}): [{:}] ".format(idx, pwd))
                good_pwds.append(pwd)
            input()
    print("\nFaulty passwords: {:}\nGood passwords: {:}".format(faulty_pwds, good_pwds))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)

Output

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q054532010]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" ./code00.py ./arc0.zip
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] 064bit on win32

Unpacking (password protected: dominique) arc0.zip, using a list of predefined passwords ...
Zip names: ['DummyFile0.txt', 'DummyFile1.txt', 'DummyFile2.txt']

Exception caught when using password (1189): [b'mariah']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (1446): [b'zebra']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (1477): [b'1977']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Success using password (1967): [b'dominique']
Exception caught when using password (2122): [b'hank']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2694): [b'solomon']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid distance code
Exception caught when using password (2768): [b'target']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (2816): [b'trish']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2989): [b'coco']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid stored block lengths

Faulty passwords: [b'mariah', b'zebra', b'1977', b'hank', b'solomon', b'target', b'trish', b'coco']
Good passwords: [b'dominique']

Done.

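Looking at the ZipFile.extractall code, it tries to extract all the members. The first one raises an exception, so it becomes clearer why it behaves the way it does. But why does the behavior differ when attempting to extract items using 2 different wrong passwords?
As seen in the tracebacks of the two different exception types thrown, the answer lies at the end of ZipFile.open.

After more investigation, it turns out that it is because of a

2. Collision determined by zip encryption weakness

According to [UT.CS]: dmitri-report-f15-16.pdf - Password-based encryption in ZIP files (the (last) emphasis is mine):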

3.1 Traditional PKWARE encryption

The original encryption scheme, commonly referred to as the PKZIP cipher, was designed by Roger Schaffely [1]. In [5] Biham and Kocher showed that the cipher is weak and demonstrated an attack requiring 13 bytes of plaintext. Further attacks have been developed, some of which require no user provided plaintext at all [6]. The PKZIP cipher is essentially a stream cipher, i.e. input is encrypted by generating a pseudo-random key stream and XOR-ing it with the plaintext. The internal state of the cipher consists of three 32-bit words: key0, key1 and key2. These are initialized to 0x12345678, 0x23456789 and 0x34567890, respectively. A core step of the algorithm involves updating the three keys using a single byte of input...

...

Before encrypting a file in the archive, 12 random bytes are first prepended to its compressed contents and the resulting bytestream is then encrypted. Upon decryption, the first 12 bytes need to be discarded. According to the specification, this is done in order to render a plaintext attack on the data ineffective. The specification also states that out of the 12 prepended bytes, only the first 11 are actually random, the last byte is equal to the high order byte of the CRC-32 of the uncompressed contents of the file. This gives the ability to quickly verify whether a given password is correct by comparing the last byte of the decrypted 12 byte header to the high order byte of the actual CRC-32 value that is included in the local file header. This can be done before decrypting the rest of the file.

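Other references:

The algorithm weakness: since the differentiation is based on a single byte, among 256 different (and carefully chosen) wrong passwords there will be (at least) one that generates the same number as the correct password.

The algorithm discards most of the wrong passwords, but some get through.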
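
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes (this is an assumption, not something the quoted report states) that a wrong password lands on each of the 256 possible check-byte values with equal probability. The function names are made up for illustration.

```python
def expected_false_positives(wrong_passwords, check_bits=8):
    """Average number of wrong passwords expected to pass an n-bit check."""
    return wrong_passwords / 2 ** check_bits


def prob_at_least_one(wrong_passwords, check_bits=8):
    """Chance that at least one wrong password passes the check."""
    return 1 - (1 - 1 / 2 ** check_bits) ** wrong_passwords


# Word list from the question: 3106 lines, one of which is the right password.
n_wrong = 3106 - 1
print("expected false positives: {:.1f}".format(expected_false_positives(n_wrong)))
print("P(at least one):          {:.6f}".format(prob_at_least_one(n_wrong)))
```

Under that assumption, a word list of a few thousand entries is expected to contain a handful of wrong passwords that slip past the check, which is roughly in line with the 8 faulty passwords in the output above. A list of a few dozen entries, like the smaller one in the question, will usually contain none.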

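Recap: when attempting to extract a file using a password:

  • If the "hash" computed on the last byte of the file cipher differs from the high order byte of the file CRC, an exception is thrown

  • But, if they are equal:

    • A new file stream is opened for writing (emptying the file if it already exists)

    • The decompression is attempted:

      • For wrong passwords (which passed the check above), the decompression fails (but the file has already been emptied)

As seen from the output above, for my (.zip) file there are 8 passwords that mess it up. Note that:

  • The results are different for each archive file

  • The member file name and content are relevant (at least for the first one). Changing any of them yields different results (for the "same" archive file)

Here's a test based on the data from my .zip file: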

>>> import zipfile
>>>
>>> zd_coco = zipfile._ZipDecrypter(b"coco")
>>> zd_dominique = zipfile._ZipDecrypter(b"dominique")
>>> zd_other = zipfile._ZipDecrypter(b"other")
>>> cipher = b'\xd1\x86y ^\xd77gRzZ\xee'  # Member (1st) file cipher: 12 bytes starting from archive offset 44
>>>
>>> crc = 2793719750  # Member (1st) file CRC - archive bytes: 14 - 17
>>> hex(crc)
'0xa684c7c6'
>>> for zd in (zd_coco, zd_dominique, zd_other):
...     print(zd, [hex(zd(c)) for c in cipher])
...
<zipfile._ZipDecrypter object at 0x0000021E8DA2E0F0> ['0x1f', '0x58', '0x89', '0x29', '0x89', '0xe', '0x32', '0xe7', '0x2', '0x31', '0x70', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E160> ['0xa8', '0x3f', '0xa2', '0x56', '0x4c', '0x37', '0xbb', '0x60', '0xd3', '0x5e', '0x84', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E128> ['0xeb', '0x64', '0x36', '0xa3', '0xca', '0x46', '0x17', '0x1a', '0xfb', '0x6d', '0x6c', '0x4e']
>>>  # As seen, the last element of the first 2 arrays (coco and dominique) is 0xA6 (166), which is the same as the first byte of the CRC
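
The session above relies on zipfile._ZipDecrypter, a private helper whose interface is not guaranteed to stay the same across Python versions. Below is a standalone sketch of the same check, following the description of the traditional PKWARE cipher quoted earlier. The key-update constants are taken from the PKWARE APPNOTE rather than from the quote, the names ZipCryptoKeys and passes_check_byte are hypothetical, and it assumes the check byte is the high order byte of the member's CRC-32. If the sketch is faithful, it should accept coco and dominique and reject other for the header bytes used above.

```python
def _make_crc_table():
    """Standard CRC-32 lookup table (reflected polynomial 0xEDB88320)."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc_table()


def _crc32_step(crc, byte):
    """Raw single-byte CRC-32 update, without the usual bit inversions."""
    return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]


class ZipCryptoKeys:
    """Illustrative model of the three-word PKZIP stream cipher state."""

    def __init__(self, password):
        self.key0, self.key1, self.key2 = 0x12345678, 0x23456789, 0x34567890
        for byte in password:
            self._update(byte)

    def _update(self, plain_byte):
        self.key0 = _crc32_step(self.key0, plain_byte)
        self.key1 = (self.key1 + (self.key0 & 0xFF)) & 0xFFFFFFFF
        self.key1 = (self.key1 * 134775813 + 1) & 0xFFFFFFFF
        self.key2 = _crc32_step(self.key2, self.key1 >> 24)

    def _stream_byte(self):
        k = self.key2 | 2
        return ((k * (k ^ 1)) >> 8) & 0xFF

    def decrypt(self, data):
        out = bytearray()
        for cipher_byte in data:
            plain_byte = cipher_byte ^ self._stream_byte()
            self._update(plain_byte)  # keys always advance on plaintext
            out.append(plain_byte)
        return bytes(out)

    def encrypt(self, data):
        out = bytearray()
        for plain_byte in data:
            out.append(plain_byte ^ self._stream_byte())
            self._update(plain_byte)
        return bytes(out)


def passes_check_byte(password, encrypted_header, crc):
    """True if the password survives the one-byte header check."""
    header = ZipCryptoKeys(password).decrypt(encrypted_header[:12])
    return header[11] == (crc >> 24) & 0xFF


# Header bytes and CRC copied from the interactive session above.
header = b"\xd1\x86y ^\xd77gRzZ\xee"
crc = 0xA684C7C6
for pwd in (b"coco", b"dominique", b"other"):
    print(pwd, passes_check_byte(pwd, header, crc))
```

A password for which passes_check_byte returns True is only a candidate: the real confirmation is that the member then decompresses and matches its full CRC-32.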

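I ran some tests with other unpacking engines (using default parameters):

  1. WinRar: for wrong passwords the file is left untouched, but for faulty ones (wrong passwords that pass the check) it is truncated (same as here)

  2. 7-Zip: it asks the user whether to overwrite the file, and overwrites it regardless of the unpacking result

  3. Total Commander's internal (Zip) unpacker: same as #2.

3. Conclusion

  • I consider this a bug in ZipFile. Specifying such a faulty (and wrong) password should not overwrite the existing file (if any). Or at the very least, the behavior should be consistent (for all wrong passwords)

  • A quick search did not turn up any existing Python bug report for this

  • I don't see an easy fix, as:

    • The Zip algorithm can't be improved (to better check whether a password is correct)

    • I thought of a couple of fixes, but they would either hurt performance or could introduce regressions in some (corner) cases

I have submitted [GitHub]: python/cpython - [3.6] bpo-36247: zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness), which was closed for branch 3.6 (which is in security-fixes-only mode). I'm not sure what its outcome will be (in other branches), but in any case it won't be available anytime soon (in the next few months, say).

As an alternative, you could download the patch and apply the changes locally. Check [SO]: Run/Debug a Django application's UnitTests from the mouse right click context menu in PyCharm Community Edition? (@CristiFati's answer) (Patching UTRunner section) for how to apply patches on Win (basically, every line that starts with one "+" sign goes in, and every line that starts with one "-" sign goes out).
If you want to keep your Python installation pristine, you could copy zipfile.py from Python's dir to your project (or some "personal") dir and patch that file.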
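
If patching is not an option, a workaround at the application level might be to never let extractall write straight into the real target directory. A minimal sketch, under the assumption that extracting to a scratch directory and then copying the result over is acceptable; extract_all_or_nothing is a hypothetical helper, and the dirs_exist_ok argument of shutil.copytree needs Python 3.8 or later:

```python
import shutil
import tempfile
import zipfile
import zlib


def extract_all_or_nothing(zip_path, dest_dir, pwd=None):
    """Extract to a scratch dir first; touch dest_dir only if that worked.

    Returns True on success, False if the password was rejected or the
    data failed to decompress or failed its CRC check.
    """
    with tempfile.TemporaryDirectory() as scratch:
        try:
            with zipfile.ZipFile(zip_path) as zip_file:
                zip_file.extractall(scratch, pwd=pwd)
        except (RuntimeError, zipfile.BadZipFile, zlib.error):
            # The three failure types listed in the investigation above.
            return False
        # Only a fully verified extraction reaches the real destination.
        shutil.copytree(scratch, dest_dir, dirs_exist_ok=True)
    return True
```

With this shape, a faulty password can only truncate files inside its own scratch directory, which is discarded, so a good extraction made by another attempt is left alone. The cost is one extra copy per successful extraction.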

CristiFati
2019-03-08

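I ran into this problem, and after some research found that it came down to the encryption method I had chosen for the zip file. I changed it from ZipCrypto, the default 7-Zip offers, to AES-256, and everything worked.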
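
To check which scheme an archive actually uses, the member headers can be inspected. A small sketch, assuming the usual conventions: bit 0 of the general purpose flags marks an encrypted member, and WinZip-style AES stores 99 in the compression method field. encryption_kind, report and AES_METHOD are illustrative names, not part of the zipfile API. Note that the standard zipfile module appears to decrypt only the legacy ZipCrypto scheme, so reading AES members from Python may need a third-party library.

```python
import zipfile

AES_METHOD = 99  # assumed WinZip AES marker in the compression method field


def encryption_kind(info):
    """Classify one archive member from its ZipInfo header fields."""
    if not info.flag_bits & 0x1:
        return "none"
    if info.compress_type == AES_METHOD:
        return "aes"
    return "zipcrypto-or-other"


def report(zip_path):
    """Map each member name to its (guessed) encryption kind."""
    with zipfile.ZipFile(zip_path) as zip_file:
        return {info.filename: encryption_kind(info) for info in zip_file.infolist()}
```

Reading the headers this way does not require the password, since the central directory itself is not encrypted in either scheme.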

Robin Andrews
2020-07-04