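Long story short, you cannot know in advance how much an image will compress, because that depends heavily on what kind of image it is. That said, we can optimize your code.
Some optimizations:
- Approximate the number of bytes per pixel from the file size and the image dimensions.
- Update a ratio based on the new and the old memory consumption.
My coded solution applies both of the above at the same time, because applying them separately did not seem to give a very stable convergence. The following sections explain both parts in more depth and show the test cases I considered.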
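To illustrate why the compressed size cannot be predicted from the pixel count alone, here is a minimal sketch that compresses two buffers of identical raw size. It uses zlib (DEFLATE) from the standard library as a stand-in for an image codec, which is an assumption made for illustration; `compressed_size` is a hypothetical helper, not part of the solution below.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of `data` after DEFLATE compression."""
    return len(zlib.compress(data))


n_bytes = 256 * 256 * 3        # raw size of a 256x256 RGB image
flat = bytes(n_bytes)          # a single flat colour (all zeros)
noise = os.urandom(n_bytes)    # random pixel values, close to incompressible

# Same raw size, very different compressed sizes
print(compressed_size(flat), compressed_size(noise))
```

The flat buffer shrinks to a tiny fraction of its raw size while the random one barely shrinks at all, which is why the approach below measures the real encoded size and corrects iteratively.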
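Reducing the image memory
The following code approximates the new image dimensions from the difference between the original file size (in bytes) and the preferred file size (in bytes). It approximates the number of bytes per pixel, then applies the ratio between the preferred and the original bytes per pixel to both the image width and the image height (hence the square root).
I then use opencv-python (cv2) to rescale the image, but you can swap this for your own code.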
import os

import cv2
import numpy as np


def reduce_image_memory(path, max_file_size: int = 2 ** 20):
    """
    Reduce the image memory by downscaling the image.
    :param path: (str) Path to the image
    :param max_file_size: (int) Maximum size of the file in bytes
    :return: (np.ndarray) downscaled version of the image
    """
    image = cv2.imread(path)
    height, width = image.shape[:2]
    original_memory = os.stat(path).st_size
    original_bytes_per_pixel = original_memory / np.prod(image.shape[:2])
    # perform resizing calculation
    new_bytes_per_pixel = original_bytes_per_pixel * (max_file_size / original_memory)
    # square root, because the ratio is applied to both width and height
    new_bytes_ratio = np.sqrt(new_bytes_per_pixel / original_bytes_per_pixel)
    new_width, new_height = int(new_bytes_ratio * width), int(new_bytes_ratio * height)
    new_image = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_LINEAR_EXACT)
    return new_image
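As a worked example of the square-root step, the sketch below isolates the dimension calculation from any image library. `scaled_dimensions` is a hypothetical helper introduced only for illustration, and it assumes the file size scales linearly with the number of pixels, which is exactly the approximation the function above makes.

```python
import math


def scaled_dimensions(width: int, height: int, original_bytes: float, target_bytes: float):
    """Return (new_width, new_height) so the pixel count scales by target/original."""
    # The pixel count scales with the square of the linear factor, hence the root
    factor = math.sqrt(target_bytes / original_bytes)
    return int(factor * width), int(factor * height)


# A 4 MB image at 4000x3000 targeted at 1 MB: a quarter of the pixels,
# so each side is halved.
print(scaled_dimensions(4000, 3000, 4 * 2 ** 20, 2 ** 20))  # (2000, 1500)
```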
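Applying the ratio
Most of the magic happens in ratio *= max_file_size / new_memory, where we compute the error relative to the preferred size and use that value to correct our ratio.
The program searches for a ratio that satisfies the following condition:
abs(1 - max_file_size / new_memory) <= max_deviation_percentage
This means the new file size must be relatively close to the preferred file size. You control this closeness with delta. The higher the delta, the smaller your file may end up (below max_file_size). The smaller the delta, the closer the new file size will be to max_file_size, but it will never be larger.
The trade-off is time: the smaller the delta, the longer it takes to find a ratio that satisfies the condition. Empirical testing shows that values between 0.01 and 0.05 work well.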
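To make the "never larger" claim concrete, the following sketch computes the range of file sizes that the stopping condition accepts. It assumes, as the script does, that the search target is the limit scaled by (1 - delta); `accepted_size_range` is a hypothetical helper, not part of the original code.

```python
def accepted_size_range(limit: float, delta: float):
    """Return (smallest, largest) file size in bytes that ends the search."""
    target = limit * (1 - delta)
    # abs(1 - target / size) <= delta
    #   <=>  target / (1 + delta) <= size <= target / (1 - delta)
    return target / (1 + delta), target / (1 - delta)


low, high = accepted_size_range(limit=2 ** 20, delta=0.01)
print(f"accepted sizes: {low / 2 ** 20:.4f} MB to {high / 2 ** 20:.4f} MB")
```

The upper bound works out to the limit itself, because the target was lowered by the same factor (1 - delta) that the tolerance allows on the way up. Note that this only holds when the loop ends by meeting the condition, not when it is cut short.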
if __name__ == '__main__':
    image_location = "test img.jpg"

    # delta denotes the maximum variation allowed around the max_file_size
    # The lower the delta the more time it takes, but the closer it will be to `max_file_size`.
    delta = 0.01
    max_file_size = 2 ** 20 * (1 - delta)
    max_deviation_percentage = delta

    current_memory = new_memory = os.stat(image_location).st_size
    ratio = 1
    steps = 0

    # keep resizing until the new size is within the allowed deviation
    while abs(1 - max_file_size / new_memory) > max_deviation_percentage:
        new_image = reduce_image_memory(image_location, max_file_size=max_file_size * ratio)
        cv2.imwrite(f"resize {image_location}", new_image)
        new_memory = os.stat(f"resize {image_location}").st_size
        # correct the ratio by the relative error of this attempt
        ratio *= max_file_size / new_memory
        steps += 1

    print(f"Memory resize: {current_memory / 2 ** 20:5.2f}, {new_memory / 2 ** 20:6.4f} MB, number of steps {steps}")
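The update rule can be tried without any image files. The sketch below runs the same loop against a toy encoder whose output size grows as the pixel fraction raised to a power below 1. That size model, the `exponent` parameter and the function names are all assumptions made for illustration; real codecs behave differently per image.

```python
def simulate_ratio_search(original_size, limit, delta=0.01, step_limit=20, exponent=0.8):
    """Run the ratio update against a toy encoder; return (final_size, steps)."""

    def toy_encoded_size(requested_size):
        # Hypothetical encoder: size grows as pixel_fraction ** exponent,
        # so a naive linear estimate overshoots when shrinking.
        pixel_fraction = requested_size / original_size
        return original_size * pixel_fraction ** exponent

    target = limit * (1 - delta)
    new_size = original_size
    ratio = 1.0
    steps = 0
    while abs(1 - target / new_size) > delta and steps < step_limit:
        new_size = toy_encoded_size(target * ratio)
        ratio *= target / new_size  # correct by the relative error
        steps += 1
    return new_size, steps


size, steps = simulate_ratio_search(original_size=1.7 * 2 ** 20, limit=2 ** 20)
print(f"{size / 2 ** 20:.4f} MB after {steps} steps")
```

With this model each step shrinks the remaining error by a roughly constant factor, which matches the small step counts reported in the results.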
Test cases
For testing I used two different approaches: randomly generated images and an example from Google.
For the random images I used the following code:
from typing import Tuple

import numpy as np


def generate_test_image(ratio: Tuple[int, int], file_size: int) -> np.ndarray:
    """
    Generate a test image with fixed width height ratio and an approximate size.
    :param ratio: (Tuple[int, int]) screen ratio for the image
    :param file_size: (int) Approximate size of the image, note that this may be off due to image compression.
    """
    height, width = ratio  # Numpy reverse values
    scale = int(np.sqrt(file_size // (width * height)))
    img = np.random.randint(0, 255, (width * scale, height * scale, 3), dtype=np.uint8)
    return img
Results
image_location = "test image random.jpg"
# Generate a large image with fixed ratio and a file size of ~1.7MB
image = generate_test_image(ratio=(16, 9), file_size=1531494)
cv2.imwrite(image_location, image)
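Memory resize: 1.71, 0.99 MB, number of steps 2
In two steps it reduced the original size from 1.7 MB to 0.99 MB.
(before)
(after)
Memory resize: 1.51, 0.996 MB, number of steps 4
It reduced the original size from 1.51 MB to 0.996 MB in 4 steps.
(before)
(after)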
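Bonus
- It also works for .png, .jpeg, .tiff, etc.
- Besides downscaling, it can also be used to upscale images to a certain memory consumption.
- The image aspect ratio is preserved as well as possible.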
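Edit
I made the code more user-friendly and added Mark Setchell's suggestion of using an in-memory buffer (io.BytesIO), which speeds the code up by roughly a factor of 2. There is also a step_limit that prevents endless looping when delta is very small.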
import io
import os
import time
from typing import Tuple
import cv2
import numpy as np
from PIL import Image


def generate_test_image(ratio: Tuple[int, int], file_size: int) -> np.ndarray:
    """
    Generate a test image with fixed width height ratio and an approximate size.
    :param ratio: (Tuple[int, int]) screen ratio for the image
    :param file_size: (int) Approximate size of the image, note that this may be off due to image compression.
    """
    height, width = ratio  # Numpy reverse values
    scale = int(np.sqrt(file_size // (width * height)))
    img = np.random.randint(0, 255, (width * scale, height * scale, 3), dtype=np.uint8)
    return img


def _change_image_memory(path, file_size: int = 2 ** 20):
    """
    Tries to match the image memory to a specific file size.
    :param path: (str) Path to the image
    :param file_size: (int) Size of the file in bytes
    :return: (np.ndarray) rescaled version of the image
    """
    image = cv2.imread(path)
    height, width = image.shape[:2]
    original_memory = os.stat(path).st_size
    original_bytes_per_pixel = original_memory / np.prod(image.shape[:2])
    # perform resizing calculation
    new_bytes_per_pixel = original_bytes_per_pixel * (file_size / original_memory)
    # square root, because the ratio is applied to both width and height
    new_bytes_ratio = np.sqrt(new_bytes_per_pixel / original_bytes_per_pixel)
    new_width, new_height = int(new_bytes_ratio * width), int(new_bytes_ratio * height)
    new_image = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_LINEAR_EXACT)
    return new_image
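

def _get_size_of_image(image):
    # Encode into memory and get size.
    # Pillow's JPEG defaults differ from those of cv2.imwrite,
    # so treat the result as an estimate of the final file size.
    buffer = io.BytesIO()
    image = Image.fromarray(image)
    image.save(buffer, format="JPEG")
    size = buffer.getbuffer().nbytes
    return size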


def limit_image_memory(path, max_file_size: int, delta: float = 0.05, step_limit=10):
    """
    Reduces an image to the required max file size.
    :param path: (str) Path to the original (unchanged) image.
    :param max_file_size: (int) maximum size of the image
    :param delta: (float) maximum allowed variation from the max file size.
        This is a value between 0 and 1, relative to the max file size.
    :param step_limit: (int) maximum number of resize attempts before giving up.
    :return: an image path to the limited image.
    """
    start_time = time.perf_counter()
    max_file_size = max_file_size * (1 - delta)
    max_deviation_percentage = delta
    new_image = None

    # read the size from the `path` argument, not from a global variable
    current_memory = new_memory = os.stat(path).st_size
    ratio = 1
    steps = 0

    while abs(1 - max_file_size / new_memory) > max_deviation_percentage:
        new_image = _change_image_memory(path, file_size=max_file_size * ratio)
        new_memory = _get_size_of_image(new_image)
        # correct the ratio by the relative error of this attempt
        ratio *= max_file_size / new_memory
        steps += 1

        # prevent endless looping
        if steps > step_limit:
            break

    print(f"Stats:"
          f"\n\t- Original memory size: {current_memory / 2 ** 20:9.2f} MB"
          f"\n\t- New memory size     : {new_memory / 2 ** 20:9.2f} MB"
          f"\n\t- Number of steps {steps}"
          f"\n\t- Time taken: {time.perf_counter() - start_time:5.3f} seconds")

    if new_image is not None:
        cv2.imwrite(f"resize {path}", new_image)
        return f"resize {path}"
    return path


if __name__ == '__main__':
    image_location = "your nice image.jpg"

    # Uncomment to generate random test images
    # test_image = generate_test_image(ratio=(16, 9), file_size=1567289)
    # cv2.imwrite(image_location, test_image)

    path = limit_image_memory(image_location, max_file_size=2 ** 20, delta=0.01)