How to augment scanned document image with creases, folds and wrinkles? (Answers)

【Question title】: How to augment scanned document image with creases, folds and wrinkles?
【Posted】: 2020-10-09 11:05:03
【Question】:

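I am creating a synthetic dataset to train a model that needs to find documents in an image. The documents will be far from perfect, i.e. they are folded, creased and wrinkled.

I could find a few ways of doing this in Photoshop, but I was wondering whether someone has a better idea for doing this augmentation in OpenCV without trying to reverse-engineer the Photoshop process.

For example, folds and creases (from https://www.photoshopessentials.com/photo-effects/folds-creases/), going from the original image to the folded one:

Or wrinkles (from https://www.myjanee.com/tuts/crumple/crumple.htm):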

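【Question comments】:

Use inpainting to remove the creases.
I want to add crease lines...
You might want to look at Blender's Python pipeline to create actual 3D warps. Like this: youtube.com/watch?v=M8_5S8Jq5uo, then project your image onto it. Render. Done. It can be batched, with the noise varied randomly.

Tags: python opencv data-augmentation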

【Solution 1】:

I have tried to put all your distortions together in one script in Python/OpenCV.

Input:

Wrinkles:

import cv2
import numpy as np
import math
import skimage.exposure

# read desert car image and convert to float in range 0 to 1
img = cv2.imread('desert_car.png').astype("float32") / 255.0
hh, ww = img.shape[:2]

# read wrinkle image as grayscale and convert to float in range 0 to 1
wrinkles = cv2.imread('wrinkles.jpg',0).astype("float32") / 255.0

# resize wrinkles to same size as desert car image
wrinkles = cv2.resize(wrinkles, (ww,hh), fx=0, fy=0)

# apply linear transform to stretch wrinkles to make shading darker
#wrinkles = skimage.exposure.rescale_intensity(wrinkles, in_range=(0,1), out_range=(0,1)).astype(np.float32)

# shift image brightness so mean is (near) mid gray
mean = np.mean(wrinkles)
shift = mean - 0.4
wrinkles = cv2.subtract(wrinkles, shift)

# create folds image as diagonal grayscale gradient as float as plus and minus equal amount
hh1 = math.ceil(hh/2)
ww1 = math.ceil(ww/3)
val = math.sqrt(0.2)
grady = np.linspace(-val, val, hh1, dtype=np.float32)
gradx = np.linspace(-val, val, ww1, dtype=np.float32)
grad1 = np.outer(grady, gradx)

# flip grad in different directions
grad2 = cv2.flip(grad1, 0)
grad3 = cv2.flip(grad1, 1)
grad4 = cv2.flip(grad1, -1)

# concatenate to form folds image
foldx1 = np.hstack([grad1-0.1,grad2,grad3])
foldx2 = np.hstack([grad2+0.1,grad3,grad1+0.2])
folds = np.vstack([foldx1,foldx2])
#folds = (1-val)*folds[0:hh, 0:ww]
folds = folds[0:hh, 0:ww]

# add the folds image to the wrinkles image
wrinkle_folds = cv2.add(wrinkles, folds)

# draw creases as blurred lines on black background
creases = np.full((hh,ww), 0, dtype=np.float32)
ww2 = 2*ww1
cv2.line(creases, (0,hh1), (ww-1,hh1), 0.25, 1)
cv2.line(creases, (ww1,0), (ww1,hh-1),  0.25, 1)
cv2.line(creases, (ww2,0), (ww2,hh-1),  0.25, 1)

# blur crease image
creases = cv2.GaussianBlur(creases, (3,3), 0)

# add crease to wrinkles_fold image
wrinkle_folds_creases = cv2.add(wrinkle_folds, creases)

# threshold wrinkles and invert
thresh = cv2.threshold(wrinkle_folds_creases,0.7,1,cv2.THRESH_BINARY)[1]
thresh = cv2.cvtColor(thresh,cv2.COLOR_GRAY2BGR) 
thresh_inv = 1-thresh

# convert from grayscale to bgr 
wrinkle_folds_creases = cv2.cvtColor(wrinkle_folds_creases,cv2.COLOR_GRAY2BGR) 

# do hard light composite and convert to uint8 in range 0 to 255
# see CSS specs at https://www.w3.org/TR/compositing-1/#blendinghardlight
low = 2.0 * img * wrinkle_folds_creases
high = 1 - 2.0 * (1-img) * (1-wrinkle_folds_creases)
result = ( 255 * (low * thresh_inv + high * thresh) ).clip(0, 255).astype(np.uint8)

# save results
cv2.imwrite('desert_car_wrinkles_adjusted.jpg',(255*wrinkles).clip(0,255).astype(np.uint8))
cv2.imwrite('desert_car_wrinkles_folds.jpg', (255*wrinkle_folds).clip(0,255).astype(np.uint8))
cv2.imwrite('wrinkle_folds_creases.jpg', (255*wrinkle_folds_creases).clip(0,255).astype(np.uint8))
cv2.imwrite('desert_car_result.jpg', result)

# show results
cv2.imshow('wrinkles', wrinkles)
cv2.imshow('wrinkle_folds', wrinkle_folds)
cv2.imshow('wrinkle_folds_creases', wrinkle_folds_creases)
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Wrinkles adjusted:

Wrinkles with folds:

Wrinkles with folds and creases:

Result:
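
The script above places the folds at fixed positions (half the height, thirds of the width). For a synthetic dataset it may help to randomize them. The following is a minimal sketch, not part of the original answer: the function name `random_fold_shading` and its parameters are hypothetical, and the only assumption is that a fold can be approximated by a linear shading ramp that reverses direction at the fold line, as in the `np.outer(grady, gradx)` construction above.

```python
import numpy as np

def random_fold_shading(height, width, rng, n_vertical=2, n_horizontal=1,
                        strength=0.2, crease=0.25):
    """Return a float32 map (centred on 0) with folds at random positions."""

    def cuts(count, size):
        # random fold positions, kept away from the image border
        return np.sort(rng.integers(size // 8, size - size // 8, size=count))

    def ramp(size, positions):
        # piecewise-linear ramp in [-1, 1] that reverses direction at each fold
        out = np.zeros(size, dtype=np.float32)
        edges = [0, *positions.tolist(), size]
        sign = float(rng.choice([-1.0, 1.0]))
        for start, stop in zip(edges[:-1], edges[1:]):
            out[start:stop] = sign * np.linspace(-1.0, 1.0, stop - start)
            sign = -sign
        return out

    xs = cuts(n_vertical, width)     # x positions of the vertical folds
    ys = cuts(n_horizontal, height)  # y positions of the horizontal folds

    # outer product of the two ramps, scaled to the range +/- strength
    shading = strength * np.outer(ramp(height, ys), ramp(width, xs))

    # thin bright lines along the folds, similar in spirit to the cv2.line creases
    creases = np.zeros_like(shading)
    creases[ys, :] = crease
    creases[:, xs] = crease
    return (shading + creases).astype(np.float32)

rng = np.random.default_rng(0)
shading = random_fold_shading(120, 200, rng)
```

Under these assumptions the returned map could stand in for `folds` plus `creases` in the script, for example `cv2.add(wrinkles, random_fold_shading(hh, ww, rng))`. The creases here are not blurred, so a `cv2.GaussianBlur` pass may still be wanted.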

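【Comments】:

【Solution 2】:

The proper way to apply the wrinkles to an image is to use hard light blending in Python/OpenCV.

Read the (cat) image and convert it to float in the range 0 to 1
Read the wrinkles image as grayscale and convert it to float in the range 0 to 1
Resize the wrinkles image to the same dimensions as the cat image
Linearly stretch the dynamic range of the wrinkles to give them more contrast
Threshold the wrinkles image and also get its inverse
Shift the brightness of the wrinkles image so that its mean is mid-gray (important for the hard light composite)
Convert the wrinkles image to 3-channel gray
Apply the hard light composite
Save the results.

Cat image:

Wrinkles image: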

import cv2
import numpy as np

# read cat image and convert to float in range 0 to 1
img = cv2.imread('cat.jpg').astype("float32") / 255.0
hh, ww = img.shape[:2]

# read wrinkle image as grayscale and convert to float in range 0 to 1
wrinkles = cv2.imread('wrinkles.jpg',0).astype("float32") / 255.0

# resize wrinkles to same size as cat image
wrinkles = cv2.resize(wrinkles, (ww,hh), fx=0, fy=0)

# apply linear transform to stretch wrinkles to make shading darker
# C = A*x+B
# x=1 -> 1; x=0.25 -> 0
# 1 = A + B
# 0 = 0.25*A + B
# Solve simultaneous equations to get:
# A = 1.33
# B = -0.33
wrinkles = 1.33 * wrinkles -0.33

# threshold wrinkles and invert
thresh = cv2.threshold(wrinkles,0.5,1,cv2.THRESH_BINARY)[1]
thresh = cv2.cvtColor(thresh,cv2.COLOR_GRAY2BGR) 
thresh_inv = 1-thresh

# shift image brightness so mean is mid gray
mean = np.mean(wrinkles)
shift = mean - 0.5
wrinkles = cv2.subtract(wrinkles, shift)

# convert wrinkles from grayscale to 3-channel BGR
wrinkles = cv2.cvtColor(wrinkles,cv2.COLOR_GRAY2BGR) 

# do hard light composite and convert to uint8 in range 0 to 255
# see CSS specs at https://www.w3.org/TR/compositing-1/#blendinghardlight
low = 2.0 * img * wrinkles
high = 1 - 2.0 * (1-img) * (1-wrinkles)
result = ( 255 * (low * thresh_inv + high * thresh) ).clip(0, 255).astype(np.uint8)

# save results
cv2.imwrite('cat_wrinkled.jpg', result)

# show results
cv2.imshow('Wrinkles', wrinkles)
cv2.imshow('Result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Wrinkled cat image:
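
For reuse across many images, the hard light composite can be wrapped in a function. This is a minimal sketch rather than the answer's exact method: the name `hard_light` is hypothetical, and it selects the multiply or screen branch per pixel from the final overlay value (as the W3C formula does), whereas the script above takes its threshold mask before the brightness shift.

```python
import numpy as np

def hard_light(base, overlay):
    """Hard light blend of float images in [0, 1]; overlay is the top layer."""
    base = np.asarray(base, dtype=np.float32)
    overlay = np.asarray(overlay, dtype=np.float32)
    # allow a single-channel overlay on a multi-channel base
    if overlay.ndim == base.ndim - 1:
        overlay = overlay[..., np.newaxis]
    low = 2.0 * base * overlay                          # multiply branch
    high = 1.0 - 2.0 * (1.0 - base) * (1.0 - overlay)   # screen branch
    return np.clip(np.where(overlay <= 0.5, low, high), 0.0, 1.0)

# a mid-gray overlay leaves the base unchanged, which is why the
# wrinkle image is shifted so that its mean sits near 0.5
base = np.full((2, 2, 3), 0.6, dtype=np.float32)
unchanged = hard_light(base, np.full((2, 2), 0.5, dtype=np.float32))
```

Overlay values below 0.5 darken the base and values above 0.5 lighten it, so the wrinkle shading acts on both sides of mid-gray.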

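【Comments】:

This looks promising, although I would still be happier if the whole thing could be generated. I will check how it looks on documents. Thank you!
What more do you want?
In a perfect world? A transformation that also deforms the image to create a 3D effect of folds and wrinkles, and that can be applied randomly and in combination. But as I see it, that is not really an option, so unless someone comes up with a good implementation in the next 4 days, I am more or less happy with this. I still need to check how it performs on documents, since documents are a bit different.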

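【Solution 3】:

This is not an answer to your question. It is more about using a blend mode that suits your application. See the wiki page on blend modes for more details. This might help you address the quality loss. The following code implements the first few blend modes listed under Multiply and Screen on the wiki page. It does not cover the Plastic Wrap filter or the effects added with brushes in the Photoshop tutorial you refer to.

You still have to generate the overlays (image b in the code), and I agree with Nelly's comment regarding augmentation.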

import cv2 as cv
import numpy as np

a = cv.imread("image.jpg").astype(np.float32)/255.0
b = cv.imread("gradients.jpg").astype(np.float32)/255.0

multiply_blended = a*b
multiply_blended = (255*multiply_blended).astype(np.uint8)

screen_blended = 1 - (1 - a)*(1 - b)
screen_blended = (255*screen_blended).astype(np.uint8)

overlay_blended = 2*a*b*(a < 0.5).astype(np.float32) + (1 - 2*(1 - a)*(1 - b))*(a >= 0.5).astype(np.float32)
overlay_blended = (255*overlay_blended).astype(np.uint8)

photoshop_blended = (2*a*b + a*a*(1 - 2*b))*(b < 0.5).astype(np.float32) + (2*a*(1 - b) + np.sqrt(a)*(2*b - 1))*(b >= 0.5).astype(np.float32)
photoshop_blended = (255*photoshop_blended).astype(np.uint8)

pegtop_blended = (1 - 2*b)*a*a + 2*b*a
pegtop_blended = (255*pegtop_blended).astype(np.uint8)

Photoshop soft light:
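
When choosing among these modes for a generated overlay, it helps to know each mode's neutral overlay value, that is, the value of b that leaves image a unchanged. The sketch below restates the formulas from the code above as functions (the function names are hypothetical) and checks the neutral values numerically.

```python
import numpy as np

def multiply(a, b):
    return a * b

def screen(a, b):
    return 1 - (1 - a) * (1 - b)

def overlay(a, b):
    return np.where(a < 0.5, 2 * a * b, 1 - 2 * (1 - a) * (1 - b))

def soft_light_photoshop(a, b):
    return np.where(b < 0.5,
                    2 * a * b + a * a * (1 - 2 * b),
                    2 * a * (1 - b) + np.sqrt(a) * (2 * b - 1))

def soft_light_pegtop(a, b):
    return (1 - 2 * b) * a * a + 2 * b * a

a = np.linspace(0.0, 1.0, 11, dtype=np.float32)

# overlay value that leaves the base image unchanged, per mode
neutral = {
    multiply: 1.0,               # white
    screen: 0.0,                 # black
    overlay: 0.5,                # mid gray
    soft_light_photoshop: 0.5,   # mid gray
    soft_light_pegtop: 0.5,      # mid gray
}
for mode, value in neutral.items():
    b = np.full_like(a, value)
    print(mode.__name__, np.allclose(mode(a, b), a, atol=1e-6))
```

A fold or wrinkle overlay that should both darken and lighten the page therefore fits the mid-gray-neutral modes (overlay, soft light), while multiply can only darken and screen can only lighten.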

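【Comments】:

Please show your input gradient image and how it was created.
@fmw42 I took it directly from the Photoshop tutorial link that the OP posted.
Thanks. Sorry, I overlooked the link.

【Solution 4】:

Without much work, I came up with this result. It is far from perfect, but I think it goes in the right direction.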

from PIL import Image
import requests
from io import BytesIO

# download the photo and the creased-paper texture; both are converted to RGB
# because Image.blend requires images of the same mode and size
response = requests.get('https://icatcare.org/app/uploads/2018/07/Thinking-of-getting-a-cat.png')
img1 = Image.open(BytesIO(response.content)).convert('RGB')
response = requests.get('https://st2.depositphotos.com/5579432/8172/i/950/depositphotos_81721770-stock-photo-paper-texture-crease-white-paper.jpg')
img2 = Image.open(BytesIO(response.content)).convert('RGB').resize(img1.size)

# 50/50 alpha blend of the photo and the texture
final_img = Image.blend(img1, img2, 0.5)

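From this:

And this:

We get this (blend 0.5):

Or this (blend 0.333):

And here is one with folds:

【Comments】:

Thanks for the effort. Unfortunately, this does not fit my purpose. As I said, this is for creating a synthetic dataset. Unless I find thousands of images of creased or crumpled white paper, using backgrounds like this is too deterministic. The quality loss is also unacceptable. I need to be able to apply random folds and wrinkles, as in the images I included. I thought this would be simple...
Well, you could also try training a GAN: you would need a dataset of original documents and folded/creased documents, and train a generator. Once the critic is "fooled" and cannot tell a real folded image from a generated "folded" one, you can use the generator to turn any image into a "folded" image... But that still sounds more complicated than finding a few hundred crease/fold overlays and merging them with the photos. You can do the augmentation on the overlays: rotate and flip them, combine several of them, and even play with the blend percentage, contrast, and so on.
@Moshel Like Nelly said, you can generate a set of overlays by augmenting a few real-world crease/fold examples, and these may be more realistic than synthetic ones. Generating these overlays seems to be the main task in your project. You can use more sophisticated blend modes to deal with the quality loss.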

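【Solution 5】:

As you are creating a static synthetic dataset, a more realistic and probably the simplest solution seems to be to use DocCreator to generate the dataset randomly for you.

With the given sample:

a dataset like the following can be generated

via Image > Degradation > Color Degradation > 3D distortion. Then select the mesh (Load mesh...), and finally click the save random images... button and choose the constraints.

By changing the Phy and Theta upper and lower bounds, a dataset with more subtle distortions can be generated.

The project offers a demo that lets you better assess whether it suits your purpose.

【Comments】:

While the project is interesting, I am not looking for augmentation (that is done during training). I am looking for a specific generation, not a generic one.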