【Title】: Trim scanned images with PIL?
【Posted】: 2011-11-23 15:16:09
【Question】:

How can I trim images that come from a scanner and therefore have large white/black areas?

【Comments】:

    Tags: python image-processing python-imaging-library


    【Solution 1】:

    The entropy solution seems problematic and excessively computationally intensive. Why not use edge detection?

    I just wrote this Python code to solve the same problem for myself. My background is a dirty white, so the criterion I used was darkness and color. I simplified that criterion by taking only the smallest of the R, G, or B values for each pixel, so that black and saturated red stand out equally. I also used the average of the darkest pixels in each row or column. Then I start from each edge and work inward until I cross a threshold.

    Here is my code:

    #these values set how sensitive the bounding box detection is
    threshold = 200     #the average of the darkest values must be _below_ this to count (0 is darkest, 255 is lightest)
    obviousness = 50    #how many of the darkest pixels to include (1 would mean a single dark pixel triggers it)
    
    import numpy as np
    from PIL import Image
    
    def find_line(vals):
        #implement edge detection once, use many times
        for i, tmp in enumerate(vals):
            tmp.sort()
            average = float(sum(tmp[:obviousness])) / len(tmp[:obviousness])
            if average <= threshold:
                return i
        return i    #i is left over from failed threshold finding, it is the bound
    
    def getbox(img):
        #get the bounding box of the interesting part of a PIL image object
        #this is done by getting the darkest of the R, G or B values of each pixel
        #and finding where the edge gets dark/colored enough
        #returns a tuple of (left, upper, right, lower)
    
        width, height = img.size    #for making a 2d array
        retval = [0, 0, width, height] #values will be disposed of, but this is a black image's box
    
        pixels = list(img.getdata())
        vals = []                   #store the value of the darkest color
        for pixel in pixels:
            vals.append(min(pixel)) #the darkest of the R, G or B values
    
        #make 2d array
        vals = np.array([vals[i * width:(i + 1) * width] for i in range(height)])
    
        #start with the upper bound
        forupper = vals.copy()
        retval[1] = find_line(forupper)
    
        #next, do the lower bound
        forlower = vals.copy()
        forlower = np.flipud(forlower)
        retval[3] = height - find_line(forlower)
    
        #left edge, same as before but rotate the data so the left edge is the top edge
        forleft = vals.copy()
        forleft = np.swapaxes(forleft, 0, 1)
        retval[0] = find_line(forleft)
    
        #and the right edge is the bottom edge of the rotated array
        forright = vals.copy()
        forright = np.swapaxes(forright, 0, 1)
        forright = np.flipud(forright)
        retval[2] = width - find_line(forright)
    
        if retval[0] >= retval[2] or retval[1] >= retval[3]:
            print("error, bounding box is not legit")
            return None
        return tuple(retval)
    
    if __name__ == '__main__':
        image = Image.open('cat.jpg')
        box = getbox(image)
        print("result is:", box)
        result = image.crop(box)
        result.show()
    

    【Comments】:

    • To my chagrin, this answer only works for small images. list(img.getdata()) crashed my whole computer with the larger images I was working with (mine was 4 MB, but I've read others reporting similar results with images of only 1 MB).
    • The "correct" answer uses 'pixels = numpy.asarray(img)' instead of getdata(), and then the resulting numpy array has to be processed with itertools.imap. That's where I got stuck. I posted the solution I settled on at stackoverflow.com/questions/6136588/image-cropping-using-python/…
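    The comment's numpy.asarray suggestion can be sketched as follows. This is my own illustration, not the commenter's posted solution; the function name darkest_channel is hypothetical. It computes the same per-pixel min(R, G, B) array as the answer's getdata() loop, but without materializing a giant Python list, so it scales to large scans:

```python
import numpy as np
from PIL import Image

def darkest_channel(img):
    """Per-pixel minimum of the R, G, B channels as a 2D array.

    Equivalent to min(pixel) over list(img.getdata()), but vectorized:
    numpy.asarray views the image data directly instead of building a
    Python list of tuples.
    """
    arr = np.asarray(img.convert("RGB"))  # shape (height, width, 3)
    return arr.min(axis=2)                # shape (height, width)

if __name__ == "__main__":
    # small synthetic image: every pixel is (200, 120, 180)
    img = Image.new("RGB", (4, 2), (200, 120, 180))
    vals = darkest_channel(img)
    print(vals.shape)   # (2, 4)
    print(vals[0, 0])   # 120
```

    The resulting 2D array can be fed straight into find_line from the answer above in place of its hand-built vals array.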
    【Solution 2】:

    For starters: Here is a similar question. Here is a related question. And another related question.

    This is just one idea, and there are certainly other approaches. I would pick an arbitrary crop line, then measure the entropy* on either side of it, and keep re-selecting the crop line (probably with something like bisection) until the entropy of the cropped-off part drops below a defined threshold. As I think about it, you may need a brute-force root-finding approach, since there is no good indication of when you have cropped too little. Then repeat for the remaining three edges.

    * I recall finding that the entropy method on the referenced site was not entirely accurate, but I can't find my notes (I'm sure it was in an SO post, though).

    Edit: other criteria for a part of an image being "blank" (besides entropy) could be contrast, or the contrast of an edge-detection result.
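    As a minimal sketch of the entropy criterion itself (my own helper, not code from this answer; the threshold and the bisection loop over crop lines are left out), the Shannon entropy of a candidate strip can be computed from its grayscale histogram:

```python
import numpy as np
from PIL import Image

def strip_entropy(img, box):
    """Shannon entropy (in bits) of the grayscale histogram of a crop.

    A near-blank strip of a scan has its histogram concentrated in a few
    bins, so its entropy is low; content-bearing strips score higher.
    box is a PIL-style (left, upper, right, lower) tuple.
    """
    region = np.asarray(img.convert("L").crop(box), dtype=np.float64)
    hist, _ = np.histogram(region, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins: 0*log(0) := 0
    return float(-(p * np.log2(p)).sum())
```

    A flat white strip scores 0 bits, while a strip split evenly between black and white scores 1 bit, so comparing strip_entropy against a threshold on either side of a candidate crop line gives the test the bisection would drive.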

    【Comments】:
