How can I improve Watershed segmentation of heterogeneous structures in Python? (Answers)

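【Question title】: How can I improve Watershed segmentation of heterogeneous structures in Python?
【Posted】: 2021-05-02 07:34:03
【Question description】:

I am following a simple approach to segment cells (microscopy images) with the watershed algorithm in Python. I am happy with the result 90% of the time, but I have two main problems: (i) the markers/contours are really "spiky", and (ii) the algorithm sometimes fails when two cells are close to each other (i.e. they are segmented as one object). Can you suggest any improvements?

Here is the code I am using, along with an output image that shows both problems.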

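# Imports needed by this script
import cv2
import numpy as np
import skimage
import skimage.morphology
import skimage.segmentation
import skimage as sk  # the code below uses both `sk` and `skimage`
from scipy.ndimage import binary_fill_holes
from skimage import img_as_ubyte

# Adjustable parameters for a future function
img_file = NP_file  # NP_file: path to the input image, defined elsewhere
sigma = 9 # aperture size of the median blur; must be odd and greater than 1
alpha = 0.2 # scaling factor applied to the maximum of the distance transform
clear_border = False
remove_small_objects = True

# read image and convert to grayscale
im = cv2.imread(img_file, 1)
im = enhanceContrast(im)  # enhanceContrast: the asker's own helper, not shown
im_gray = cv2.cvtColor(im.copy(), cv2.COLOR_BGR2GRAY)

# Basic Median Filter
im_blur = cv2.medianBlur(im_gray, ksize = sigma)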

# Threshold Image
th, im_seg = cv2.threshold(im_blur, im_blur.mean(), 255, cv2.THRESH_BINARY)

# filling holes in the segmented image
im_filled = binary_fill_holes(im_seg)

# discard cells touching the border
if clear_border == True: 
    im_filled = skimage.segmentation.clear_border(im_filled)

# filter small particles
if remove_small_objects == True: 
    im_filled = sk.morphology.remove_small_objects(im_filled, min_size = 5000)

# apply distance transform
# labels each pixel of the image with the distance to the nearest obstacle pixel.
# In this case, obstacle pixel is a boundary pixel in a binary image.

dist_transform = cv2.distanceTransform(img_as_ubyte(im_filled), cv2.DIST_L2, 3)

# get sure foreground area: region near the center of each object
fg_val, sure_fg = cv2.threshold(dist_transform, alpha * dist_transform.max(), 255, 0)

# get sure background area: region far away from the objects
sure_bg = cv2.dilate(img_as_ubyte(im_filled), np.ones((3,3),np.uint8), iterations = 6)

# The remaining regions (borders) are those where we don't know whether they are foreground or background
borders = cv2.subtract(sure_bg, np.uint8(sure_fg))

# use Connected Components labelling: 
# scans an image and groups its pixels into components based on pixel connectivity
# label background of the image with 0 and other objects with integers starting from 1.

n_markers, markers1 = cv2.connectedComponents(np.uint8(sure_fg))

# filter small particles again! (bc of segmentation artifacts)
if remove_small_objects == True: 
    markers1 = sk.morphology.remove_small_objects(markers1, min_size = 1000)
    
# Make sure the background is 1 and not 0; 
# and that borders are marked as 0
markers2 = markers1 + 1
markers2[borders == 255] = 0

# implement the watershed algorithm: connects markers with original image
# The label image will be modified and the marker in the border area will change to -1
im_out = im.copy()
markers3 = cv2.watershed(im_out, markers2)

# generate an extra image with color labels only for visualization
# color boundary pixels (= -1 after the watershed algorithm) in YELLOW; the image is in BGR order
im_out[markers3 == -1] = [0, 255, 255]

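If you want to try to reproduce my results, you can find my .tif file here: https://drive.google.com/file/d/13KfyUVyHodtEOP_yKAnfFCAhgyoY0BQL/view?usp=sharing

Thanks!

【Question discussion】:

From your results, the spiky parts look like artifacts of the background thresholding; you could try smoothing it further with a Gaussian filter. The distance transform can be used to find multiple local maxima in non-convex regions, which helps to separate multiple cells, but it will not solve every case.
If you provide the original image, we can reproduce your results.
Thanks for the feedback, @joOkuma! I ended up improving the spikes by dilating the output image obtained from thresholding the distance transform, but I still have the problem with touching objects. I edited my post and added a link to one of my figures!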

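Tags: python opencv image-segmentation watershed

【Solution 1】:

In the past, the approach that worked best for me was to apply the watershed algorithm "only when needed". It is computationally intensive, and most of the cells in your image do not need it. This is the code I used on your image: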

import cv2
import numpy as np

# Threshold your image (img is the BGR image loaded with cv2.imread)
# This example worked very well with a threshold value of 1
tv, thresh = cv2.threshold(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 1, 255, cv2.THRESH_BINARY)

# Minimize the holes in the cells to facilitate finding contours
kernel = np.ones((3, 3), np.uint8)
for i in range(5):
    thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
    thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# Find contours and keep the ones big enough to be a cell
# (OpenCV 4.x returns two values here; OpenCV 3.x returns three)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = [c for c in contours if cv2.contourArea(c) > 400]
# Draw the kept contours filled in white and label each one with its index
output = np.zeros_like(thresh)
cv2.drawContours(output, contours, -1, 255, -1)
for i, contour in enumerate(contours):
    x, y, w, h = cv2.boundingRect(contour)
    cv2.putText(output, f"{i}", (x, y), cv2.FONT_HERSHEY_PLAIN, 1, 255, 2)

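The output of this code is an image of the filled contours, each labeled with its index. As you can see there, only one pair of cells (contour #7) needs to be split with the watershed algorithm. Running watershed on just that blob is very fast, because the image to process is much smaller.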

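EDIT: Some cell morphology calculations that can be used to assess whether the watershed algorithm should be run on an object in the image: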

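import math
import cv2

# contour: one element of the contours list found above
# area
area = cv2.contourArea(contour)
# perimeter, with a minimum value of 0.01 to avoid division by zero in other calculations
perimeter = max(0.01, cv2.arcLength(contour, True))
# circularity: 1.0 for a perfect circle, lower for elongated or irregular shapes
circularity = (4 * math.pi * area) / (perimeter ** 2)
# convexity: convex-hull perimeter divided by contour perimeter
# (close to 1.0 for a smooth convex outline, lower when the outline has dents)
hull = cv2.convexHull(contour)
convexity = cv2.arcLength(hull, True) / perimeter
# check whether a coarse polygonal approximation of the contour is convex
approx = cv2.approxPolyDP(contour, 0.1 * perimeter, True)
convex = cv2.isContourConvex(approx)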

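You will need to find thresholds for each of these measurements in your own project. In my project the cells were elliptical, so a large, non-convex blob usually meant that 2 or more cells were clumped together.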

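【Discussion】:

Yes, it makes sense to run watershed only on the cells that need it, but how can I do that in an automated way? I am writing a script that takes the raw image as input and returns the segmented figure as output, so I need a set of rules that tells Python which cells to pick for watershed. That would actually be great to have, because it is hard to generate seeds of the same size for the watershed algorithm. As you can see, the cells have all kinds of odd shapes, so when I set a threshold on the distance transform for watershed, some cells are over-segmented and others are under-segmented.
What I have done in the past is to assess cell morphology. In my case, I looked at the area, circularity and convexity of the cells. I will edit my answer to include some calculations as potential guidance.
That may be tricky in my project, because I can find every possible shape. But I like the rationale. I will try to implement it and see how it goes. Thanks! Cheers!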
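
On the seed problem raised above (a single global threshold on the distance transform over-segments some cells and under-segments others), one commonly used alternative is to seed the watershed from the local maxima of the distance transform instead. The sketch below uses scikit-image and SciPy for this; it is a different technique from the one in the answer, and the function name `split_touching` and the `min_distance` value are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def split_touching(mask, min_distance=20):
    """Label a boolean mask, splitting touching objects along distance-map valleys.

    min_distance (hypothetical value) is the smallest allowed spacing,
    in pixels, between two seeds.
    """
    dist = ndi.distance_transform_edt(mask)
    blobs, _ = ndi.label(mask)
    # One seed per local maximum of the distance map inside each blob.
    peaks = peak_local_max(dist, min_distance=min_distance, labels=blobs)
    markers = np.zeros(mask.shape, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flood the inverted distance map from the seeds, restricted to the mask.
    return watershed(-dist, markers, mask=mask)

# Synthetic example: one disk, and two overlapping disks.
yy, xx = np.mgrid[:200, :300]
one = (xx - 100) ** 2 + (yy - 100) ** 2 <= 40 ** 2
two = one | ((xx - 170) ** 2 + (yy - 100) ** 2 <= 40 ** 2)
labels_one = split_touching(one)
labels_two = split_touching(two)
```

Here `min_distance` plays the role that `alpha` plays in the question: if it is too small, irregular cells are over-segmented, and if it is too large, small touching cells stay merged. Unlike `alpha`, it does not depend on the size of the largest cell in the image.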