Region Segmentation
What we will be doing on this blog?
We will segment an image into two regions ie foreground and background and then assign a unique label to each object in the foreground.
A region is a subset of an image. Region segmentation is grouping the pixels in an image into regions.
To segment an image into different regions we need to differentiate the regions. There can be many things on which the differentiation can be done. In this blog, we will use the pixel intensity as differentiating factor. For example, let's assume we have an 8-bit(pixel values can be from 0 to 255) grayscale image and we decide that every pixel above the intensity 50 belongs to the foreground and less than equal to 50 belongs to the background. In this way, we have created 2 different regions in the image ie the foreground and the background. This method of deciding the foreground and the background based on a threshold value is called thresholding.
Thresholding
There are three types of thresholding:
- Automatic Thresholding
- P-tile Thresholding (Manual thresholding)
- Choose a threshold value by looking at the image
P-tile Thresholding(Manual): If we know that one or more objects occupy a fixed area on the image, then we can choose a threshold T such that p percent of pixels would have gray level value < T.
Automatic Thresholding: In this method, the program automatically chooses one or more threshold values and segment the image based on those values.
For simplicity, in this post, we will be setting the threshold manually by looking at the image. Looking at the image directly may not give you an idea of what the threshold should be so what we can do is plot the histogram of the image and try to find the threshold from that. For example look at the image(image 1) below, you may not be able to see what the threshold should be.
Now see the histogram of this image. In the histogram, you can see that there are two peaks, one with a peak around 70 and the other with a peak around 180. These two peaks represent the two regions in the image. The background is dark so the peak corresponding to the background will be the one peak around 70 and the coins are light gray so the peak corresponding to it is around 180.
So, to pick a threshold we select a low value between these peaks. In this histogram, this value will be around 110.
Histogram: x-axis is the pixel intensity and the y-axis is the number of pixels at that intensity.
Segmenting the image into foreground and background
Now that we have got the threshold value next step is to convert the image into a binary image. A binary image is an image with only 2 pixel-intensities ie 0 and 1. Let us represent the background by 0 and the foreground by 1(this is not compulsory you can use the opposite of it as well). To convert the grayscale image to binary image, create a new zero array of the size of the original image, loop through all the pixels of the original image. If the intensity of the pixel is above the threshold set the corresponding pixel in the zero array to 1 and let it be zero otherwise. Look at the code below to understand better.
rows,columns = image.shape
binary_image = np.zeros((rows , columns))
for i in range(0, image_rows_len):
for j in range(0, image_columns_len):
if(image[i][j] > 110):
binary_image[i][j] = 1
Now that we have the image of two regions we can see that it has 10 components(10 coins). We can assign a unique label to each of these components using the component labeling algorithm. Before I tell you what this algorithm is there are a few terminologies that you should know.
The first one is Neighbors.
The following image shows the result of the labeling algorithm.
The connected component labeling algorithm is the following:
This algorithm uses the 4 neighbor definition of connected component ie two pixels share a boundary and both of them belong to the foreground then they are considered as connected. In other words, two pixels are connected if there a 4-path between them.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
def segment_region(img):
labels = img * 0
n, m = img.shape
eqivalenceList = []
# increment label number only if you assign a new label
label = 1
#looping through every pixel of the binary image
for i in range(0, n):
for j in range(0, m):
#we only run this algorithm on foreground pixels
if (img[i][j] == 1):
# check if row above exists
if (i > 0 and j > 0):
# we are not in the top row or leftmost column, label pixel if 1
if (img[i][j] == 1):
# check if the pixel above it has a label
if (labels[i - 1][j] != 0):
labels[i][j] = labels[i - 1][j]
else:
# pixel above doesn't have a label, check left neighbor pixel
if (labels[i][j - 1] != 0):
labels[i][j] = labels[i][j - 1]
else:
# it doesn't have a top or left neighboring labelled pixel so assign a new pixel
labels[i][j] = label
label += 1
# if both top or left neighboring pixels are there, we have to insert
# them in equivalence table
#this is the case when we are assigning label to current pixel and its top and left neighbor have different labels
#As all three of them would be connected in a 4 path they all should belong to same label(region)
if(labels[i - 1][j] != 0 and labels[i][j - 1] != 0 and labels[i][j - 1] != labels[i - 1][j]):
# check if this pair of values are not already in the list
inserted = False
for evalList in eqivalenceList:
if (labels[i - 1][j] in evalList):
evalList.append(labels[i][j - 1])
inserted = True
break
elif (labels[i][j - 1] in evalList):
evalList.append(labels[i - 1][j])
inserted = True
break
if (not inserted):
eqivalenceList.append([labels[i][j - 1], labels[i - 1][j]])
else:
# we are in the top row or the leftmost column
# check if we are in the top row. If we are in the top row and there is a pixel on the left
# that is labelled , then we can just copy the label.
if (i == 0 and j > 0):
if (labels[i][j - 1] != 0):
labels[i][j] = labels[i][j - 1]
# we are in the leftmost column
# check if we are in the leftmost column. If we are in the leftmost column and there is a pixel on the top
# that is labelled , then we can just copy the label.
if (i > 0 and j == 0):
if (labels[i - 1][j] != 0):
labels[i][j] = labels[i - 1][j]
# case 0,0, top most row , leftmost column
if (i == 0 and j == 0):
if (img[i][j] == 1):
labels[i][j] = label
label = label + 1
# length of equivalence list does not tell the number of regions. foreground that are one pixel in area
# or other regions in which only one label is set to all the pixels in the regions will not be in the
# equivalance list
#Right now each region may contain more than one labels. Here we set the minimum label that the region has to that region.
for evalList in eqivalenceList:
for i in range(0, n):
for j in range(0, m):
if (labels[i][j] in evalList):
labels[i][j] = min(evalList)
# to get the total number of regions see the number of different values in labels array
# background will be given label 0
diffLabels = []
for i in range(0, n):
for j in range(0, m):
if (labels[i][j] not in diffLabels):
diffLabels.append(labels[i][j])
return diffLabels, labels
rows,columns = image.shape
binary_image = np.zeros((rows , columns))
for i in range(0, image_rows_len):
for j in range(0, image_columns_len):
if(image[i][j] > 110):
binary_image[i][j] = 1
Now that we have the image of two regions we can see that it has 10 components(10 coins). We can assign a unique label to each of these components using the component labeling algorithm. Before I tell you what this algorithm is there are a few terminologies that you should know.
The first one is Neighbors.
The following image shows the result of the labeling algorithm.
The connected component labeling algorithm is the following:
This algorithm uses the 4 neighbor definition of connected component ie two pixels share a boundary and both of them belong to the foreground then they are considered as connected. In other words, two pixels are connected if there a 4-path between them.1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 | def segment_region(img): labels = img * 0 n, m = img.shape eqivalenceList = [] # increment label number only if you assign a new label label = 1 #looping through every pixel of the binary image for i in range(0, n): for j in range(0, m): #we only run this algorithm on foreground pixels if (img[i][j] == 1): # check if row above exists if (i > 0 and j > 0): # we are not in the top row or leftmost column, label pixel if 1 if (img[i][j] == 1): # check if the pixel above it has a label if (labels[i - 1][j] != 0): labels[i][j] = labels[i - 1][j] else: # pixel above doesn't have a label, check left neighbor pixel if (labels[i][j - 1] != 0): labels[i][j] = labels[i][j - 1] else: # it doesn't have a top or left neighboring labelled pixel so assign a new pixel labels[i][j] = label label += 1 # if both top or left neighboring pixels are there, we have to insert # them in equivalence table #this is the case when we are assigning label to current pixel and its top and left neighbor have different labels #As all three of them would be connected in a 4 path they all should belong to same label(region) if(labels[i - 1][j] != 0 and labels[i][j - 1] != 0 and labels[i][j - 1] != labels[i - 1][j]): # check if this pair of values are not already in the list inserted = False for evalList in eqivalenceList: if (labels[i - 1][j] in evalList): evalList.append(labels[i][j - 1]) inserted = True break elif (labels[i][j - 1] in evalList): evalList.append(labels[i - 1][j]) inserted = True break if (not inserted): eqivalenceList.append([labels[i][j - 1], labels[i - 1][j]]) else: # we are in the top row or the leftmost column # check if we are in the top row. If we are in the top row and there is a pixel on the left # that is labelled , then we can just copy the label. if (i == 0 and j > 0): if (labels[i][j - 1] != 0): labels[i][j] = labels[i][j - 1] # we are in the leftmost column # check if we are in the leftmost column. If we are in the leftmost column and there is a pixel on the top # that is labelled , then we can just copy the label. if (i > 0 and j == 0): if (labels[i - 1][j] != 0): labels[i][j] = labels[i - 1][j] # case 0,0, top most row , leftmost column if (i == 0 and j == 0): if (img[i][j] == 1): labels[i][j] = label label = label + 1 # length of equivalence list does not tell the number of regions. foreground that are one pixel in area # or other regions in which only one label is set to all the pixels in the regions will not be in the # equivalance list #Right now each region may contain more than one labels. Here we set the minimum label that the region has to that region. for evalList in eqivalenceList: for i in range(0, n): for j in range(0, m): if (labels[i][j] in evalList): labels[i][j] = min(evalList) # to get the total number of regions see the number of different values in labels array # background will be given label 0 diffLabels = [] for i in range(0, n): for j in range(0, m): if (labels[i][j] not in diffLabels): diffLabels.append(labels[i][j]) return diffLabels, labels |