Detecting Parallelogram in Images
The steps used in the detection of a parallelogram
- Load the raw image and convert it to grayscale using the formula
2. Plot the histogram to see what could be the threshold value for thresholding
3. Use canny edge detection to detect the edges
4. Threshold the image
5. Apply hough transform to get the accumulator and the points corresponding to
each accumulator array value.
6. Find every Two pairs of parallel lines and then find two more pairs of parallel lines
and find their intersection points.
7. Check if the intersection points are actually a part of the edge.
8. Using the intersection point found, draw the line on the original image to show the
Parallelogram.
1. Loading image and converting it to grayscale
Fig 1: The original color image(left) and the grayscale image(right)
In this part, we load the raw image and convert it into a grayscale image. The data format of the raw image is as follows:
The data arrangement for the interleaved raw image format is illustrated below, assuming an image of size M X N (rows X columns).
R(0,0) G(0,0) B(0,0) R(0,1) G(0,1) B(0,1) ……………………………………R(0,N-1) G(0,N-1) B(0,N-1)
R(1,0) G(1,0) B(1,0) R(1,1) G(1,1) B(1,1) ……………………………………R(1,N-1) G(1,N-1) B(1,N-1)
When we load the raw image, it is thrice as wide as its grayscale image would be. To convert it to grayscale we loop over the raw image and multiply the R,G,B with their respective ration in luminance formula.
for i in range(0,rows):
for j in range(0,columns):
img_gray_man[i][j] = (0.3*img_color[i][j*3] +0.59*img_color[i][j*3+1] +0.11*img_color[i][j*3+2])
The histogram of an image can be used to get an idea of what the threshold value should be. The point where we see a deep valley between two peaks in the histogram can be a value of the threshold. Instead of a manual threshold, automatic thresholding can also be used to get the threshold value but during this project, I found that the automatic thresholding technique which uses peakiness to get the threshold value doesn’t perform well in all the cases. Peakiness depends on finding a valley between two peaks in the histogram of an image but if the histogram is a convex shape, i.e. it has a single peak the peakiness method doesn’t perform well.
Fig 2: Histogram of image 1.
In Fig 2 we see that the threshold value can be anything in the middle of these peaks.
In this part, we perform the Canny edge detection to detect the edges in the image. The Canny edge detection consists of 4 steps which are:
● Use gaussian smoothing to smooth the image. In this step, we apply a Gaussian filter to the image to remove noise.
● Apply the edge operator to the image. This will give back the image with thick edges.
● Quantize the gradient angle of the edges into three sectors.
● Apply non-maxima suppression to the image with thick edges to get thin edges. This is done with the help of the gradient angles found in the previous step. We take the gradient of a pixel location and see the value of the gradient along the sector of its gradient angle. If it is less than any of the two pixels along that direction we set its value to zero.
The formula to calculate the gradient and the gradient angles are as follows:
Fig 4: Value of Sectors depending on gradient angle
g = create_guassian_mask([3,3] , 0.0 , 1)
img_after_gaussian_mask = (apply_map(image , g , False) / sum(sum(g)))
#sobels operator returns the gradient magnitude array and the gradient direction array
image_after_sobels_operator , theta = edge_operator(img_after_gaussian_mask , "sobels")
quantized_theta = quantize_gradient_angle_to_secotrs(theta)
image_after_non_maxima_suppression = non_maxima_suppression(image_after_sobels_operator , quantized_theta)
return image_after_non_maxima_suppression
This function is in the edgedetction.py file and all the functions it calls are also in the same file. These files get called by the houghtransform.py file. All these files can be found on my Github. It first creates a gaussian mask ‘g’ which is then applied to the image to smooth it and remove noise. Then we use the Sobels edge operator to get the gradient magnitudes and gradient angles. These gradient angles are then quantized by the function quantize_gradient_angle_to_secotrs.
Fig 5: Image after canny edge detection
4. Threshold Image
In this part we threshold the image. We use the threshold value we found above to convert the grayscale image to binary image. This is based on simple comparison, if the value of the pixel is above threshold set it to 255 and if it is below set it to 0( this will be reversed depending on if you want the background to be black and foreground to be white or vice versa). After you threshold the image you will get a binary image on which we can perform the hough transform.
Fig 6: Image after thresholding
5. Apply hough transform
In this step we apply the hough transform to get the accumulator and the positions of the pixels that correspond to incrementing a particular location in the accumulator.
The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing.[1] The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform.
The simplest case of Hough transform is detecting straight lines. In general, the straight line y = mx + b can be represented as a point (b, m) in the parameter space. However, vertical lines pose a problem. They would give rise to unbounded values of the slope parameter m.
Hough Transform Algorithm (for straight lines)
- Quantize the parameter space appropriately
- Assume that each cell in the parameter space is an accumulator. Initialize all cells to zero
- For each point (x,y) in the image space increment by 1 each of the accumulators that satisfy the equation
- Maxima in the accumulator array correspond to the parameters of model instances
Hough transform implementation:
- First I create the theta array depending on the step size using the following code:
def get_theta_list(step_size):
if 360 % step_size != 0:
print("step_size not a factor of 180")
no_of_steps = int(round(360/step_size))
return np.linspace(0 , 360 ,no_of_steps )+(round(step_size/2))
- Then for all the points in the binary image that are foreground( 0 or 1 depending on how the foreground is represented) we increment the value of the accumulator array for all the values of theta in theta array that we got in previous step. We also store the location of the points that incremented a particular position in accumulator array in the dictionary accumulator_positions The key of this dictionary is the p or r and the theta and the value is the list of points. The code to create the accumulator array is:
def increment_accumulator(image , thetas , theta_step_size = step_size_theta, p_step_size = step_size_p):
accumulator_positions = {}
n,m = image.shape
accumulator = create_accumulator(len(thetas) , p_step_size)
accumulator_rows , accumulator_columns = accumulator.shape
# this line get the index of all the points that are non zero in the binary image
y_idxs, x_idxs = np.nonzero(image)
for z in range(len(y_idxs)):
i = y_idxs[z]
j = x_idxs[z]
for theta in thetas:
theta_position_in_accumulator = int(round((theta)/step_size_theta))-1
result = int(round((calculatep(theta , j , i)+diag_len)/step_size_p))
accumulator[result][theta_position_in_accumulator] += 1
#check if the point is already at this loc in dictionary
if((result,theta_position_in_accumulator) in accumulator_positions.keys()):
accumulator_positions[result,theta_position_in_accumulator].append((i,j))
else:
accumulator_positions[result,theta_position_in_accumulator] = [(i,j)]
return accumulator , accumulator_positions
- Once we get the accumulator array we threshold it to remove all the points that may not correspond to a straight line.
- After thresholding we find the points of intersection of parallel lines to get the corners of the parallelogram. To do this first we find two pair of parallel lines from the accumulator_pos dictionary and check if these parallel lines are some distance apart to prevent detecting very narrow parallelograms which may be of the same edge. Then we find two more parallel lines using the same dictionary and check if they are some distance apart. Now we have 4 lines which may form a parallelogram.
- To check if the 4 lines actually form a parallelogram we use the equation of these four lines and find the four intersection points. Then we check if there are enough points in between these corner points to say if they actually for a parallelogram. To check this we can find the distance between the corner points and then check if the ratio of the distance and the number points between these corner point is above some threshold.
- Once we have the corner points of the parallelogram we can draw it over the original image using the cv2.line function.
Fig 6: Image if the accumulator array(X-axis: theta , Y-axis:p)
Fig 7: Image after thresholding of accumulator array
No comments:
Post a Comment