OpenCV-Python is a powerful library for computer vision tasks, offering a wide range of functionalities for image and video processing. Its ease of use and extensive documentation make it a popular choice for both beginners and experienced developers. OpenCV-Python is widely used in various fields such as robotics, real-time object detection, and augmented reality, making it a valuable skill for many technical roles.
This article provides a curated selection of interview questions designed to test your knowledge and proficiency with OpenCV-Python. By working through these questions and their detailed answers, you will gain a deeper understanding of key concepts and be better prepared to demonstrate your expertise in interviews.
OpenCV-Python Interview Questions and Answers
1. How would you resize an image to half its original size and rotate it by 45 degrees? Provide the code.
To resize an image to half its original size and rotate it by 45 degrees using OpenCV-Python, use the cv2.resize and cv2.getRotationMatrix2D functions. First, resize the image by specifying new dimensions that are half of the original. Then, create a rotation matrix with cv2.getRotationMatrix2D and apply it to the resized image with cv2.warpAffine.
import cv2

# Load the image
image = cv2.imread('path_to_image.jpg')

# Get the dimensions of the image
height, width = image.shape[:2]

# Resize the image to half its original size
resized_image = cv2.resize(image, (width // 2, height // 2))

# Get the center of the resized image
center = (width // 4, height // 4)

# Get the rotation matrix for rotating the image by 45 degrees
rotation_matrix = cv2.getRotationMatrix2D(center, 45, 1.0)

# Rotate the resized image
rotated_image = cv2.warpAffine(resized_image, rotation_matrix, (width // 2, height // 2))

# Save or display the image
cv2.imwrite('resized_rotated_image.jpg', rotated_image)
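Note that rotating by 45 degrees inside the original canvas clips the corners. A minimal sketch of one common workaround, expanding the output canvas so the whole rotated image fits (the file names are placeholders):

import cv2
import numpy as np

image = cv2.imread('path_to_image.jpg')  # placeholder path
(h, w) = image.shape[:2]
center = (w / 2, h / 2)

# Rotation matrix for 45 degrees, no scaling
M = cv2.getRotationMatrix2D(center, 45, 1.0)

# Size of the bounding box that holds the rotated image
cos, sin = abs(M[0, 0]), abs(M[0, 1])
new_w = int(h * sin + w * cos)
new_h = int(h * cos + w * sin)

# Shift the rotation center so the result stays inside the new canvas
M[0, 2] += (new_w / 2) - center[0]
M[1, 2] += (new_h / 2) - center[1]

rotated = cv2.warpAffine(image, M, (new_w, new_h))
cv2.imwrite('rotated_full.jpg', rotated)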
2. Implement the Canny edge detection algorithm on an image and explain the parameters involved.
The Canny edge detection algorithm is a multi-stage process used to detect edges in images. It involves noise reduction, gradient calculation, non-maximum suppression, and edge tracking by hysteresis. The key parameters are:
- Threshold1 and Threshold2: Lower and upper thresholds for the hysteresis procedure. Edges with intensity gradient values below Threshold1 are discarded, and those above Threshold2 are considered strong edges. Edges with gradient values between the two thresholds are classified based on their connectivity to strong edges.
- Aperture Size: Size of the Sobel kernel used for finding image gradients. It must be an odd number (e.g., 3, 5, 7).
- L2gradient: A boolean parameter that specifies whether to use the more accurate L2 norm (True) or the default L1 norm (False) for gradient magnitude calculation.
Example of implementing the Canny edge detection algorithm using OpenCV in Python:
import cv2

# Load the image in grayscale
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply GaussianBlur to reduce noise
blurred_image = cv2.GaussianBlur(image, (5, 5), 1.4)

# Apply Canny edge detection
edges = cv2.Canny(blurred_image, threshold1=50, threshold2=150, apertureSize=3, L2gradient=True)

# Display the result
cv2.imshow('Canny Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()
3. Write a script to find and draw contours around objects in a binary image.
To find and draw contours around objects in a binary image using OpenCV-Python, use the cv2.findContours and cv2.drawContours functions. cv2.findContours retrieves contours from the binary image, while cv2.drawContours draws them onto an image.
Example:
import cv2

# Load the binary image
image = cv2.imread('binary_image.png', cv2.IMREAD_GRAYSCALE)

# Find the external contours
contours, _ = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Convert to BGR so the contours are visible in color
output = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)

# Draw all contours in green with a thickness of 2
cv2.drawContours(output, contours, -1, (0, 255, 0), 2)

# Save or display the result
cv2.imwrite('contours_image.png', output)
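In practice the input is often not binary yet. A minimal sketch, assuming a grayscale source image ('gray_source.png' is a placeholder), of thresholding before contour extraction:

import cv2

# Load a grayscale image (placeholder file name)
gray = cv2.imread('gray_source.png', cv2.IMREAD_GRAYSCALE)

# Binarize with Otsu's method so findContours gets a clean binary input
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f'Found {len(contours)} contours')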
4. Perform template matching to find a small object within a larger image. Provide the code.
Template matching is a technique in OpenCV used to find the location of a template image within a larger image. It involves sliding the template over the larger image and comparing the template with the patch of the image beneath it. The result is a grayscale map in which each pixel denotes how well the neighborhood of that pixel matches the template.
Example:
import cv2
import numpy as np

# Load the large image in color (for drawing) and the template in grayscale
large_image = cv2.imread('large_image.jpg')
gray_image = cv2.cvtColor(large_image, cv2.COLOR_BGR2GRAY)
template = cv2.imread('template.jpg', cv2.IMREAD_GRAYSCALE)

# Get the width and height of the template
w, h = template.shape[::-1]

# Perform template matching
result = cv2.matchTemplate(gray_image, template, cv2.TM_CCOEFF_NORMED)

# Keep every location whose score exceeds the threshold
threshold = 0.8
loc = np.where(result >= threshold)

# Draw a rectangle around each matched region
for pt in zip(*loc[::-1]):
    cv2.rectangle(large_image, pt, (pt[0] + w, pt[1] + h), (0, 255, 0), 2)

# Save the result
cv2.imwrite('result.jpg', large_image)
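When only the single best match is needed, cv2.minMaxLoc on the result map is simpler than thresholding. A minimal sketch, assuming the same placeholder file names as above:

import cv2

large_image = cv2.imread('large_image.jpg', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('template.jpg', cv2.IMREAD_GRAYSCALE)
h, w = template.shape

result = cv2.matchTemplate(large_image, template, cv2.TM_CCOEFF_NORMED)

# For TM_CCOEFF_NORMED the best match is the global maximum
_, max_val, _, max_loc = cv2.minMaxLoc(result)
print(f'Best match at {max_loc} with score {max_val:.3f}')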
5. Describe the steps involved in camera calibration and provide a code example for calibrating a camera using a set of chessboard images.
Camera calibration is a process used to determine the parameters of a camera to correct for lens distortion and to understand the camera’s intrinsic and extrinsic properties. The steps involved in camera calibration using OpenCV-Python are:
- Capture multiple images of a known calibration pattern (e.g., a chessboard) from different angles.
- Detect the corners of the chessboard in each image.
- Use the detected corners to compute the camera matrix, distortion coefficients, rotation, and translation vectors.
- Use these parameters to undistort images taken with the camera.
Example for calibrating a camera using a set of chessboard images:
import cv2
import numpy as np
import glob

# Define the chessboard size (inner corners per row and column)
chessboard_size = (9, 6)

# Prepare object points: (0,0,0), (1,0,0), (2,0,0), ..., (8,5,0)
objp = np.zeros((chessboard_size[0] * chessboard_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:chessboard_size[0], 0:chessboard_size[1]].T.reshape(-1, 2)

# Arrays to store object points and image points from all images
objpoints = []
imgpoints = []

# Load the calibration images
images = glob.glob('chessboard_images/*.jpg')

for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, chessboard_size, None)

    if ret:
        objpoints.append(objp)
        imgpoints.append(corners)

        # Draw and display the corners
        cv2.drawChessboardCorners(img, chessboard_size, corners, ret)
        cv2.imshow('img', img)
        cv2.waitKey(500)

cv2.destroyAllWindows()

# Camera calibration
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# Save the camera calibration result
np.savez('camera_calibration.npz', camera_matrix=camera_matrix,
         dist_coeffs=dist_coeffs, rvecs=rvecs, tvecs=tvecs)
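The final step listed above, undistorting new images with the computed parameters, is not shown in the example. A minimal sketch, assuming the calibration file saved above and a placeholder test image:

import cv2
import numpy as np

# Load the calibration result saved above
data = np.load('camera_calibration.npz')
camera_matrix, dist_coeffs = data['camera_matrix'], data['dist_coeffs']

# Undistort a new image taken with the same camera (placeholder file name)
img = cv2.imread('test_image.jpg')
h, w = img.shape[:2]

# Refine the camera matrix for this image size, then undistort
new_matrix, roi = cv2.getOptimalNewCameraMatrix(camera_matrix, dist_coeffs, (w, h), 1, (w, h))
undistorted = cv2.undistort(img, camera_matrix, dist_coeffs, None, new_matrix)

# Crop to the valid region of interest
x, y, rw, rh = roi
cv2.imwrite('undistorted.jpg', undistorted[y:y + rh, x:x + rw])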
6. Write a script to capture video from a webcam, convert each frame to grayscale, and display it in real-time.
To capture video from a webcam, convert each frame to grayscale, and display it in real-time using OpenCV in Python, follow these steps:
- Import the necessary libraries.
- Initialize the webcam.
- Capture frames in a loop.
- Convert each frame to grayscale.
- Display the grayscale frame.
- Release the webcam and close the display window when done.
Example:
import cv2

# Initialize the webcam
cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Display the resulting frame
    cv2.imshow('Grayscale Video', gray)

    # Break the loop on 'q' key press
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close windows
cap.release()
cv2.destroyAllWindows()
7. Use a pre-trained deep learning model (e.g., a face detector) with OpenCV to detect faces in an image. Provide the code.
To use a pre-trained deep learning model with OpenCV for face detection, utilize models like the Caffe-based ResNet model provided by OpenCV’s DNN module. This involves loading the model, reading the image, and then detecting faces.
import cv2
import numpy as np

# Load the pre-trained model
modelFile = "res10_300x300_ssd_iter_140000.caffemodel"
configFile = "deploy.prototxt"
net = cv2.dnn.readNetFromCaffe(configFile, modelFile)

# Read the image
image = cv2.imread("example.jpg")
(h, w) = image.shape[:2]

# Prepare the image for the model
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0, (300, 300), (104.0, 177.0, 123.0))

# Perform face detection
net.setInput(blob)
detections = net.forward()

# Draw bounding boxes around detected faces
for i in range(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

# Display the output
cv2.imshow("Output", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
8. Implement object detection using the YOLO algorithm. Provide the code.
YOLO (You Only Look Once) is a popular object detection algorithm known for its speed and accuracy. It divides the image into a grid and predicts bounding boxes and probabilities for each grid cell. YOLO is efficient because it makes predictions with a single forward pass through the network.
Example of implementing object detection using the YOLO algorithm in OpenCV-Python:
import cv2
import numpy as np

# Load YOLO
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
output_layers = net.getUnconnectedOutLayersNames()

# Load image
img = cv2.imread("image.jpg")
height, width, channels = img.shape

# Prepare the image for YOLO
blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)
outs = net.forward(output_layers)

# Process the output
class_ids = []
confidences = []
boxes = []
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# Apply Non-Max Suppression
indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

# Draw bounding boxes (flatten handles both old and new OpenCV index formats)
for i in np.array(indices).flatten():
    x, y, w, h = boxes[i]
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

# Display the image
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
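The script above collects class_ids but never uses them to label the boxes. A short continuation sketch of the usual labeling step, assuming the standard coco.names class list that accompanies the YOLOv3 weights (it reuses img, boxes, indices, class_ids, and confidences from the script above):

# Load the class names (assumes the standard coco.names file)
with open("coco.names") as f:
    classes = [line.strip() for line in f]

# Label each box that survived non-max suppression
for i in np.array(indices).flatten():
    x, y, w, h = boxes[i]
    label = "{}: {:.2f}".format(classes[class_ids[i]], confidences[i])
    cv2.putText(img, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)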
9. Calculate optical flow between two consecutive frames in a video. Provide the code.
Optical flow is a technique used in computer vision to determine the motion of objects between two consecutive frames in a video. It is useful for applications such as motion detection, object tracking, and video compression. OpenCV provides built-in functions to calculate optical flow.
Example of calculating optical flow between two consecutive frames using OpenCV in Python:
import cv2
import numpy as np

# Load two consecutive frames
frame1 = cv2.imread('frame1.jpg')
frame2 = cv2.imread('frame2.jpg')

# Convert frames to grayscale
gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

# Calculate dense optical flow using the Farneback method
flow = cv2.calcOpticalFlowFarneback(gray1, gray2, None, 0.5, 3, 15, 3, 5, 1.2, 0)

# Visualize the flow: hue encodes direction, value encodes magnitude
hsv = np.zeros_like(frame1)
hsv[..., 1] = 255
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv[..., 0] = angle * 180 / np.pi / 2
hsv[..., 2] = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX)
rgb_flow = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

cv2.imshow('Optical Flow', rgb_flow)
cv2.waitKey(0)
cv2.destroyAllWindows()
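Farneback computes dense flow for every pixel; for tracking a handful of points, sparse Lucas-Kanade flow is the usual alternative. A minimal sketch, assuming the same two placeholder frames:

import cv2

gray1 = cv2.imread('frame1.jpg', cv2.IMREAD_GRAYSCALE)
gray2 = cv2.imread('frame2.jpg', cv2.IMREAD_GRAYSCALE)

# Pick good corner features to track in the first frame
p0 = cv2.goodFeaturesToTrack(gray1, maxCorners=100, qualityLevel=0.3, minDistance=7)

# Track them into the second frame with pyramidal Lucas-Kanade
p1, status, err = cv2.calcOpticalFlowPyrLK(gray1, gray2, p0, None)

# Keep only the points that were tracked successfully
good_new = p1[status.flatten() == 1]
good_old = p0[status.flatten() == 1]
print(f'Tracked {len(good_new)} of {len(p0)} points')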
10. Implement face recognition using a deep learning model. Provide the code.
To implement face recognition using a deep learning model with OpenCV-Python, use pre-trained models such as those provided by OpenCV’s DNN module. The process involves loading the pre-trained model, detecting faces in an image, and then recognizing those faces.
Example:
import cv2
import numpy as np

# Load the pre-trained deep learning model for face detection
face_net = cv2.dnn.readNetFromCaffe('deploy.prototxt', 'res10_300x300_ssd_iter_140000.caffemodel')

# Load the pre-trained deep learning model for face embeddings
recognition_net = cv2.dnn.readNetFromTorch('openface_nn4.small2.v1.t7')

# Read the input image
image = cv2.imread('input.jpg')
(h, w) = image.shape[:2]

# Prepare the image for face detection
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0, (300, 300), (104.0, 177.0, 123.0))
face_net.setInput(blob)
detections = face_net.forward()

# Loop over the detections
for i in range(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]

    # Filter out weak detections
    if confidence > 0.5:
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # Extract the face ROI (rows are y, columns are x)
        face = image[startY:endY, startX:endX]
        face_blob = cv2.dnn.blobFromImage(face, 1.0 / 255, (96, 96), (0, 0, 0), swapRB=True, crop=False)
        recognition_net.setInput(face_blob)
        vec = recognition_net.forward()

        # Here, you would compare 'vec' with known face embeddings to recognize the face
        # For simplicity, this example does not include the comparison step

        # Draw the bounding box of the face along with the associated probability
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 0, 255), 2)
        cv2.putText(image, text, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)

# Display the output image
cv2.imshow("Output", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
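The comparison step omitted above is typically a nearest-neighbor search over stored embeddings. A minimal sketch, assuming a hypothetical known_embeddings dict mapping names to 128-dimensional vectors produced by the same OpenFace model:

import numpy as np

def recognize(vec, known_embeddings, threshold=0.6):
    """Return the name of the closest known embedding, or 'unknown'.

    known_embeddings is a hypothetical dict {name: 128-d numpy vector}
    built by running the same recognition_net over reference images.
    """
    best_name, best_dist = 'unknown', threshold
    for name, known_vec in known_embeddings.items():
        # Euclidean distance between the two embeddings
        dist = np.linalg.norm(vec.flatten() - known_vec.flatten())
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name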