15 Collections Interview Questions and Answers
Prepare for your next technical interview with our guide on collections, featuring common questions and in-depth answers to enhance your understanding.
Prepare for your next technical interview with our guide on collections, featuring common questions and in-depth answers to enhance your understanding.
Collections are fundamental to efficient data management and manipulation in programming. They provide structured ways to store, organize, and access data, making them indispensable in various applications, from simple scripts to complex systems. Understanding collections and their appropriate use cases can significantly enhance your problem-solving capabilities and optimize your code’s performance.
This article offers a curated selection of interview questions focused on collections. By exploring these questions and their detailed answers, you will gain a deeper understanding of how to effectively utilize collections in your projects, thereby improving your readiness for technical interviews and practical coding challenges.
The List, Set, and Map interfaces in collections serve different purposes:
Fail-fast iterators operate directly on the collection and throw a ConcurrentModificationException
if modified during iteration, except through the iterator’s remove
method. This ensures data consistency.
Fail-safe iterators work on a clone of the collection, allowing modifications without exceptions, but may lead to outdated data.
A HashMap stores key-value pairs using an array of buckets. Each bucket is a linked list holding entries that hash to the same index. When adding a key-value pair, the key is hashed to determine the bucket index. If the bucket is empty, the entry is added; otherwise, a collision resolution strategy like chaining is used. Retrieval involves hashing the key to find the bucket and traversing the linked list for the matching entry.
LinkedList and ArrayList have different characteristics:
LinkedList:
ArrayList:
To merge two sorted lists, use a two-pointer technique to iterate through both lists, comparing elements and appending the smaller one to the result list.
Example:
def merge_sorted_lists(list1, list2): merged_list = [] i, j = 0, 0 while i < len(list1) and j < len(list2): if list1[i] < list2[j]: merged_list.append(list1[i]) i += 1 else: merged_list.append(list2[j]) j += 1 merged_list.extend(list1[i:]) merged_list.extend(list2[j:]) return merged_list list1 = [1, 3, 5, 7] list2 = [2, 4, 6, 8] print(merge_sorted_lists(list1, list2)) # Output: [1, 2, 3, 4, 5, 6, 7, 8]
HashMap:
ConcurrentHashMap:
An LRU (Least Recently Used) cache discards the least recently used items first when reaching capacity. It can be implemented using Python’s OrderedDict
to maintain insertion order and allow efficient reordering.
from collections import OrderedDict class LRUCache: def __init__(self, capacity: int): self.cache = OrderedDict() self.capacity = capacity def get(self, key: int) -> int: if key not in self.cache: return -1 else: self.cache.move_to_end(key) return self.cache[key] def put(self, key: int, value: int) -> None: if key in self.cache: self.cache.move_to_end(key) self.cache[key] = value if len(self.cache) > self.capacity: self.cache.popitem(last=False) # Example usage: lru_cache = LRUCache(2) lru_cache.put(1, 1) lru_cache.put(2, 2) print(lru_cache.get(1)) # returns 1 lru_cache.put(3, 3) # evicts key 2 print(lru_cache.get(2)) # returns -1 (not found) lru_cache.put(4, 4) # evicts key 1 print(lru_cache.get(1)) # returns -1 (not found) print(lru_cache.get(3)) # returns 3 print(lru_cache.get(4)) # returns 4
In Java, Comparable allows an object to define its natural ordering by implementing the compareTo
method. Comparator provides an external comparison strategy by implementing the compare
method, allowing multiple ways to compare objects.
Example:
import java.util.*; class Person implements Comparable<Person> { String name; int age; Person(String name, int age) { this.name = name; this.age = age; } @Override public int compareTo(Person other) { return this.age - other.age; // Natural ordering by age } @Override public String toString() { return name + " (" + age + ")"; } } class NameComparator implements Comparator<Person> { @Override public int compare(Person p1, Person p2) { return p1.name.compareTo(p2.name); // External ordering by name } } public class Main { public static void main(String[] args) { List<Person> people = new ArrayList<>(); people.add(new Person("Alice", 30)); people.add(new Person("Bob", 25)); people.add(new Person("Charlie", 35)); Collections.sort(people); // Sort by age (Comparable) System.out.println("Sorted by age: " + people); Collections.sort(people, new NameComparator()); // Sort by name (Comparator) System.out.println("Sorted by name: " + people); } }
Handling large datasets that do not fit into memory can be managed using iterators and generators to process data in chunks, reducing memory usage. The itertools
module provides functions for efficient looping, such as itertools.islice
for processing large files in chunks.
import itertools def process_large_file(file_path, chunk_size): with open(file_path, 'r') as file: while True: lines = list(itertools.islice(file, chunk_size)) if not lines: break for line in lines: process_line(line) def process_line(line): pass
The collections.deque
can maintain a fixed-size buffer of recent items, useful for streaming data.
from collections import deque def process_streaming_data(data_stream, buffer_size): buffer = deque(maxlen=buffer_size) for data in data_stream: buffer.append(data) process_buffer(buffer) def process_buffer(buffer): pass
Dijkstra’s algorithm finds the shortest path in a weighted graph by iteratively selecting the node with the smallest tentative distance, updating neighbors’ distances, and marking nodes as visited.
import heapq def dijkstra(graph, start): priority_queue = [(0, start)] distances = {node: float('inf') for node in graph} distances[start] = 0 while priority_queue: current_distance, current_node = heapq.heappop(priority_queue) if current_distance > distances[current_node]: continue for neighbor, weight in graph[current_node].items(): distance = current_distance + weight if distance < distances[neighbor]: distances[neighbor] = distance heapq.heappush(priority_queue, (distance, neighbor)) return distances graph = { 'A': {'B': 1, 'C': 4}, 'B': {'A': 1, 'C': 2, 'D': 5}, 'C': {'A': 4, 'B': 2, 'D': 1}, 'D': {'B': 5, 'C': 1} } print(dijkstra(graph, 'A')) # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
To manage and query a large number of user sessions, use dictionaries for average O(1) time complexity in insertions, deletions, and lookups, and sets for quick existence checks.
from collections import defaultdict class SessionManager: def __init__(self): self.sessions = {} self.user_sessions = defaultdict(set) def create_session(self, user_id, session_id): self.sessions[session_id] = user_id self.user_sessions[user_id].add(session_id) def end_session(self, session_id): user_id = self.sessions.pop(session_id, None) if user_id: self.user_sessions[user_id].discard(session_id) def get_user_sessions(self, user_id): return self.user_sessions[user_id] # Example usage manager = SessionManager() manager.create_session('user1', 'session1') manager.create_session('user1', 'session2') manager.create_session('user2', 'session3') print(manager.get_user_sessions('user1')) # Output: {'session1', 'session2'} manager.end_session('session1') print(manager.get_user_sessions('user1')) # Output: {'session2'}
Different types of queues serve various purposes:
Weak references in Python allow referencing objects without preventing their garbage collection, useful in caching and memory-sensitive applications. The weakref
module provides tools for creating weak references, such as WeakValueDictionary
.
Example:
import weakref class MyClass: def __init__(self, value): self.value = value obj = MyClass(10) weak_dict = weakref.WeakValueDictionary() weak_dict['key'] = obj print(weak_dict['key'].value) # Output: 10 del obj print('key' in weak_dict) # Output: False
To ensure immutability of a collection, use immutable data types like tuples and frozensets. Tuples are immutable sequences, and frozensets are immutable sets.
Example:
immutable_tuple = (1, 2, 3) # immutable_tuple[0] = 4 # Raises TypeError immutable_frozenset = frozenset([1, 2, 3]) # immutable_frozenset.add(4) # Raises AttributeError
A priority queue can be implemented using a heap, providing O(log n) time complexity for insertion and deletion. The heapq
module in Python facilitates this implementation.
Example:
import heapq class PriorityQueue: def __init__(self): self.heap = [] def push(self, item, priority): heapq.heappush(self.heap, (priority, item)) def pop(self): return heapq.heappop(self.heap)[1] pq = PriorityQueue() pq.push("task1", 2) pq.push("task2", 1) pq.push("task3", 3) print(pq.pop()) # Output: task2 print(pq.pop()) # Output: task1 print(pq.pop()) # Output: task3