15 SQL Join Query Interview Questions and Answers
Prepare for your technical interview with our guide on SQL join queries. Enhance your data manipulation skills and boost your SQL proficiency.
Prepare for your technical interview with our guide on SQL join queries. Enhance your data manipulation skills and boost your SQL proficiency.
SQL join queries are fundamental for anyone working with relational databases. They enable the combination of data from multiple tables, allowing for more complex and insightful data analysis. Mastering SQL joins is crucial for tasks ranging from simple data retrieval to intricate data warehousing and business intelligence operations. Understanding the various types of joins and their applications can significantly enhance your ability to manipulate and analyze data effectively.
This article offers a curated selection of SQL join query questions designed to help you prepare for technical interviews. By working through these examples, you’ll gain a deeper understanding of how to construct and optimize join queries, ensuring you are well-prepared to demonstrate your SQL proficiency.
In SQL, joins are used to combine rows from two or more tables based on a related column. There are several types of joins, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. The INNER JOIN returns only rows with matching values in both tables.
Here is an example of joining two tables and selecting specific columns:
SELECT a.column1, a.column2, b.column3, b.column4 FROM table1 a INNER JOIN table2 b ON a.common_column = b.common_column;
In this example, table1
and table2
are joined using the INNER JOIN
keyword, combining rows where common_column
matches. The SELECT
statement specifies the columns to retrieve.
An inner join combines rows from two tables based on a related column, returning only rows with matching values. This is useful for retrieving related data across multiple tables.
Example:
Suppose we have employees
and departments
tables. The employees
table contains employee details, and the departments
table contains department details. Both have a common column department_id
.
SELECT employees.employee_id, employees.employee_name, departments.department_name FROM employees INNER JOIN departments ON employees.department_id = departments.department_id;
This query selects employee_id
and employee_name
from employees
and department_name
from departments
, combining rows where department_id
matches.
A left outer join includes all records from the left table and matching records from the right table. If there is no match, the result is NULL for the right table.
Example:
SELECT a.*, b.* FROM left_table a LEFT OUTER JOIN right_table b ON a.id = b.id;
This query includes all records from left_table
and matches from right_table
based on a.id = b.id
. If no match exists, NULL values appear for right_table
columns.
A self join is a regular join where a table is joined with itself. This is useful for finding related records within the same table, such as employees and their managers.
Example:
SELECT e1.employee_id, e1.employee_name, e2.employee_name AS manager_name FROM employees e1 JOIN employees e2 ON e1.manager_id = e2.employee_id;
This query joins the employees
table with itself using manager_id
to find each employee’s manager, showing the employee’s ID, name, and manager’s name.
A cross join combines all rows from two tables, resulting in a Cartesian product. This generates combinations of data from two tables.
Example:
SELECT * FROM table1 CROSS JOIN table2;
Every row from table1
is combined with every row from table2
, creating a Cartesian product.
When joining tables using multiple conditions, specify these in the ON clause. This filters rows based on multiple criteria.
Example:
SELECT a.column1, a.column2, b.column3, b.column4 FROM table1 a JOIN table2 b ON a.common_column1 = b.common_column1 AND a.common_column2 = b.common_column2;
table1
and table2
are joined based on two conditions: common_column1
and common_column2
. The result includes rows where both conditions are met.
A full outer join returns all records when there is a match in either table. It combines results of both left and right outer joins. This is useful for retrieving all records from both tables, even if some do not have corresponding entries.
Example:
SELECT A.id, A.name, B.address FROM TableA A FULL OUTER JOIN TableB B ON A.id = B.id;
TableA
and TableB
are joined on id
. The result includes all records from both tables, with NULL for non-matching records.
To combine an inner join and a left outer join in a single statement:
SELECT a.*, b.*, c.* FROM tableA a INNER JOIN tableB b ON a.id = b.a_id LEFT OUTER JOIN tableC c ON a.id = c.a_id;
tableA
is joined with tableB
using an inner join and with tableC
using a left outer join, ensuring all rows from tableA
are included, along with matches from tableB
and tableC
.
A subquery is a query nested inside another query. When combined with a join, subqueries can filter data or perform calculations before joining with other tables.
Example:
SELECT e.employee_id, e.employee_name, d.department_name FROM employees e JOIN ( SELECT department_id, department_name FROM departments WHERE location_id = 100 ) d ON e.department_id = d.department_id;
The subquery selects departments in a specific location. The main query joins this result with employees
to get employee details and department names.
When joining three tables, specify join conditions to ensure correct row combinations.
Example:
SELECT a.column1 AS alias1, b.column2 AS alias2, c.column3 AS alias3 FROM table1 a INNER JOIN table2 b ON a.common_column = b.common_column INNER JOIN table3 c ON b.common_column = c.common_column;
table1
, table2
, and table3
are joined using INNER JOINs based on common columns. The query selects specific columns from each table with aliases for clarity.
An anti join finds records in one table without corresponding records in another. This is typically achieved using a LEFT JOIN with a WHERE clause.
Example:
SELECT a.* FROM TableA a LEFT JOIN TableB b ON a.id = b.id WHERE b.id IS NULL;
This query selects records from TableA
without matches in TableB
.
A semi join returns rows from one table where matches are found in another. Unlike an inner join, it does not return columns from the second table.
Example:
SELECT a.* FROM TableA a WHERE EXISTS ( SELECT 1 FROM TableB b WHERE a.id = b.id );
This query selects all columns from TableA
where there is at least one corresponding record in TableB
.
Recursive joins traverse hierarchical data structures, such as organizational charts. This is typically achieved using a Common Table Expression (CTE) with a recursive query.
Example:
WITH RECURSIVE EmployeeHierarchy AS ( SELECT EmployeeID, ManagerID, EmployeeName FROM Employees WHERE ManagerID IS NULL UNION ALL SELECT e.EmployeeID, e.ManagerID, e.EmployeeName FROM Employees e INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID ) SELECT * FROM EmployeeHierarchy;
The CTE EmployeeHierarchy
is defined with an anchor member and a recursive member. The anchor selects top-level employees, and the recursive member joins Employees
with EmployeeHierarchy
to find all employees reporting to the current level. This continues until all hierarchy levels are traversed.
To solve a real-world problem involving multiple joins and conditions, consider finding employees in the ‘Sales’ department with a salary over $50,000.
SELECT e.name, d.department_name, s.salary FROM employees e JOIN departments d ON e.department_id = d.department_id JOIN salaries s ON e.employee_id = s.employee_id WHERE d.department_name = 'Sales' AND s.salary > 50000;
This query joins employees
with departments
and salaries
, filtering for ‘Sales’ department employees with salaries over $50,000.
To find duplicate records across two tables using a join, use an INNER JOIN to match records based on identical columns. Use a COUNT function to determine occurrences.
Example:
SELECT t1.column_name, COUNT(*) as duplicate_count FROM table1 t1 INNER JOIN table2 t2 ON t1.column_name = t2.column_name GROUP BY t1.column_name HAVING COUNT(*) > 1;
table1
and table2
are joined on specified columns to identify duplicates. The GROUP BY
clause groups results, and the HAVING
clause filters for more than one occurrence, indicating duplicates.