30 SQL Query Interview Questions and Answers
Prepare for your next interview with our comprehensive guide on SQL queries, featuring common questions and detailed answers to boost your database skills.
Prepare for your next interview with our comprehensive guide on SQL queries, featuring common questions and detailed answers to boost your database skills.
SQL (Structured Query Language) is the backbone of database management and manipulation. It is essential for querying, updating, and managing data in relational databases, making it a critical skill for roles involving data analysis, backend development, and database administration. SQL’s versatility and efficiency in handling large datasets have made it a staple in the tech industry.
This article offers a curated selection of SQL query questions designed to test and enhance your understanding of database concepts and query optimization. By working through these examples, you will be better prepared to demonstrate your SQL proficiency and problem-solving abilities in an interview setting.
To find employees whose salary is greater than 50000, use a SQL SELECT statement with a WHERE clause. Assuming a table named employees
with a column salary
, the query would be:
SELECT * FROM employees WHERE salary > 50000;
This selects all columns from the employees
table where the salary
exceeds 50000.
To count the number of employees in each department, use the SQL GROUP BY
clause with the COUNT
function. This groups results by department and counts employees in each group.
Example:
SELECT department_id, COUNT(*) AS employee_count FROM employees GROUP BY department_id;
This selects the department_id
and counts employees in each department.
To find departments with more than 10 employees, use the GROUP BY
clause with the HAVING
clause. The HAVING
clause filters groups based on a condition.
Example:
SELECT department_id, COUNT(*) as employee_count FROM employees GROUP BY department_id HAVING COUNT(*) > 10;
This query selects the department ID and counts employees, filtering for departments with more than 10 employees.
To join the ’employees’ and ‘departments’ tables on the department_id column, use an SQL JOIN operation. This combines rows from both tables based on a related column.
Example:
SELECT employees.*, departments.* FROM employees JOIN departments ON employees.department_id = departments.department_id;
To retrieve all employees and their respective departments, including those without a department, use a LEFT JOIN. This ensures all employees are included, even without a corresponding department.
SELECT employees.employee_id, employees.employee_name, departments.department_name FROM employees LEFT JOIN departments ON employees.department_id = departments.department_id;
To retrieve all departments and their respective employees, including those without employees, use a LEFT JOIN. This returns all records from departments and matched records from employees.
Example:
SELECT d.department_name, e.employee_name FROM departments d LEFT JOIN employees e ON d.department_id = e.department_id;
To retrieve all employees and departments, including those without a match in either table, use a FULL OUTER JOIN. This ensures all records from both tables are included.
Example:
SELECT employees.employee_id, employees.employee_name, departments.department_id, departments.department_name FROM employees FULL OUTER JOIN departments ON employees.department_id = departments.department_id;
To find employees who have the same manager, use a self-join on the employee table. This compares manager IDs of different employees.
Example:
SELECT e1.employee_id, e1.employee_name, e2.manager_id FROM employees e1 JOIN employees e2 ON e1.manager_id = e2.manager_id WHERE e1.employee_id != e2.employee_id ORDER BY e2.manager_id;
To find employees who earn more than the average salary in their department, use a subquery to calculate the average salary for each department and compare each employee’s salary to this average.
SELECT e.employee_id, e.employee_name, e.salary, e.department_id FROM employees e WHERE e.salary > ( SELECT AVG(e2.salary) FROM employees e2 WHERE e2.department_id = e.department_id );
To find employees whose salary is higher than the average salary of their department, use a correlated subquery. This subquery uses values from the outer query and is executed for each row selected by the outer query.
Example:
SELECT e1.employee_id, e1.employee_name, e1.salary, e1.department_id FROM employees e1 WHERE e1.salary > ( SELECT AVG(e2.salary) FROM employees e2 WHERE e2.department_id = e1.department_id );
To optimize a query with multiple joins and large datasets, consider strategies like indexing, efficient query structure, and analyzing the query execution plan. Indexing can speed up data retrieval, while a well-structured query reduces data processing. Analyzing the execution plan helps identify bottlenecks.
To rank employees based on their salary within their department, use the RANK() window function. This assigns a unique rank to each row within a partition of a result set.
Example:
SELECT employee_id, department_id, salary, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank FROM employees;
A Common Table Expression (CTE) is a temporary result set referenced within a SELECT, INSERT, UPDATE, or DELETE statement. CTEs simplify complex queries. Here’s how to create a CTE to list employees and their managers:
WITH EmployeeCTE AS ( SELECT e.EmployeeID, e.EmployeeName, m.ManagerID, m.ManagerName FROM Employees e LEFT JOIN Employees m ON e.ManagerID = m.EmployeeID ) SELECT EmployeeID, EmployeeName, ManagerID, ManagerName FROM EmployeeCTE;
A recursive Common Table Expression (CTE) performs recursive queries, useful for hierarchical data like organizational structures. It consists of an anchor member and a recursive member.
Example:
WITH RECURSIVE EmployeeHierarchy AS ( SELECT EmployeeID, ManagerID, EmployeeName, 1 AS Level FROM Employees WHERE ManagerID IS NULL UNION ALL SELECT e.EmployeeID, e.ManagerID, e.EmployeeName, eh.Level + 1 FROM Employees e INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID ) SELECT EmployeeID, ManagerID, EmployeeName, Level FROM EmployeeHierarchy;
Indexing improves the speed of data retrieval operations. It allows the database to find and access rows more quickly. However, indexes require additional storage and can slow down write operations.
To identify and optimize a slow-running SQL query, analyze the query execution plan, ensure proper indexing, and refactor the query if necessary. Avoid full table scans and consider database configuration adjustments.
Example:
-- Original slow query SELECT * FROM orders WHERE customer_id = 12345 AND order_date > '2023-01-01'; -- Optimized query with indexing CREATE INDEX idx_customer_order ON orders (customer_id, order_date); EXPLAIN SELECT * FROM orders WHERE customer_id = 12345 AND order_date > '2023-01-01';
Transactions in SQL manage sequences of operations as a single unit. Key statements are BEGIN TRANSACTION, COMMIT, and ROLLBACK.
Example:
BEGIN TRANSACTION; UPDATE accounts SET balance = balance - 100 WHERE account_id = 1; UPDATE accounts SET balance = balance + 100 WHERE account_id = 2; IF @@ERROR <> 0 BEGIN ROLLBACK; PRINT 'Transaction failed and rolled back'; END ELSE BEGIN COMMIT; PRINT 'Transaction succeeded and committed'; END;
A trigger in SQL is a stored procedure that runs automatically when certain events occur. To create a trigger that updates an audit table when an employee’s salary is updated:
CREATE TRIGGER update_salary_audit AFTER UPDATE ON employees FOR EACH ROW WHEN (OLD.salary IS DISTINCT FROM NEW.salary) BEGIN INSERT INTO salary_audit (employee_id, old_salary, new_salary, change_date) VALUES (OLD.employee_id, OLD.salary, NEW.salary, CURRENT_TIMESTAMP); END;
To create a stored procedure that inserts a new employee record into the ’employees’ table:
CREATE PROCEDURE InsertEmployee @FirstName NVARCHAR(50), @LastName NVARCHAR(50), @Email NVARCHAR(100), @HireDate DATE AS BEGIN INSERT INTO employees (FirstName, LastName, Email, HireDate) VALUES (@FirstName, @LastName, @Email, @HireDate); END;
User-defined functions (UDFs) encapsulate reusable logic. To calculate the tenure of an employee, create a UDF that takes the employee’s start date as input and returns the number of years with the company.
Example:
CREATE FUNCTION CalculateTenure(@StartDate DATE) RETURNS INT AS BEGIN DECLARE @Tenure INT; SET @Tenure = DATEDIFF(YEAR, @StartDate, GETDATE()); RETURN @Tenure; END;
Use this function in queries to calculate employee tenure:
SELECT EmployeeID, Name, dbo.CalculateTenure(StartDate) AS Tenure FROM Employees;
To handle NULL values when calculating the total salary of employees, use the COALESCE function. This replaces NULL values with zero, ensuring they don’t affect the total salary calculation.
Example:
SELECT SUM(COALESCE(salary, 0)) AS total_salary FROM employees;
Dynamic SQL constructs SQL statements at runtime. Use EXEC
or sp_executesql
to execute dynamic SQL. sp_executesql
allows for parameterized queries, enhancing security.
Example:
DECLARE @TableName NVARCHAR(50) = 'Employees' DECLARE @SQL NVARCHAR(MAX) SET @SQL = N'SELECT * FROM ' + @TableName EXEC sp_executesql @SQL
Error handling in SQL uses the TRY…CATCH construct to manage exceptions. This allows for actions like logging errors or rolling back transactions.
Example:
BEGIN TRY -- Start a transaction BEGIN TRANSACTION -- Execute a query that might cause an error INSERT INTO Employees (EmployeeID, Name, Position) VALUES (1, 'John Doe', 'Manager') -- Commit the transaction if no error occurs COMMIT TRANSACTION END TRY BEGIN CATCH -- Rollback the transaction if an error occurs ROLLBACK TRANSACTION -- Capture and display the error message SELECT ERROR_MESSAGE() AS ErrorMessage END CATCH
To parse JSON data stored in a column, use built-in JSON functions. These functions extract and manipulate JSON data within SQL queries.
For example, in PostgreSQL:
SELECT json_column->>'key' AS extracted_value FROM your_table;
In MySQL:
SELECT JSON_EXTRACT(json_column, '$.key') AS extracted_value FROM your_table;
In SQL Server:
SELECT JSON_VALUE(json_column, '$.key') AS extracted_value FROM your_table;
To extract information from an XML column, use built-in XML functions like value()
, query()
, and nodes()
.
Example:
DECLARE @xmlData XML SET @xmlData = ' <Employees> <Employee> <ID>1</ID> <Name>John Doe</Name> <Position>Manager</Position> </Employee> <Employee> <ID>2</ID> <Name>Jane Smith</Name> <Position>Developer</Position> </Employee> </Employees>' SELECT Employee.value('(ID)[1]', 'INT') AS EmployeeID, Employee.value('(Name)[1]', 'VARCHAR(100)') AS EmployeeName, Employee.value('(Position)[1]', 'VARCHAR(100)') AS EmployeePosition FROM @xmlData.nodes('/Employees/Employee') AS EmployeeTable(Employee)
Temporal tables track historical changes to data. They consist of a current table and a history table. To create a temporal table, define period start and end columns.
Example:
CREATE TABLE Employee ( EmployeeID INT PRIMARY KEY, Name NVARCHAR(100), Position NVARCHAR(100), Salary DECIMAL(10, 2), ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START, ValidTo DATETIME2 GENERATED ALWAYS AS ROW END, PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) ) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));
To query data as it was at a specific time:
SELECT * FROM Employee FOR SYSTEM_TIME AS OF '2023-01-01 00:00:00';
Full-text search allows efficient text searching within a column. It indexes text data and provides advanced search capabilities. To perform a full-text search on a ‘job_description’ column, create a full-text index and use the CONTAINS
or FREETEXT
function.
Example:
-- Create a full-text index on the job_description column CREATE FULLTEXT INDEX ON jobs(job_description) KEY INDEX PK_jobs; -- Perform a full-text search for the term 'developer' SELECT job_id, job_title, job_description FROM jobs WHERE CONTAINS(job_description, 'developer');
SQL injection is a technique where an attacker manipulates a SQL query by injecting malicious code. To prevent it, use parameterized queries or prepared statements. These methods treat user input as data, not executable code.
Example using Python’s sqlite3
library:
import sqlite3 # Secure code conn = sqlite3.connect('example.db') cursor = conn.cursor() user_input = 1 query = "SELECT * FROM users WHERE id = ?" cursor.execute(query, (user_input,)) results = cursor.fetchall()
Window functions perform calculations across a set of table rows related to the current row. They are useful for tasks like calculating running totals and ranking.
Example:
SELECT employee_id, department_id, salary, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) as salary_rank FROM employees;
In this example, the RANK()
function partitions data by department_id
and orders rows by salary
.
Normalization divides a database into tables to eliminate redundant data and ensure data dependencies make sense. Denormalization merges tables to reduce joins, improving read performance. It can introduce redundancy, so it must be used carefully.