10 SQL Window Function Interview Questions and Answers
Prepare for your SQL interview with this guide on SQL Window Functions, featuring common questions and detailed answers to enhance your skills.
Prepare for your SQL interview with this guide on SQL Window Functions, featuring common questions and detailed answers to enhance your skills.
SQL Window Functions are a powerful feature in SQL that allow for advanced data analysis and manipulation. Unlike traditional aggregate functions, window functions perform calculations across a set of table rows that are somehow related to the current row. This makes them invaluable for tasks such as running totals, moving averages, and ranking operations, which are essential in data analytics and reporting.
This article aims to prepare you for technical interviews by providing a curated list of questions and answers focused on SQL Window Functions. By understanding these concepts and practicing the provided examples, you’ll be better equipped to demonstrate your proficiency and problem-solving skills in SQL during your interview.
The ROW_NUMBER(), RANK(), and DENSE_RANK() functions in SQL assign numbers to rows within a partition of a result set, differing in their handling of ties.
Example:
SELECT employee_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num, RANK() OVER (ORDER BY salary DESC) AS rank, DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank FROM employees;
In this example, if two employees have the same salary, ROW_NUMBER() assigns unique numbers, while RANK() and DENSE_RANK() handle ties differently.
The PARTITION BY clause divides the result set into partitions for calculations. For instance, to calculate total sales for each salesperson within their regions:
SELECT salesperson, region, sales_amount, SUM(sales_amount) OVER (PARTITION BY region) AS total_sales_by_region FROM sales;
Here, the data is partitioned by region, and the SUM function calculates total sales within each region.
The ORDER BY clause in a window function specifies the order of rows within each partition, essential for functions like ROW_NUMBER(), RANK(), and DENSE_RANK().
Example:
SELECT employee_id, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employees;
Here, the ORDER BY clause orders rows by salary in descending order, affecting the rank assignment.
LAG() and LEAD() access data from a previous or subsequent row in the result set, useful for comparing values in a sequence.
Example:
SELECT employee_id, salary, LAG(salary, 1) OVER (ORDER BY employee_id) AS previous_salary, LEAD(salary, 1) OVER (ORDER BY employee_id) AS next_salary FROM employees;
LAG() retrieves the previous row’s salary, while LEAD() retrieves the next row’s salary.
To calculate the difference in sales between the current and previous month for each product:
SELECT product_id, month, sales, sales - LAG(sales, 1) OVER (PARTITION BY product_id ORDER BY month) AS sales_difference FROM sales_data;
LAG() accesses the previous month’s sales, with calculations done separately for each product.
Frame specification in window functions determines the set of rows considered for calculations, using clauses like ROWS
or RANGE
.
Example:
SELECT employee_id, salary, SUM(salary) OVER ( PARTITION BY department_id ORDER BY employee_id ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS moving_sum FROM employees;
Here, the frame includes the current row plus one preceding and one following row.
To find the top 3 highest-paid employees in each department:
SELECT department, employee_name, salary FROM ( SELECT department, employee_name, salary, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rn FROM employees ) subquery WHERE rn <= 3;
ROW_NUMBER() ranks employees by salary within each department, and the outer query filters the top 3.
To calculate a moving average of sales over the last 3 months for each product:
SELECT product_id, sale_date, sales, AVG(sales) OVER ( PARTITION BY product_id ORDER BY sale_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS moving_avg_sales FROM sales_table;
Handling NULL values in window functions can be done using COALESCE
, IFNULL
, or CASE
statements to replace NULLs with specified values.
Example:
SELECT employee_id, department, salary, RANK() OVER ( PARTITION BY department ORDER BY COALESCE(salary, 0) DESC ) as rank FROM employees;
COALESCE replaces NULL salary values with 0 before applying RANK().
To calculate the cumulative distribution of salaries within each department:
SELECT department_id, employee_id, salary, CUME_DIST() OVER (PARTITION BY department_id ORDER BY salary) AS cumulative_distribution FROM employees;
CUME_DIST() calculates the cumulative distribution of salary within each department.