Interview

10 SQL Window Function Interview Questions and Answers

Prepare for your SQL interview with this guide on SQL Window Functions, featuring common questions and detailed answers to enhance your skills.

SQL Window Functions are a powerful feature in SQL that allow for advanced data analysis and manipulation. Unlike traditional aggregate functions, window functions perform calculations across a set of table rows that are somehow related to the current row. This makes them invaluable for tasks such as running totals, moving averages, and ranking operations, which are essential in data analytics and reporting.

This article aims to prepare you for technical interviews by providing a curated list of questions and answers focused on SQL Window Functions. By understanding these concepts and practicing the provided examples, you’ll be better equipped to demonstrate your proficiency and problem-solving skills in SQL during your interview.

SQL Window Function Interview Questions and Answers

1. Describe the difference between ROW_NUMBER(), RANK(), and DENSE_RANK() functions.

The ROW_NUMBER(), RANK(), and DENSE_RANK() functions in SQL assign numbers to rows within a partition of a result set, differing in their handling of ties.

  • ROW_NUMBER(): Assigns a unique number to each row, starting at 1 for the first row in each partition. Ties are handled arbitrarily.
  • RANK(): Assigns the same rank to tied rows, with the next rank incremented by the number of tied rows.
  • DENSE_RANK(): Similar to RANK(), but the next rank is incremented by 1 regardless of the number of tied rows.

Example:

SELECT 
    employee_id, 
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
    RANK() OVER (ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM 
    employees;

In this example, if two employees have the same salary, ROW_NUMBER() assigns unique numbers, while RANK() and DENSE_RANK() handle ties differently.

2. How would you use the PARTITION BY clause in a window function? Provide an example scenario.

The PARTITION BY clause divides the result set into partitions for calculations. For instance, to calculate total sales for each salesperson within their regions:

SELECT 
    salesperson,
    region,
    sales_amount,
    SUM(sales_amount) OVER (PARTITION BY region) AS total_sales_by_region
FROM 
    sales;

Here, the data is partitioned by region, and the SUM function calculates total sales within each region.

3. What is the purpose of the ORDER BY clause in a window function?

The ORDER BY clause in a window function specifies the order of rows within each partition, essential for functions like ROW_NUMBER(), RANK(), and DENSE_RANK().

Example:

SELECT 
    employee_id,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM 
    employees;

Here, the ORDER BY clause orders rows by salary in descending order, affecting the rank assignment.

4. Explain the difference between LAG() and LEAD() functions.

LAG() and LEAD() access data from a previous or subsequent row in the result set, useful for comparing values in a sequence.

  • LAG(): Accesses a row at a specified offset before the current row.
  • LEAD(): Accesses a row at a specified offset after the current row.

Example:

SELECT 
    employee_id,
    salary,
    LAG(salary, 1) OVER (ORDER BY employee_id) AS previous_salary,
    LEAD(salary, 1) OVER (ORDER BY employee_id) AS next_salary
FROM 
    employees;

LAG() retrieves the previous row’s salary, while LEAD() retrieves the next row’s salary.

5. Write a query to calculate the difference in sales between the current month and the previous month for each product.

To calculate the difference in sales between the current and previous month for each product:

SELECT 
    product_id,
    month,
    sales,
    sales - LAG(sales, 1) OVER (PARTITION BY product_id ORDER BY month) AS sales_difference
FROM 
    sales_data;

LAG() accesses the previous month’s sales, with calculations done separately for each product.

6. Explain the concept of frame specification in window functions.

Frame specification in window functions determines the set of rows considered for calculations, using clauses like ROWS or RANGE.

Example:

SELECT 
    employee_id,
    salary,
    SUM(salary) OVER (
        PARTITION BY department_id 
        ORDER BY employee_id 
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) AS moving_sum
FROM 
    employees;

Here, the frame includes the current row plus one preceding and one following row.

7. Write a query to find the top 3 highest-paid employees in each department.

To find the top 3 highest-paid employees in each department:

SELECT department, employee_name, salary
FROM (
    SELECT department, employee_name, salary,
           ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rn
    FROM employees
) subquery
WHERE rn <= 3;

ROW_NUMBER() ranks employees by salary within each department, and the outer query filters the top 3.

8. Write a query to calculate a moving average of sales over the last 3 months for each product.

To calculate a moving average of sales over the last 3 months for each product:

SELECT 
    product_id,
    sale_date,
    sales,
    AVG(sales) OVER (
        PARTITION BY product_id 
        ORDER BY sale_date 
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_sales
FROM 
    sales_table;

9. How would you handle NULL values in window functions? Provide an example.

Handling NULL values in window functions can be done using COALESCE, IFNULL, or CASE statements to replace NULLs with specified values.

Example:

SELECT 
    employee_id,
    department,
    salary,
    RANK() OVER (
        PARTITION BY department 
        ORDER BY COALESCE(salary, 0) DESC
    ) as rank
FROM 
    employees;

COALESCE replaces NULL salary values with 0 before applying RANK().

10. Write a query to calculate the cumulative distribution of salaries within each department.

To calculate the cumulative distribution of salaries within each department:

SELECT 
    department_id,
    employee_id,
    salary,
    CUME_DIST() OVER (PARTITION BY department_id ORDER BY salary) AS cumulative_distribution
FROM 
    employees;

CUME_DIST() calculates the cumulative distribution of salary within each department.

Previous

10 Crypto Interview Questions and Answers

Back to Interview
Next

15 Apache Tomcat Interview Questions and Answers