What is a window function?
SQL window functions perform calculations across a set of table rows that are related to the current row, without collapsing them into a single output row. They are distinct from standard aggregate functions because they return a value for *each* row, allowing for complex analytical queries while maintaining row granularity.
What is a Window Function?
A window function operates on a 'window' of rows defined by the OVER clause. Unlike traditional aggregate functions (e.g., SUM, AVG, COUNT) that group rows and return a single result per group, window functions return a result for *each individual row* in the result set. This allows them to perform calculations like ranking, moving averages, or comparing a row to preceding or following rows, while still retaining all original rows in the output.
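The contrast with GROUP BY can be sketched on a tiny dataset. This is a minimal illustration, not a definitive implementation: it uses Python's built-in sqlite3 module only so the SQL runs without a database server (SQLite 3.25 or later is assumed, since earlier versions lack window functions), and the sales table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100), ("East", 200), ("West", 50)],
)

# Aggregate with GROUP BY: one output row per region.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Same aggregate as a window function: every input row is kept,
# and each row carries its region's total.
windowed = conn.execute(
    """
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, amount
    """
).fetchall()

print(grouped)   # [('East', 300), ('West', 50)]
print(windowed)  # [('East', 100, 300), ('East', 200, 300), ('West', 50, 50)]
```

Three input rows produce two rows under GROUP BY but three rows with OVER, which is the row granularity the text describes.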
Key Components of the OVER Clause
- PARTITION BY: Divides the query result set into partitions to which the window function is applied. The function restarts its calculation for each new partition.
- ORDER BY: Defines the logical order of rows within each partition. This is crucial for functions that depend on row sequence, such as ROW_NUMBER, RANK, LAG, and LEAD.
- ROWS or RANGE clause (frame specification): Further restricts the rows within the current partition that are included in the window. For example, ROWS BETWEEN 1 PRECEDING AND CURRENT ROW would include the current row and the one immediately before it.
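The three components can be combined in one OVER clause. The sketch below is an illustration under stated assumptions: the daily_sales table and its columns are hypothetical, and Python's sqlite3 module (with SQLite 3.25 or later) is used only to make the query runnable.

```python
import sqlite3

# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (store TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [("A", 1, 10), ("A", 2, 20), ("A", 3, 30), ("B", 1, 5), ("B", 2, 15)],
)

rows = conn.execute(
    """
    SELECT store, day, amount,
           SUM(amount) OVER (
               PARTITION BY store                        -- restart per store
               ORDER BY day                              -- sequence rows by day
               ROWS BETWEEN 1 PRECEDING AND CURRENT ROW  -- frame: previous row + current row
           ) AS two_day_sum
    FROM daily_sales
    ORDER BY store, day
    """
).fetchall()

for row in rows:
    print(row)
# ('A', 1, 10, 10)
# ('A', 2, 20, 30)
# ('A', 3, 30, 50)
# ('B', 1, 5, 5)
# ('B', 2, 15, 20)
```

The first row of each store has no preceding row, so its frame contains only itself; store B's first sum is 5 rather than 35 because the partition boundary resets the window.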
Common Types of Window Functions
- Ranking Functions: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
- Analytic Functions: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE()
- Aggregate Functions (used with OVER): SUM() OVER(...), AVG() OVER(...), COUNT() OVER(...), MAX() OVER(...), MIN() OVER(...)
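To see how the ranking functions differ on tied values, and how LAG reads from the previous row, here is a minimal sketch. The scores table is hypothetical, and Python's sqlite3 module (SQLite 3.25 or later assumed) is used only as a convenient way to run the SQL.

```python
import sqlite3

# Hypothetical sample data with one tie (Ann and Bob both score 90).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ann", 90), ("Bob", 90), ("Cy", 80), ("Dee", 70)],
)

rows = conn.execute(
    """
    SELECT player, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, player) AS row_num,    -- always unique
           RANK()       OVER (ORDER BY score DESC)         AS rnk,        -- ties share, then a gap
           DENSE_RANK() OVER (ORDER BY score DESC)         AS dense_rnk,  -- ties share, no gap
           LAG(score)   OVER (ORDER BY score DESC, player) AS prev_score  -- previous row's score
    FROM scores
    ORDER BY score DESC, player
    """
).fetchall()

for row in rows:
    print(row)
# ('Ann', 90, 1, 1, 1, None)
# ('Bob', 90, 2, 1, 1, 90)
# ('Cy', 80, 3, 3, 2, 90)
# ('Dee', 70, 4, 4, 3, 80)
```

The player column is added to the ORDER BY of ROW_NUMBER and LAG as a tie-breaker so the output is deterministic; without it, the order of tied rows is left to the database.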
Example: Ranking Sales per Department
Let's say we have an employee_sales table with department, employee_name, and sales_amount, and we want to rank employees within each department based on their sales figures.
SELECT
    department,
    employee_name,
    sales_amount,
    RANK() OVER (PARTITION BY department ORDER BY sales_amount DESC) AS sales_rank_in_dept
FROM
    employee_sales;
In this example:
- PARTITION BY department ensures that the ranking restarts for each new department.
- ORDER BY sales_amount DESC ranks employees from highest to lowest sales within their respective departments.
- The RANK() function assigns a rank to each employee based on their sales. Tied employees receive the same rank, and the next rank skips ahead by the number of ties: two employees tied at rank 1 are followed by rank 3, not rank 2.
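The query above can be tried on a few made-up rows. In this sketch the employee names and sales figures are invented for illustration, an outer ORDER BY is added only to make the output order predictable, and Python's sqlite3 module (SQLite 3.25 or later assumed) stands in for whatever database you actually use.

```python
import sqlite3

# Invented sample rows; only the table and column names come from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_sales (department TEXT, employee_name TEXT, sales_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO employee_sales VALUES (?, ?, ?)",
    [
        ("Sales", "Alice", 500),
        ("Sales", "Bob", 500),
        ("Sales", "Carol", 300),
        ("IT", "Dan", 400),
        ("IT", "Eve", 200),
    ],
)

ranked = conn.execute(
    """
    SELECT
        department,
        employee_name,
        sales_amount,
        RANK() OVER (PARTITION BY department ORDER BY sales_amount DESC) AS sales_rank_in_dept
    FROM
        employee_sales
    ORDER BY department, sales_rank_in_dept, employee_name
    """
).fetchall()

for row in ranked:
    print(row)
# ('IT', 'Dan', 400, 1)
# ('IT', 'Eve', 200, 2)
# ('Sales', 'Alice', 500, 1)
# ('Sales', 'Bob', 500, 1)
# ('Sales', 'Carol', 300, 3)
```

Alice and Bob tie at rank 1 in Sales, so Carol is ranked 3; the ranking starts again at 1 for the IT department.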
Benefits of Using Window Functions
- Simplified Queries: Perform complex analytical calculations (like running totals, moving averages, or comparisons between rows) without the need for cumbersome self-joins or subqueries.
- Maintain Row Granularity: Unlike standard aggregate functions that collapse rows into groups, window functions return a result for each row, preserving the detail of your dataset.
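- Efficiency: A window function can often compute its result in a single pass over each partition, whereas an equivalent correlated subquery or self-join may rescan the table for every row. Actual performance depends on the database engine, indexes, and data volume, so compare execution plans for your own workload.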
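As a rough illustration of the "simplified queries" point, the sketch below computes the same running total twice: once with a window function and once with a correlated subquery. The payments table is hypothetical, Python's sqlite3 module (SQLite 3.25 or later assumed) is used only to run the SQL, and the example checks equivalence of results, not speed.

```python
import sqlite3

# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO payments VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Running total with a window function: one expression, no extra join.
with_window = conn.execute(
    """
    SELECT id, SUM(amount) OVER (ORDER BY id) AS running_total
    FROM payments
    ORDER BY id
    """
).fetchall()

# Same result with a correlated subquery that re-aggregates for each row.
with_subquery = conn.execute(
    """
    SELECT p.id,
           (SELECT SUM(q.amount) FROM payments q WHERE q.id <= p.id) AS running_total
    FROM payments p
    ORDER BY p.id
    """
).fetchall()

print(with_window)                   # [(1, 10), (2, 30), (3, 60)]
print(with_window == with_subquery)  # True
```

Both forms return identical rows here; the window version states the intent (sum everything up to the current row, in id order) directly in the OVER clause.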