What is a lateral join?

A LATERAL join lets a subquery or table function in the FROM clause reference columns from tables that appear to its left in the same FROM clause. PostgreSQL, Oracle (12c and later), and MySQL (8.0.14 and later) support the LATERAL keyword; SQL Server provides the same capability as CROSS APPLY and OUTER APPLY. Conceptually, the right-hand side is evaluated once for each row of the left-hand side, like a correlated subquery, except that it can return multiple rows and columns. For some queries it also performs better, though that depends on the data and the available indexes.

What is a Lateral Join?

In a standard join (INNER, LEFT, etc.), a subquery on the right-hand side is evaluated independently of the left-hand side: it can refer to its own tables, but not to columns of other items in the same FROM clause. The two sides meet only in the join condition. LATERAL lifts this restriction by explicitly allowing the subquery or table function on the right to reference columns from the FROM items to its left. The right side's result set can therefore differ for each left-hand row.

It's conceptually similar to a correlated subquery that runs for each row of the outer query, but instead of returning a single scalar value, a LATERAL JOIN (or APPLY) returns a set of rows (a table) for each row of the left table. This makes it powerful for tasks like finding the 'top N' related rows for each row in a primary table.
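To make the contrast concrete, here is a minimal PostgreSQL sketch that fetches the most expensive product per category both ways. It assumes the categories and products tables defined under Sample Data; the column aliases top_product and top_price are illustrative only.

```sql
-- Scalar correlated subqueries: each returns one value,
-- so two output columns need two separate subqueries.
SELECT
    c.category_name,
    (SELECT p.product_name
       FROM products AS p
      WHERE p.category_id = c.category_id
      ORDER BY p.price DESC
      LIMIT 1) AS top_product,
    (SELECT p.price
       FROM products AS p
      WHERE p.category_id = c.category_id
      ORDER BY p.price DESC
      LIMIT 1) AS top_price
FROM categories AS c;

-- Lateral subquery: one subquery supplies both columns,
-- and raising the LIMIT would return more rows per category.
SELECT
    c.category_name,
    tp.product_name AS top_product,
    tp.price        AS top_price
FROM categories AS c
LEFT JOIN LATERAL (
    SELECT p.product_name, p.price
    FROM products AS p
    WHERE p.category_id = c.category_id
    ORDER BY p.price DESC
    LIMIT 1
) AS tp ON TRUE;
```

LEFT JOIN LATERAL is used here so that a category without products still appears with NULLs, which matches what the scalar-subquery version returns.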

Why Use Lateral Join?

  • Top N per Group: Easily retrieve the top N rows from a related table for each row in the main table (e.g., the 3 most recent orders for each customer).
  • Complex Filtering: Apply complex logic or calculations on a per-row basis where a simple JOIN condition or standard subquery might be insufficient.
  • Table Functions: Use table-valued functions (TVFs) that require parameters from the driving table (SQL Server's APPLY is particularly strong here).
  • Readability and Performance: Often provides a more readable and sometimes more performant alternative to complex window functions or deeply nested correlated subqueries for specific problems.
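As a small illustration of the table-function case, the following PostgreSQL sketch expands each input row into n numbered rows. The inline VALUES list and the names t, label, n, g, and i are made up for the example; generate_series is a built-in PostgreSQL set-returning function.

```sql
-- The function argument t.n comes from the left-hand row,
-- so generate_series is evaluated once per row of t.
SELECT t.label, g.i
FROM (VALUES ('a', 2), ('b', 3)) AS t(label, n)
CROSS JOIN LATERAL generate_series(1, t.n) AS g(i)
ORDER BY t.label, g.i;
-- Returns: (a,1), (a,2), (b,1), (b,2), (b,3)
```

In PostgreSQL the LATERAL keyword is optional for function calls in FROM, since they may reference preceding FROM items either way; writing it out keeps the dependency visible.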

Syntax (PostgreSQL Example)

sql
SELECT ...
FROM table1
[INNER | LEFT] JOIN LATERAL (
    SELECT ...
    FROM table2
    WHERE table2.column = table1.column -- Reference to table1
    ORDER BY ...
    LIMIT N
) AS alias_for_subquery ON TRUE;

Example Scenario: Top 2 Products per Category

Imagine we have a list of products, each belonging to a category, and we want to find the top 2 most expensive products within each category.

Sample Data

sql
CREATE TABLE categories (
    category_id SERIAL PRIMARY KEY,
    category_name VARCHAR(50) NOT NULL
);

CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    category_id INT NOT NULL REFERENCES categories(category_id),
    price DECIMAL(10, 2) NOT NULL
);

INSERT INTO categories (category_name) VALUES ('Electronics'), ('Books'), ('Clothing');

INSERT INTO products (product_name, category_id, price) VALUES
('Laptop', 1, 1200.00),
('Smartphone', 1, 800.00),
('Tablet', 1, 500.00),
('Headphones', 1, 150.00),
('Data Science Book', 2, 60.00),
('Fantasy Novel', 2, 25.00),
('Cookbook', 2, 35.00),
('T-Shirt', 3, 20.00),
('Jeans', 3, 75.00),
('Jacket', 3, 150.00),
('Socks', 3, 10.00);

Using Lateral Join

Here's how to achieve the 'top 2 most expensive products per category' using a LATERAL JOIN. The subquery on the right (aliased as 'top_products') is executed for each category row, specifically filtering for products belonging to that category and returning the top 2 by price.

sql
SELECT
    c.category_name,
    tp.product_name,
    tp.price
FROM
    categories AS c
JOIN LATERAL (
    SELECT
        p.product_name,
        p.price
    FROM
        products AS p
    WHERE
        p.category_id = c.category_id -- Reference to 'c' (left table)
    ORDER BY
        p.price DESC
    LIMIT 2
) AS tp ON TRUE
ORDER BY
    c.category_name, tp.price DESC;
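For comparison, the same result can be produced with a window function. This is a sketch of the commonly used ROW_NUMBER() alternative, not a statement about which form is faster; the alias ranked and the column rn are illustrative names.

```sql
SELECT category_name, product_name, price
FROM (
    SELECT
        c.category_name,
        p.product_name,
        p.price,
        -- Number the products within each category, most expensive first.
        ROW_NUMBER() OVER (
            PARTITION BY p.category_id
            ORDER BY p.price DESC
        ) AS rn
    FROM categories AS c
    JOIN products AS p ON p.category_id = c.category_id
) AS ranked
WHERE rn <= 2
ORDER BY category_name, price DESC;
-- With the sample data, both queries return:
--   Books        Data Science Book    60.00
--   Books        Cookbook             35.00
--   Clothing     Jacket              150.00
--   Clothing     Jeans                75.00
--   Electronics  Laptop             1200.00
--   Electronics  Smartphone          800.00
```

The window version ranks every product before filtering, whereas the lateral version asks for only two rows per category. Which one performs better depends on the data volume and on indexes such as one on (category_id, price).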

Equivalent using CROSS APPLY (SQL Server)

sql
SELECT
    c.category_name,
    tp.product_name,
    tp.price
FROM
    categories AS c
CROSS APPLY (
    SELECT TOP 2
        p.product_name,
        p.price
    FROM
        products AS p
    WHERE
        p.category_id = c.category_id
    ORDER BY
        p.price DESC
) AS tp
ORDER BY
    c.category_name, tp.price DESC;

Key Characteristics

  • Contextual Evaluation: The right-hand side (subquery) sees columns from the current row of the left-hand side.
  • Row-wise Processing: Conceptually, it iterates through each row of the left table, executing the lateral subquery for each.
  • Returns Tables: Unlike a scalar correlated subquery, a lateral join returns a *set* of rows for each left-hand row.
  • JOIN Type: Can be INNER JOIN LATERAL (like CROSS APPLY in SQL Server: left-hand rows for which the subquery returns no rows are dropped) or LEFT JOIN LATERAL (like OUTER APPLY in SQL Server: such left-hand rows are kept, with NULLs in the subquery's columns).
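  • Aliases are Crucial: Give the subquery on the right side an alias. SQL Server and PostgreSQL versions before 16 require one, and the outer query needs it to refer to the subquery's columns.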
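The difference between the two join types can be seen with a category that has no products. The sketch below assumes the sample tables from above plus one hypothetical extra row, 'Toys', added only for this illustration.

```sql
-- Hypothetical extra row: a category with no products yet.
INSERT INTO categories (category_name) VALUES ('Toys');

SELECT
    c.category_name,
    tp.product_name,
    tp.price
FROM categories AS c
LEFT JOIN LATERAL (
    SELECT p.product_name, p.price
    FROM products AS p
    WHERE p.category_id = c.category_id
    ORDER BY p.price DESC
    LIMIT 2
) AS tp ON TRUE
ORDER BY c.category_name, tp.price DESC;
-- 'Toys' appears once, with NULL product_name and price.
-- With JOIN LATERAL ... ON TRUE it would be missing from the result.
```

In SQL Server the corresponding form is OUTER APPLY with SELECT TOP 2 inside the subquery and no ON clause.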