Page Content

Tutorials

How to use Analytical Function in WHERE Clause in Oracle?

Analytical Function in Oracle

The capabilities of standard aggregate functions are expanded by analytical functions, a potent feature in SQL that computes aggregate values over a set of rows while returning numerous rows for each group. Standard aggregate functions (such as SUM, AVG, and COUNT) collapse groupings of rows into a single summary row, meaning this is in stark contrast. Analytical functions are extremely helpful in data warehousing systems for complicated data analysis, reporting, and ranking scenarios.

A subset of the query’s result set, or the “window” of rows, is where analytical functions in Oracle SQL execute. Each row in the result set has a sliding window of rows established for it; the range of rows utilised to calculate the current row is determined by this window. Usually, these functions are used to compute reporting, centring, cumulative, and moving aggregates.

Key Components of Analytical Functions

The window (or group) of rows that the function operates on is defined by the OVER clause, which distinguishes analytical functions. A number of subclauses can be included in the OVER clause:

  1. PARTITION BY Clause: This clause divides the query result set into groups based on one or more specified expressions. If PARTITION BY is omitted, the function treats all rows of the query result set as a single group. This is similar to the GROUP BY clause for aggregate functions, but importantly, PARTITION BY does not reduce the number of rows returned by the query.
  2. ORDER BY Clause: This clause sorts the rows within each partition (or the entire result set if no PARTITION BY is used). The order of rows is crucial for many analytical functions, especially those that calculate rankings or rely on sequential processing, like LAG or LEAD.
  3. WINDOWING Clause: This optional clause further refines the window within each partition by specifying a physical or logical set of rows relative to the current row. It can be defined using ROWS (physical units) or RANGE (logical offset) and typically uses keywords like BETWEEN, PRECEDING, FOLLOWING, UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING, and CURRENT ROW. If the WINDOWING clause is omitted, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Execution Flow

Analysis follows the FROM, WHERE, GROUP BY, and HAVING clauses but precedes the ORDER BY phrase. This allows you to filter analytical function results by nesting them in a subquery and filtering that subquery.

Instances Apart From Aggregates

A practical example will demonstrate analytical functions. The EMPLOYEE_SALES table will contain employee sales data, including department, date, and amount.

CREATE TABLE EMPLOYEE_SALES (
    EMPLOYEE_ID    NUMBER(6) PRIMARY KEY,
    FIRST_NAME     VARCHAR2(20) NOT NULL,
    LAST_NAME      VARCHAR2(25) NOT NULL,
    DEPARTMENT     VARCHAR2(30),
    SALES_DATE     DATE NOT NULL,
    SALES_AMOUNT   NUMBER(10, 2) NOT NULL
);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (101, 'John', 'Doe', 'Electronics', TO_DATE('2023-01-15', 'YYYY-MM-DD'), 1200.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (102, 'Jane', 'Smith', 'Electronics', TO_DATE('2023-01-20', 'YYYY-MM-DD'), 1500.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (103, 'Peter', 'Jones', 'Clothing', TO_DATE('2023-01-10', 'YYYY-MM-DD'), 800.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (104, 'Mary', 'Brown', 'Electronics', TO_DATE('2023-02-01', 'YYYY-MM-DD'), 1300.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (105, 'David', 'Green', 'Clothing', TO_DATE('2023-02-05', 'YYYY-MM-DD'), 950.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (106, 'Susan', 'White', 'Electronics', TO_DATE('2023-02-10', 'YYYY-MM-DD'), 1600.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (107, 'Chris', 'Black', 'Books', TO_DATE('2023-01-25', 'YYYY-MM-DD'), 600.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (108, 'Anna', 'King', 'Books', TO_DATE('2023-02-15', 'YYYY-MM-DD'), 750.00);
COMMIT;

Analytical Functions

Calculating Cumulative Sales by Department

We would like to view the sales of each employee as well as the running total of sales in their department for each date of sale. The usage of SUM as an analytical function is demonstrated here.

SELECT
    EMPLOYEE_ID,
    FIRST_NAME,
    LAST_NAME,
    DEPARTMENT,
    SALES_DATE,
    SALES_AMOUNT,
    SUM(SALES_AMOUNT) OVER (PARTITION BY DEPARTMENT ORDER BY SALES_DATE) AS CUMULATIVE_DEPARTMENT_SALES
FROM
    EMPLOYEE_SALES
ORDER BY
    DEPARTMENT, SALES_DATE;

Output:

EMPLOYEE_ID FIRST_NAME LAST_NAME  DEPARTMENT   SALES_DATE SALES_AMOUNT CUMULATIVE_DEPARTMENT_SALES
----------- ---------- ---------- ------------ ---------- ------------ ---------------------------
107         Chris      Black      Books        2023-01-25         600.00                      600.00
108         Anna       King       Books        2023-02-15         750.00                     1350.00
103         Peter      Jones      Clothing     2023-01-10         800.00                      800.00
105         David      Green      Clothing     2023-02-05         950.00                     1750.00
101         John       Doe        Electronics  2023-01-15        1200.00                     1200.00
102         Jane       Smith      Electronics  2023-01-20        1500.00                     2700.00
104         Mary       Brown      Electronics  2023-02-01        1300.00                     4000.00
106         Susan      White      Electronics  2023-02-10        1600.00                     5600.00

Each sales record is displayed in detail in this output, along with a new column called CUMULATIVE_DEPARTMENT_SALES that gives a running total for each department as opposed to a single total for all departments.

Ranking Employees by Sales Amount within Each Department

We wish to rate each worker according to the quantity of sales they make in their department. In order to award consecutive rankings without gaps in the event of a tie, we will employ DENSE_RANK.

SELECT
    EMPLOYEE_ID,
    FIRST_NAME,
    LAST_NAME,
    DEPARTMENT,
    SALES_DATE,
    SALES_AMOUNT,
    DENSE_RANK() OVER (PARTITION BY DEPARTMENT ORDER BY SALES_AMOUNT DESC) AS RANK_IN_DEPARTMENT
FROM
    EMPLOYEE_SALES
ORDER BY
    DEPARTMENT, RANK_IN_DEPARTMENT;

Output:

EMPLOYEE_ID FIRST_NAME LAST_NAME  DEPARTMENT   SALES_DATE SALES_AMOUNT RANK_IN_DEPARTMENT
----------- ---------- ---------- ------------ ---------- ------------ ------------------
108         Anna       King       Books        2023-02-15         750.00                  1
107         Chris      Black      Books        2023-01-25         600.00                  2
105         David      Green      Clothing     2023-02-05         950.00                  1
103         Peter      Jones      Clothing     2023-01-10         800.00                  2
106         Susan      White      Electronics  2023-02-10        1600.00                  1
102         Jane       Smith      Electronics  2023-01-20        1500.00                  2
104         Mary       Brown      Electronics  2023-02-01        1300.00                  3
101         John       Doe        Electronics  2023-01-15        1200.00                  4

The sales rank for each employee within their respective department is displayed here by RANK_IN_DEPARTMENT, once more without lowering the amount of rows from the original table.

Comparing Current Sales to Previous Sales

We wish to view the sales of each employee as well as the sales amount of the employee in the same department who made a sale right before them. This illustrates how the LAG function retrieves information from a previous row in the window.

SELECT
    EMPLOYEE_ID,
    FIRST_NAME,
    LAST_NAME,
    DEPARTMENT,
    SALES_DATE,
    SALES_AMOUNT,
    LAG(SALES_AMOUNT, 1, 0) OVER (PARTITION BY DEPARTMENT ORDER BY SALES_DATE) AS PREVIOUS_SALES_AMOUNT
FROM
    EMPLOYEE_SALES
ORDER BY
    DEPARTMENT, SALES_DATE;

Output:

EMPLOYEE_ID FIRST_NAME LAST_NAME  DEPARTMENT   SALES_DATE SALES_AMOUNT PREVIOUS_SALES_AMOUNT
----------- ---------- ---------- ------------ ---------- ------------ ---------------------
107         Chris      Black      Books        2023-01-25         600.00                   0.00
108         Anna       King       Books        2023-02-15         750.00                 600.00
103         Peter      Jones      Clothing     2023-01-10         800.00                   0.00
105         David      Green      Clothing     2023-02-05         950.00                 800.00
101         John       Doe        Electronics  2023-01-15        1200.00                   0.00
102         Jane       Smith      Electronics  2023-01-20        1500.00                1200.00
104         Mary       Brown      Electronics  2023-02-01        1300.00                1500.00
106         Susan      White      Electronics  2023-02-10        1600.00                1300.00

SALES_DATE determines PREVIOUS_SALES_AMOUNT displays the sales amount from the row before the current row in the same department. Each department’s first sale returns 0 because there is no prior row.

Analytical functions offer detailed insights without aggregating the result set into a single summary record by performing row-level calculations on groups of rows. They simplify difficult reporting chores that would normally need multi-step queries or external processing by allowing sophisticated SQL analysis. Analytical operations in Oracle Database include statistics and ranking tools.

Index