Analytical Function in Oracle
The capabilities of standard aggregate functions are expanded by analytical functions, a potent feature in SQL that computes aggregate values over a set of rows while returning numerous rows for each group. Standard aggregate functions (such as SUM
, AVG
, and COUNT
) collapse groupings of rows into a single summary row, meaning this is in stark contrast. Analytical functions are extremely helpful in data warehousing systems for complicated data analysis, reporting, and ranking scenarios.
A subset of the query’s result set, or the “window” of rows, is where analytical functions in Oracle SQL execute. Each row in the result set has a sliding window of rows established for it; the range of rows utilised to calculate the current row is determined by this window. Usually, these functions are used to compute reporting, centring, cumulative, and moving aggregates.
Key Components of Analytical Functions
The window (or group) of rows that the function operates on is defined by the OVER
clause, which distinguishes analytical functions. A number of subclauses can be included in the OVER
clause:
- PARTITION BY Clause: This clause divides the query result set into groups based on one or more specified expressions. If
PARTITION BY
is omitted, the function treats all rows of the query result set as a single group. This is similar to theGROUP BY
clause for aggregate functions, but importantly,PARTITION BY
does not reduce the number of rows returned by the query. - ORDER BY Clause: This clause sorts the rows within each partition (or the entire result set if no
PARTITION BY
is used). The order of rows is crucial for many analytical functions, especially those that calculate rankings or rely on sequential processing, likeLAG
orLEAD
. - WINDOWING Clause: This optional clause further refines the window within each partition by specifying a physical or logical set of rows relative to the current row. It can be defined using
ROWS
(physical units) orRANGE
(logical offset) and typically uses keywords likeBETWEEN
,PRECEDING
,FOLLOWING
,UNBOUNDED PRECEDING
,UNBOUNDED FOLLOWING
, andCURRENT ROW
. If theWINDOWING
clause is omitted, the default isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
.
Execution Flow
Analysis follows the FROM
, WHERE
, GROUP BY
, and HAVING
clauses but precedes the ORDER BY
phrase. This allows you to filter analytical function results by nesting them in a subquery and filtering that subquery.
Instances Apart From Aggregates
A practical example will demonstrate analytical functions. The EMPLOYEE_SALES
table will contain employee sales data, including department, date, and amount.
CREATE TABLE EMPLOYEE_SALES (
EMPLOYEE_ID NUMBER(6) PRIMARY KEY,
FIRST_NAME VARCHAR2(20) NOT NULL,
LAST_NAME VARCHAR2(25) NOT NULL,
DEPARTMENT VARCHAR2(30),
SALES_DATE DATE NOT NULL,
SALES_AMOUNT NUMBER(10, 2) NOT NULL
);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (101, 'John', 'Doe', 'Electronics', TO_DATE('2023-01-15', 'YYYY-MM-DD'), 1200.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (102, 'Jane', 'Smith', 'Electronics', TO_DATE('2023-01-20', 'YYYY-MM-DD'), 1500.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (103, 'Peter', 'Jones', 'Clothing', TO_DATE('2023-01-10', 'YYYY-MM-DD'), 800.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (104, 'Mary', 'Brown', 'Electronics', TO_DATE('2023-02-01', 'YYYY-MM-DD'), 1300.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (105, 'David', 'Green', 'Clothing', TO_DATE('2023-02-05', 'YYYY-MM-DD'), 950.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (106, 'Susan', 'White', 'Electronics', TO_DATE('2023-02-10', 'YYYY-MM-DD'), 1600.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (107, 'Chris', 'Black', 'Books', TO_DATE('2023-01-25', 'YYYY-MM-DD'), 600.00);
INSERT INTO EMPLOYEE_SALES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT, SALES_DATE, SALES_AMOUNT) VALUES (108, 'Anna', 'King', 'Books', TO_DATE('2023-02-15', 'YYYY-MM-DD'), 750.00);
COMMIT;
Analytical Functions
Calculating Cumulative Sales by Department
We would like to view the sales of each employee as well as the running total of sales in their department for each date of sale. The usage of SUM
as an analytical function is demonstrated here.
SELECT
EMPLOYEE_ID,
FIRST_NAME,
LAST_NAME,
DEPARTMENT,
SALES_DATE,
SALES_AMOUNT,
SUM(SALES_AMOUNT) OVER (PARTITION BY DEPARTMENT ORDER BY SALES_DATE) AS CUMULATIVE_DEPARTMENT_SALES
FROM
EMPLOYEE_SALES
ORDER BY
DEPARTMENT, SALES_DATE;
Output:
EMPLOYEE_ID FIRST_NAME LAST_NAME DEPARTMENT SALES_DATE SALES_AMOUNT CUMULATIVE_DEPARTMENT_SALES
----------- ---------- ---------- ------------ ---------- ------------ ---------------------------
107 Chris Black Books 2023-01-25 600.00 600.00
108 Anna King Books 2023-02-15 750.00 1350.00
103 Peter Jones Clothing 2023-01-10 800.00 800.00
105 David Green Clothing 2023-02-05 950.00 1750.00
101 John Doe Electronics 2023-01-15 1200.00 1200.00
102 Jane Smith Electronics 2023-01-20 1500.00 2700.00
104 Mary Brown Electronics 2023-02-01 1300.00 4000.00
106 Susan White Electronics 2023-02-10 1600.00 5600.00
Each sales record is displayed in detail in this output, along with a new column called CUMULATIVE_DEPARTMENT_SALES
that gives a running total for each department as opposed to a single total for all departments.
Ranking Employees by Sales Amount within Each Department
We wish to rate each worker according to the quantity of sales they make in their department. In order to award consecutive rankings without gaps in the event of a tie, we will employ DENSE_RANK.
SELECT
EMPLOYEE_ID,
FIRST_NAME,
LAST_NAME,
DEPARTMENT,
SALES_DATE,
SALES_AMOUNT,
DENSE_RANK() OVER (PARTITION BY DEPARTMENT ORDER BY SALES_AMOUNT DESC) AS RANK_IN_DEPARTMENT
FROM
EMPLOYEE_SALES
ORDER BY
DEPARTMENT, RANK_IN_DEPARTMENT;
Output:
EMPLOYEE_ID FIRST_NAME LAST_NAME DEPARTMENT SALES_DATE SALES_AMOUNT RANK_IN_DEPARTMENT
----------- ---------- ---------- ------------ ---------- ------------ ------------------
108 Anna King Books 2023-02-15 750.00 1
107 Chris Black Books 2023-01-25 600.00 2
105 David Green Clothing 2023-02-05 950.00 1
103 Peter Jones Clothing 2023-01-10 800.00 2
106 Susan White Electronics 2023-02-10 1600.00 1
102 Jane Smith Electronics 2023-01-20 1500.00 2
104 Mary Brown Electronics 2023-02-01 1300.00 3
101 John Doe Electronics 2023-01-15 1200.00 4
The sales rank for each employee within their respective department is displayed here by RANK_IN_DEPARTMENT
, once more without lowering the amount of rows from the original table.
Comparing Current Sales to Previous Sales
We wish to view the sales of each employee as well as the sales amount of the employee in the same department who made a sale right before them. This illustrates how the LAG
function retrieves information from a previous row in the window.
SELECT
EMPLOYEE_ID,
FIRST_NAME,
LAST_NAME,
DEPARTMENT,
SALES_DATE,
SALES_AMOUNT,
LAG(SALES_AMOUNT, 1, 0) OVER (PARTITION BY DEPARTMENT ORDER BY SALES_DATE) AS PREVIOUS_SALES_AMOUNT
FROM
EMPLOYEE_SALES
ORDER BY
DEPARTMENT, SALES_DATE;
Output:
EMPLOYEE_ID FIRST_NAME LAST_NAME DEPARTMENT SALES_DATE SALES_AMOUNT PREVIOUS_SALES_AMOUNT
----------- ---------- ---------- ------------ ---------- ------------ ---------------------
107 Chris Black Books 2023-01-25 600.00 0.00
108 Anna King Books 2023-02-15 750.00 600.00
103 Peter Jones Clothing 2023-01-10 800.00 0.00
105 David Green Clothing 2023-02-05 950.00 800.00
101 John Doe Electronics 2023-01-15 1200.00 0.00
102 Jane Smith Electronics 2023-01-20 1500.00 1200.00
104 Mary Brown Electronics 2023-02-01 1300.00 1500.00
106 Susan White Electronics 2023-02-10 1600.00 1300.00
SALES_DATE
determines PREVIOUS_SALES_AMOUNT
displays the sales amount from the row before the current row in the same department. Each department’s first sale returns 0
because there is no prior row.
Analytical functions offer detailed insights without aggregating the result set into a single summary record by performing row-level calculations on groups of rows. They simplify difficult reporting chores that would normally need multi-step queries or external processing by allowing sophisticated SQL analysis. Analytical operations in Oracle Database include statistics and ranking tools.