What Is the AVG Function in PostgreSQL? With Examples

AVG Function in PostgreSQL

AVG is a built-in aggregate function in PostgreSQL that calculates the arithmetic mean (average) of a set of values. Internally, AVG maintains a state value that it updates as each input row is processed; this state typically holds a running sum and a count. A state transition function is called for each non-NULL value in the aggregation group, adding the value to the running sum and incrementing the count. After all relevant rows have been processed, a final function divides the running sum by the count to produce the result.

Basic Usage as an Aggregate Function

AVG is a standard aggregate: it takes multiple input rows and returns a single result. Its basic syntax is AVG(expression). For instance, SELECT AVG(age) FROM individuals; returns the average age from a table called individuals. Similarly, SELECT avg(score) FROM tests; returns the average score from a table called tests. Function names are case-insensitive, so AVG and avg are equivalent.

AVG ignores NULL values during computation: they contribute to neither the sum nor the count. If every input value is NULL, or if there are no input rows at all, the result is NULL.
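The NULL handling described above can be checked with a small sketch. The inline VALUES lists and the column name score are illustrative assumptions, not tables from this tutorial.

```sql
-- Hypothetical inline data; NULL stands for a missing score.
SELECT avg(score)   AS avg_score,    -- (80 + 90) / 2 = 85; the NULL row is skipped
       count(score) AS scored_rows,  -- 2: counts only non-NULL scores
       count(*)     AS all_rows      -- 3: counts every row
FROM (VALUES (80), (NULL), (90)) AS t(score);

-- When every input is NULL, the result is NULL rather than 0.
SELECT avg(score) AS avg_score
FROM (VALUES (NULL::int), (NULL)) AS t(score);

-- To count missing scores as zero instead, substitute a value explicitly.
-- This changes the divisor: (80 + 0 + 90) / 3.
SELECT avg(coalesce(score, 0)) AS avg_with_zeros
FROM (VALUES (80), (NULL), (90)) AS t(score);
```

Whether a missing value should be skipped or treated as zero is a modelling decision; AVG only implements the first behaviour on its own.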

AVG accepts numeric and interval data types: smallint, integer, bigint, real, double precision, numeric, and interval. The return type depends on the input: numeric for any integer argument, double precision for a floating-point argument, and otherwise the same data type as the argument (numeric for numeric, interval for interval).
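One way to confirm the return types is to wrap each call in pg_typeof. The sketch below uses hypothetical inline data and column names (i, n, r, d); it also shows why AVG on integers differs from dividing SUM by COUNT by hand.

```sql
-- Result type of AVG for several input types.
SELECT pg_typeof(avg(i)) AS avg_of_integer,   -- numeric
       pg_typeof(avg(n)) AS avg_of_numeric,   -- numeric
       pg_typeof(avg(r)) AS avg_of_real,      -- double precision
       pg_typeof(avg(d)) AS avg_of_interval   -- interval
FROM (VALUES (1::int, 1.5::numeric, 1.5::real, interval '1 hour'),
             (2::int, 2.5::numeric, 2.5::real, interval '3 hours')) AS t(i, n, r, d);

-- Because AVG on integers returns numeric, the fractional part is kept.
-- sum(i) / count(i) is bigint / bigint, which truncates.
SELECT avg(i)            AS avg_value,          -- 1.5
       sum(i) / count(i) AS integer_division    -- 1
FROM (VALUES (1), (2)) AS t(i);
```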

Usage with GROUP BY

AVG is often combined with the GROUP BY clause to calculate the arithmetic mean separately for each group of rows.

The GROUP BY clause in a SELECT statement divides the input rows into groups that share the same values in the grouping columns or expressions. AVG then calculates one average per group, producing one result row per group. For example, to retrieve the average score for each course number, use SELECT c_no, count(*), count(DISTINCT s_id), avg(score) FROM exams GROUP BY c_no;. When grouping by multiple columns, rows fall into the same group only if they match in all of those columns. NULL values in a grouping column are treated as equal and placed together in one group.
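The query above can be tried on sample data. In this sketch the exams rows are invented for illustration, and the column types are assumptions; only the column names c_no, s_id, and score come from the text.

```sql
-- Hypothetical exam data supplied through a CTE, so no table has to exist.
WITH exams(c_no, s_id, score) AS (
    VALUES ('C101', 1, 80), ('C101', 2, 90), ('C101', 1, 70),
           ('C102', 1, 60), ('C102', 3, NULL),
           (NULL,   4, 50), (NULL,   5, 70)
)
SELECT c_no,
       count(*)             AS attempts,
       count(DISTINCT s_id) AS students,
       avg(score)           AS avg_score
FROM exams
GROUP BY c_no
ORDER BY c_no;
-- C101: 3 attempts, 2 students, average 80
-- C102: 2 attempts, 2 students, average 60 (the NULL score is ignored)
-- NULL: 2 attempts, 2 students, average 60 (both NULL course numbers share one group)
```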

When a query is grouped, the SELECT list can contain only aggregate function calls and expressions that appear in the GROUP BY clause (or columns functionally dependent on them, such as the other columns of a table whose primary key is in the GROUP BY list). Other columns cannot be referenced, since they would not have a single, consistent value for each group. Within each group AVG, like other aggregates, ignores NULL values and returns NULL only if all input values in the group are NULL. Without a GROUP BY clause, a query using AVG (or any aggregate function) implicitly forms a single group and calculates the average over all rows selected by the query. PostgreSQL allows grouping by columns of the SELECT list as well as by arbitrary value expressions.

Usage with HAVING

Combining AVG with the HAVING clause lets you filter groups by their calculated average.

WHERE and HAVING are fundamentally different. The WHERE clause filters input rows before groups and aggregates are computed, so it controls which rows enter the aggregation. The HAVING clause filters group rows after groups and aggregates such as AVG have been computed. Consequently, aggregate functions are not allowed in the WHERE clause, whereas HAVING conditions normally contain them.
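The difference between the two filters is easiest to see when both appear in one query. This is a minimal sketch with invented exams data; the thresholds 50 and 70 are arbitrary.

```sql
WITH exams(c_no, score) AS (
    VALUES ('C101', 80), ('C101', 90), ('C101', 30),
           ('C102', 55), ('C102', 65),
           ('C103', 20), ('C103', 95)
)
SELECT c_no, avg(score) AS avg_score
FROM exams
WHERE score >= 50          -- row filter: drops the 30 and the 20 before averaging
GROUP BY c_no
HAVING avg(score) >= 70    -- group filter: applied to the computed averages
ORDER BY c_no;
-- Returns C101 (average 85) and C103 (average 95); C102 (average 60) is filtered out.
-- Without the WHERE clause, no course would reach an average of 70.
```

Writing the condition as WHERE avg(score) >= 70 instead is rejected by PostgreSQL, because aggregates are not allowed in WHERE.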

For example, to find automobile models whose average age is lower than the overall average, compare AVG in the HAVING clause against a subquery that computes the overall average. Selecting states where the average age of friends is high is another use. HAVING also accepts aggregates over expressions: you can keep only the groups whose average screen resolution, AVG(CAST(browser->'resolution'->>'x' AS integer)), satisfies a given condition.
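Both patterns can be sketched as follows. The cars and visits data, their column names, and the 1500 threshold are hypothetical; the browser column is assumed to be jsonb.

```sql
-- Models whose average age is below the overall average.
WITH cars(model, age) AS (
    VALUES ('Alpha', 2), ('Alpha', 4),
           ('Beta', 10), ('Beta', 12),
           ('Gamma', 5), ('Gamma', 9)
)
SELECT model, avg(age) AS avg_age
FROM cars
GROUP BY model
HAVING avg(age) < (SELECT avg(age) FROM cars)   -- overall average is 7
ORDER BY model;
-- Returns only Alpha (average 3); Gamma's average of 7 is not below 7.

-- Groups filtered by an average taken over a JSON field.
WITH visits(country, browser) AS (
    VALUES ('DE', '{"resolution": {"x": 1920, "y": 1080}}'::jsonb),
           ('DE', '{"resolution": {"x": 1280, "y": 720}}'),
           ('FR', '{"resolution": {"x": 1366, "y": 768}}'),
           ('FR', '{"resolution": {"x": 1024, "y": 768}}')
)
SELECT country,
       avg(CAST(browser->'resolution'->>'x' AS integer)) AS avg_width
FROM visits
GROUP BY country
HAVING avg(CAST(browser->'resolution'->>'x' AS integer)) > 1500;
-- Returns DE (average width 1600); FR averages 1195.
```

Note that the JSON operators need straight single quotes around the key names.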

HAVING is usually paired with a GROUP BY clause. After GROUP BY forms the groups, AVG (or another aggregate function) produces a value for each group, and the HAVING clause evaluates a boolean expression against these group-level values to decide which groups appear in the final output. A query with aggregate function calls but no GROUP BY clause implicitly forms a single group, which the HAVING clause then keeps or discards. A HAVING clause without aggregates is allowed but rarely useful, because the same condition can be applied more efficiently in the WHERE clause.

Usage as a Window Function

Window functions are another way to calculate across related table rows. Unlike a standard aggregate, AVG used as a window function does not collapse rows into a single output row. Instead, it calculates over a “window” of rows related to the current row; each row keeps its identity, and the average is shown next to that row’s own data.

An OVER clause directly after the function name and arguments is what syntactically distinguishes a window function call. The OVER clause specifies the window of rows used for the calculation.

PARTITION BY: This subclause of OVER divides the rows into “partitions,” or logical groups, according to the values of the given expression or expressions. AVG is then calculated separately for each partition. For instance, avg(salary) OVER (PARTITION BY depname) returns the average salary of each department. If PARTITION BY is omitted, the whole result set forms a single partition.

ORDER BY: Inside the OVER clause, ORDER BY specifies the order of rows within each partition and also affects the “window frame.” With ORDER BY and the default frame, AVG produces a running average that covers the rows from the start of the partition through the current row, including any rows that tie with the current row in the ordering.

WINDOW clause: Window definitions can get lengthy in complex queries. The WINDOW clause of the SELECT statement lets you define and name a window once and reference it from several window functions, which avoids repetition and keeps the definitions consistent. Window functions that share the same PARTITION BY and ORDER BY are evaluated in a single pass over the data.

FILTER clause: Introduced in PostgreSQL 9.4, the FILTER clause restricts the rows fed to an aggregate to those that satisfy a condition. It also works with aggregates used as window functions. For conditional aggregation, it is often a shorter and possibly faster alternative to CASE WHEN expressions.
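How ORDER BY shapes the frame can be illustrated with a short sketch. The readings data is invented, and the two rows for day 2 are there on purpose to show how ties are handled.

```sql
WITH readings(day, value) AS (
    VALUES (1, 10), (2, 20), (2, 30), (3, 40)
)
SELECT day,
       value,
       -- Default frame with ORDER BY:
       -- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
       -- Rows that tie on "day" are peers, so both day-2 rows get the same average.
       avg(value) OVER (ORDER BY day) AS running_avg_default,
       -- Explicit ROWS frame: advances strictly one row at a time.
       avg(value) OVER (ORDER BY day, value
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_avg_rows
FROM readings
ORDER BY day, value;
-- running_avg_default: 10, 20, 20, 25
-- running_avg_rows:    10, 15, 20, 25
```

If a strict row-by-row running average is wanted, spelling out a ROWS frame is usually the safer choice.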

Code Example:

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    depname TEXT,
    salary INT,
    active BOOLEAN
);
INSERT INTO employees (depname, salary, active) VALUES
('HR', 4000, TRUE),
('HR', 4500, FALSE),
('IT', 5000, TRUE),
('IT', 5500, TRUE),
('Finance', 6000, TRUE),
('Finance', 6500, FALSE);

Output:

CREATE TABLE
INSERT 0 6
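With the employees table above in place, the window-function features can be tried out. The queries below are a sketch, not the only way to write them; the window name w and the output column aliases are arbitrary.

```sql
-- Department average next to each employee row (no rows are collapsed).
SELECT depname, salary,
       avg(salary) OVER (PARTITION BY depname) AS dept_avg
FROM employees
ORDER BY depname, salary;
-- dept_avg is 6250 for Finance, 4250 for HR, and 5250 for IT.

-- A named window reused by several window functions, one of them with FILTER.
SELECT depname, salary, active,
       avg(salary) OVER w                        AS dept_avg,
       salary - avg(salary) OVER w               AS diff_from_dept_avg,
       avg(salary) FILTER (WHERE active) OVER w  AS dept_avg_active
FROM employees
WINDOW w AS (PARTITION BY depname)
ORDER BY depname, salary;

-- FILTER in a grouped query; the CASE form gives the same result.
SELECT depname,
       avg(salary)                           AS avg_all,
       avg(salary) FILTER (WHERE active)     AS avg_active,
       avg(CASE WHEN active THEN salary END) AS avg_active_case
FROM employees
GROUP BY depname
ORDER BY depname;
-- avg_active is 6000 for Finance, 4000 for HR, and 5250 for IT.
```

The CASE variant works because a CASE without ELSE yields NULL for inactive rows, and AVG ignores NULLs.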

User-Defined Aggregate Functions

User-defined aggregate functions let you extend PostgreSQL beyond built-in aggregates such as AVG. Understanding how AVG works internally is a good starting point for writing custom aggregates.

As described above, AVG updates a state value as each input row is processed; this state typically holds the running sum and the count of the values seen so far. A state transition function is called for each non-NULL value in the aggregate group to add to the sum and increment the count. A final function then computes the return value from the state; for AVG, it divides the running sum by the count. AVG excludes NULL values from its calculation and returns NULL only if there are no non-NULL inputs.

User-defined aggregate functions are created with CREATE AGGREGATE. The main parameters are:

SFUNC (state transition function): Required. Called for each input value; it takes the current state and the new input and returns the next state.

STYPE (state data type): Required. The data type that holds the aggregate’s running state. For aggregates like AVG that track both a sum and a count, STYPE can be a two-element array.

FINALFUNC (final function): Optional. Computes the final result from the accumulated state. If omitted, the final state value is returned directly.

INITCOND (initial condition): Optional. Sets the initial value of the state. If omitted, the state starts out NULL, which affects how the transition function must handle its first call and NULL inputs.

For instance, a sum aggregate for a custom complex number type would need an SFUNC (e.g., complex_add), complex as the STYPE, and '(0,0)' as the INITCOND. A custom AVG-like aggregate needs an SFUNC that accumulates the sum and count, an array as the STYPE, and a FINALFUNC that performs the division. PostgreSQL also supports polymorphic state transition and final functions, so the same functions can serve several aggregates or input types.
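An AVG-like aggregate along these lines might look like the sketch below. The names my_avg, my_avg_accum, and my_avg_final are hypothetical, and this is a simplified illustration for double precision input, not how the built-in AVG is implemented.

```sql
-- Transition function: state is {running_sum, count}.
-- STRICT means it is not called for NULL inputs, so NULLs are skipped.
CREATE FUNCTION my_avg_accum(state float8[], val float8)
RETURNS float8[]
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT ARRAY[state[1] + val, state[2] + 1] $$;

-- Final function: divide sum by count, returning NULL when nothing was counted.
CREATE FUNCTION my_avg_final(state float8[])
RETURNS float8
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT CASE WHEN state[2] = 0 THEN NULL ELSE state[1] / state[2] END $$;

CREATE AGGREGATE my_avg(float8) (
    SFUNC     = my_avg_accum,
    STYPE     = float8[],
    FINALFUNC = my_avg_final,
    INITCOND  = '{0,0}'
);

-- Usage: the custom aggregate and the built-in AVG both return 3 here.
SELECT my_avg(x::float8) AS custom_avg,
       avg(x)            AS builtin_avg
FROM (VALUES (1), (2), (NULL), (6)) AS t(x);

-- Cleanup, if the objects are no longer needed:
-- DROP AGGREGATE my_avg(float8);
-- DROP FUNCTION my_avg_final(float8[]);
-- DROP FUNCTION my_avg_accum(float8[], float8);
```

Using a non-NULL INITCOND together with a strict transition function keeps the NULL handling simple: the state always exists, and NULL inputs never reach the function.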

Like built-in aggregates, user-defined aggregates can be used as window functions by adding an OVER clause, which lets them calculate over a “window” of rows without collapsing them. The FILTER clause also works with user-defined aggregates, conditionally including rows in the calculation and often improving readability. Implementation is flexible because the support functions can be written in SQL, PL/pgSQL, C, PL/Python, and other supported languages.

Performance Considerations

Query planning, database configuration, and data organisation all have a substantial impact on the speed of AVG queries in PostgreSQL. The cost-based query planner, which tries to pick the most efficient execution plan, depends on accurate statistics gathered by the ANALYZE command (or by autovacuum); out-of-date statistics can lead to suboptimal plans, particularly through poor join selectivity and row-count estimates.
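A quick way to see what the planner does with an AVG query is to refresh the statistics and inspect the plan. The table scores_demo and its contents are hypothetical, generated only for this sketch; the plan you get will depend on your version and settings.

```sql
-- Hypothetical test data: 100,000 scores spread over 100 course numbers.
CREATE TEMP TABLE scores_demo AS
SELECT (g % 100)              AS c_no,
       (random() * 100)::int  AS score
FROM generate_series(1, 100000) AS g;

-- Refresh planner statistics. Temporary tables are not processed by
-- autovacuum, so they need a manual ANALYZE.
ANALYZE scores_demo;

-- Show the chosen plan together with actual row counts and timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT c_no, avg(score)
FROM scores_demo
GROUP BY c_no;
```

Comparing the estimated and actual row counts in the output is a simple check on whether the statistics are up to date.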

Memory settings such as work_mem, which limits the memory each sort or hash operation may use before spilling to disk, can be tuned to improve performance further. PostgreSQL supports sort-based and hash-based aggregation strategies and chooses between them based on cost estimates, including the estimated number of distinct groups.

Modern PostgreSQL versions can use parallel query execution to split AVG into two phases: partial aggregation by several background workers, followed by final aggregation in the leader process. This is not supported, however, for ordered-set aggregates or for aggregate calls that contain DISTINCT or ORDER BY. Just-In-Time (JIT) compilation, available since PostgreSQL 11, can speed up expression evaluation, including aggregates, in CPU-bound queries by generating native code.
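The settings mentioned in this section can be tried at session level. This is a sketch under assumptions: the table avg_parallel_demo is hypothetical, the values 64MB and 4 are arbitrary, and whether a parallel or JIT-compiled plan is actually chosen depends on table size, cost settings, and how the server was built.

```sql
-- Hypothetical regular table (scans of temporary tables are not parallelised).
CREATE TABLE avg_parallel_demo AS
SELECT g AS id, (random() * 100)::int AS score
FROM generate_series(1, 2000000) AS g;
ANALYZE avg_parallel_demo;

-- Session-level settings; they do not affect other connections.
SET work_mem = '64MB';
SET max_parallel_workers_per_gather = 4;
SET jit = on;

-- If a parallel plan is chosen, it contains Partial Aggregate nodes under
-- a Gather node, with a Finalize Aggregate node on top.
EXPLAIN (ANALYZE)
SELECT avg(score) FROM avg_parallel_demo;

-- With DISTINCT inside the aggregate, the partial/finalize split is not expected.
EXPLAIN (ANALYZE)
SELECT avg(DISTINCT score) FROM avg_parallel_demo;

-- Restore defaults and clean up.
RESET work_mem;
RESET max_parallel_workers_per_gather;
RESET jit;
DROP TABLE avg_parallel_demo;
```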

DBeaver Support

DBeaver, a widely used database management tool, works with PostgreSQL aggregate functions such as AVG in several ways:

SQL Editor: DBeaver’s SQL editor lets you write and execute SQL queries that use AVG, including complex GROUP BY and HAVING queries that filter groups by their calculated averages. For example, a query can calculate an “Average screen resolution” using AVG(CAST(browser->'resolution'->>'x' AS integer)). DBeaver connects to a wide range of PostgreSQL versions, so these statements run as they would in any other client.

Visual Query Builder: DBeaver’s Visual Query Builder lets you build queries graphically. You can apply AVG and other aggregate functions such as COUNT, MAX, MIN, and SUM to selected columns without writing SQL by hand, which simplifies building aggregated queries.

Grouping Panel: The Grouping Panel in DBeaver’s Data Editor aggregates data interactively. The default function is COUNT, but you can apply other aggregation functions such as AVG to the grouped results, which supports interactive data analysis and summarisation.

Charts Feature: DBeaver can chart the results of SELECT queries, including AVG queries, from the SQL Editor, Data Editor, or Grouping Panel. Query results can be shown as bar, line, or pie charts to visualise averages.

In summary, AVG is PostgreSQL’s built-in aggregate function for the arithmetic mean. NULL values are ignored in its calculations. With a GROUP BY clause, AVG calculates the average for each group of rows. The HAVING clause then filters these groups by conditions on aggregate functions, such as keeping groups with an average age over 25 or a positive estimated slope (REGR_SLOPE). DBeaver can execute such aggregate queries and display their results.

Kowsalya
Hi, I'm Kowsalya, a B.Com graduate currently working as an Author at Govindhtech Solutions. I'm deeply passionate about publishing the latest tech news and tutorials that bring insightful updates to readers. I enjoy creating step-by-step guides and making complex topics easier to understand for everyone.