Page Content

Tutorials

Understanding HAVING Clause in PostgreSQL with Example

HAVING Clause in PostgreSQL

In PostgreSQL, the GROUP BY clause aggregates the data, and then the HAVING clause filters groups of rows according to a defined criterion. It differs from the WHERE clause primarily in that HAVING filters the results of those groups after aggregation has occurred, while WHERE filters individual rows prior to grouping and aggregation. As a result, you can apply conditions directly to aggregated data (e.g., HAVING count(*) > 5 or HAVING sum(p.price * s.units) > 5000), which is not feasible in a WHERE clause. This means that aggregate functions are allowed and frequently utilised within the HAVING clause.

Usually, in a SELECT statement, the HAVING clause comes after the GROUP BY clause. The logical order of PostgreSQL query processing is to take rows from tables, apply WHERE filters, then create groups using GROUP BY, and then apply HAVING conditions to these groups to remove those that don’t meet the criteria.

Although a HAVING clause is frequently used in conjunction with GROUP BY, it is technically possible for a query to contain one without explicit GROUP BY or aggregate functions. In these situations, the entire result set is regarded as a single group, and the HAVING condition is applied to that single group. When you need to limit output based on summary calculations, like when you need to identify car models with a total number of automobiles larger than five, this clause is crucial.

Key Distinction: HAVING vs. WHERE

Writing effective SQL queries in PostgreSQL requires an understanding of the distinction between HAVING and WHERE:

WHERE Clause: Used prior to any aggregate computations and the GROUP BY clause. Reduces the number of rows in the base table or the output of the JOIN and FROM clauses. Aggregate functions are not allowed. Usually, the FROM clause’s conditions employ columns straight from the tables. Because fewer rows need to be processed by GROUP BY and aggregate functions, it is more effective at filtering non-aggregated data.

HAVING Clause: Used following the GROUP BY clause and following the computation of each group’s results by aggregate functions. Filtering complete sets of rows is the scope. Its conditions usually include aggregate functions. Non-aggregate expressions are likewise acceptable as long as they are included in the GROUP BY clause. Efficiency is crucial for filtering based on results that have been pooled. Because WHERE filters rows earlier in the query processing pipeline, it is often more efficient when used for conditions that do not require aggregates, however it is still possible to include such criteria.

For instance, the state filter (state = ‘NY’, for instance) and the average age filter (AVG(age) > 30) would go in the WHERE clause and the HAVING clause, respectively, if you wanted to determine the average age of persons from a particular state and then only display states where that average age is above 30.

Examples:

Let’s examine a few real-world instances of the HAVING clause:

Counting records per group: Assuming you have a table of friends and wish to see whether states have many friends registered, you can count the records for each group. Friends are first grouped by state in this query. Only states where the COUNT(*) (number of friends) is larger than one are displayed once these groupings are filtered by the HAVING clause.

Filtering based on maximum value in a group: Weather table that has columns for city and temp_lo. City with the lowest recorded temperature below 40 degrees. Here, weather items are grouped by city, MAX(temp_lo) is calculated, and only cities with a maximum low temperature below 40 are retained by the HAVING clause.

Combining WHERE and HAVING: To filter data at different stages, use both WHERE and HAVING in the same query. In the last four weeks, identify car models with gross sales over 5,000. The WHERE clause in this instance initially limits the rows to sales that have occurred in the previous four weeks. These filtered rows are then aggregated using GROUP using, and the resulting groups are further filtered by HAVING according to their total gross sales.

Simpler alternative to subqueries: In certain cases, obtaining results that might otherwise call for a subquery can be accomplished more succinctly and readable by using the HAVING clause. For instance, to identify car models with more than five vehicles overall. This is frequently chosen over a nested subquery that computes counts first, followed by an outer query filtering them.

Code Example:

-- Create a sample table
CREATE TABLE people (
    id SERIAL PRIMARY KEY,
    name TEXT,
    state TEXT,
    age INT
);
INSERT INTO people (name, state, age) VALUES
('Alice', 'NY', 25),
('Bob', 'NY', 35),
('Charlie', 'NY', 40),
('David', 'CA', 28),
('Eva', 'CA', 32),
('Frank', 'TX', 45);
SELECT state, AVG(age) AS avg_age
FROM people
WHERE state IN ('NY', 'CA')      
GROUP BY state
HAVING AVG(age) > 30;             

Output:

CREATE TABLE
INSERT 0 6
state | avg_age
-------+---------------------
NY | 33.3333333333333333
(1 row)

Interaction with Other Clauses

GROUP BY: GROUP BY and the HAVING clause are inextricably intertwined. It is intended to work on the combined output that GROUP BY produces. PostgreSQL handles the complete result set as a single group for aggregation if a query has aggregate functions but no GROUP BY clause. Any HAVING clause will then apply to this single group.

ORDER BY: This clause usually follows HAVING and can be used to arrange the final result set, including sorting by aggregate values.

SELECT list: Grouped expressions or aggregate functions of non-grouped expressions may be referenced by columns or expressions in the condition of the HAVING clause. SELECT lists frequently include aggregation functions and grouping columns.

Conclusion

In conclusion, PostgreSQL HAVING clause is an effective tool for filtering aggregated or grouped data following calculations. HAVING applies conditions to complete groups, allowing the use of aggregate functions like COUNT, SUM, or AVG directly within its conditions, in contrast to the WHERE clause, which filters individual rows before grouping. Although it can function without the GROUP BY clause by treating the entire result as a single group, it works closely with it. HAVING makes difficult searches simpler, eliminates the need for subqueries, and offers effective, straightforward methods for deriving insights from aggregated data by providing exact filtering on summarised data.

Kowsalya
Kowsalya
Hi, I'm Kowsalya a B.Com graduate and currently working as an Author at Govindhtech Solutions. I'm deeply passionate about publishing the latest tech news and tutorials that bringing insightful updates to readers. I enjoy creating step-by-step guides and making complex topics easier to understand for everyone.
Index