What Is the MIN Function in PostgreSQL

MIN Function in PostgreSQL

MIN is a common PostgreSQL aggregate function that returns the minimum of a set of input values. It is available for any numeric, string, date/time, or enum type, as well as inet, interval, money, oid, pg_lsn, tid, xid8, and arrays of any of these types. MIN considers only non-null input values and returns NULL only if every input value in the aggregation group is NULL. Besides serving as a regular aggregate that returns a single minimum per GROUP BY group, MIN can be used as a window function by adding an OVER clause.
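As a small illustration of the type coverage described above, the following sketch applies MIN to a few different types. The inline VALUES lists are made-up sample data, not part of any real schema.

```sql
-- MIN over several data types; each subquery aggregates its own tiny,
-- hypothetical VALUES list.
SELECT
    (SELECT MIN(x) FROM (VALUES (INTERVAL '2 days'), (INTERVAL '36 hours')) AS t(x)) AS min_interval, -- the 36-hour interval
    (SELECT MIN(x) FROM (VALUES (INET '10.0.0.7'), (INET '10.0.0.2')) AS t(x))       AS min_inet,     -- 10.0.0.2
    (SELECT MIN(x) FROM (VALUES (ARRAY[1, 5]), (ARRAY[1, 2, 9])) AS t(x))            AS min_array,    -- {1,2,9}: arrays compare element by element
    (SELECT MIN(x) FROM (VALUES (DATE '2024-03-01'), (DATE '2023-11-15')) AS t(x))   AS min_date;     -- 2023-11-15
```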

As a window function, MIN calculates the minimum across a defined “window” of rows without collapsing them, returning the result on every row, which is handy for running minimums. PostgreSQL also provides a FILTER clause for MIN and other aggregates that restricts which rows are passed to the aggregate function, keeping unwanted inputs (for example, zero values that would later serve as denominators) out of the calculation. B-tree indexes, which keep values sorted and serve equality and range queries, can improve MIN query performance because the smallest value sits at one end of the index, but for small datasets a sequential scan may be cheaper.

Handling NULL Values

In PostgreSQL, NULL is different from an empty string: it signifies unknown, missing, or inapplicable data. This matters because SQL uses three-valued logic (true, false, unknown), in which NULL represents “unknown”. Most SQL operators return NULL when given a NULL input, so direct comparisons such as NULL = NULL or a = NULL evaluate to NULL (unknown) rather than true or false, and a WHERE clause never selects rows on the basis of such a comparison. The special operators IS NULL and IS NOT NULL test for NULL values, while IS DISTINCT FROM and IS NOT DISTINCT FROM compare values and treat NULL as an ordinary, comparable value.
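The comparison rules above can be checked directly. This is a minimal sketch; the CTE named sample and its column a are hypothetical.

```sql
-- How comparisons involving NULL evaluate.
SELECT
    NULL = NULL                     AS eq_null,        -- NULL (unknown)
    NULL IS NULL                    AS is_null,        -- true
    NULL IS NOT DISTINCT FROM NULL  AS not_distinct,   -- true
    1 IS DISTINCT FROM NULL         AS distinct_from,  -- true
    '' IS NULL                      AS empty_is_null;  -- false: an empty string is not NULL

-- "a = NULL" is never true, so this query returns no rows.
WITH sample(a) AS (VALUES (1), (NULL), (3))
SELECT a FROM sample WHERE a = NULL;

-- IS NULL is the correct test; this query returns the single NULL row.
WITH sample(a) AS (VALUES (1), (NULL), (3))
SELECT a FROM sample WHERE a IS NULL;
```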

MIN, MAX, SUM, and AVG ignore NULL values and return NULL only if all input values in the aggregation group are NULL. COUNT(*) returns the total number of input rows (including rows that contain NULLs), while COUNT(expression) returns the number of rows for which the expression is not NULL. PostgreSQL also has NULL-handling functions: COALESCE returns the first non-NULL argument in its list, and NULLIF returns NULL if its two arguments are equal and the first argument otherwise. A scalar subquery that returns no rows yields NULL.
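A short sketch of these rules, using a hypothetical readings data set in which sensor b has only NULL temperatures:

```sql
WITH readings(sensor, temp) AS (
    VALUES ('a', 12.5), ('a', NULL), ('b', NULL), ('b', NULL)
)
SELECT
    sensor,
    MIN(temp)               AS min_temp,       -- a: 12.5, b: NULL (all inputs are NULL)
    COUNT(*)                AS all_rows,       -- a: 2,    b: 2
    COUNT(temp)             AS non_null_rows,  -- a: 1,    b: 0
    COALESCE(MIN(temp), 0)  AS min_or_zero,    -- a: 12.5, b: 0
    NULLIF(COUNT(temp), 0)  AS count_or_null   -- a: 1,    b: NULL
FROM readings
GROUP BY sensor
ORDER BY sensor;

-- A scalar subquery that produces no rows evaluates to NULL.
SELECT (SELECT 1 WHERE false) AS empty_subquery;
```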

Aggregate vs. Window Function

MIN has two principal uses: as a regular aggregate function and as a window function.

As a Standard Aggregate Function: Without an OVER clause, MIN processes a set of rows and returns a single result row. It is often combined with the GROUP BY clause to find the lowest value in each category. Example: this query returns each department’s minimum salary: SELECT department, MIN(salary) FROM workforce GROUP BY department.
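The grouped form can be tried without creating a table by supplying rows inline. The rows and the salary column below are illustrative assumptions; only the workforce and department names come from the text above.

```sql
-- Hypothetical sample rows standing in for the workforce table.
WITH workforce(employee, department, salary) AS (
    VALUES
        ('Ann', 'Sales', 52000),
        ('Bob', 'Sales', 48000),
        ('Dee', 'Ops',   61000),
        ('Eli', 'Ops',   58000)
)
SELECT department, MIN(salary) AS min_salary
FROM workforce
GROUP BY department
ORDER BY department;
-- Ops   | 58000
-- Sales | 48000
```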

As a Window Function: With an OVER clause, MIN acts as a window function. Instead of collapsing rows, it computes the minimum over a “window” of rows related to the current row and returns that value on every output row. PARTITION BY divides the dataset into partitions, while ORDER BY arranges the rows within each partition.

The OVER clause defines the window. By default, window functions such as lag and first_value follow the SQL standard’s RESPECT NULLS behaviour, so rows with NULL values remain part of the window; MIN itself still ignores NULL inputs when computing its result. Windowed MIN is useful for finding a running minimum or for comparing each row’s value with the minimum of its partition. Any built-in or user-defined aggregate function, including MIN, can be used as a window function by adding an OVER clause.
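A per-partition minimum and a running minimum can be sketched as follows. The rows, the hired column, and the salary column are assumptions made for illustration.

```sql
WITH workforce(employee, department, hired, salary) AS (
    VALUES
        ('Ann', 'Sales', DATE '2020-01-10', 52000),
        ('Bob', 'Sales', DATE '2021-03-01', 48000),
        ('Cy',  'Sales', DATE '2022-07-15', NULL),   -- NULL salary: the row stays, MIN skips the value
        ('Dee', 'Ops',   DATE '2019-05-20', 61000),
        ('Eli', 'Ops',   DATE '2023-02-02', 58000)
)
SELECT
    employee,
    department,
    salary,
    -- Minimum over the whole department, repeated on every row.
    MIN(salary) OVER (PARTITION BY department)                AS dept_min,
    -- Adding ORDER BY makes the default frame end at the current row,
    -- which yields a running minimum by hire date.
    MIN(salary) OVER (PARTITION BY department ORDER BY hired) AS running_min
FROM workforce
ORDER BY department, hired;
-- Dee | Ops   | 61000 | 58000 | 61000
-- Eli | Ops   | 58000 | 58000 | 58000
-- Ann | Sales | 52000 | 48000 | 52000
-- Bob | Sales | 48000 | 48000 | 48000
-- Cy  | Sales | NULL  | 48000 | 48000
```

All five input rows appear in the output, in contrast to the GROUP BY form, which would return one row per department.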

The FILTER Clause

PostgreSQL’s FILTER clause enables per-aggregate filtering of rows. It was introduced in PostgreSQL 9.4 and is defined in the SQL standard. It specifies a condition that determines which input rows are passed to a particular aggregate function before the aggregate is computed. This sets it apart from the WHERE clause, which filters rows for the entire query before any grouping or aggregation, and from the HAVING clause, which filters groups after aggregation and usually contains aggregate functions.
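The three filtering stages can be seen side by side in one query. The orders data below is hypothetical.

```sql
WITH orders(customer, status, amount) AS (
    VALUES
        ('acme', 'paid',     120),
        ('acme', 'refunded',  15),
        ('acme', 'paid',      80),
        ('zeta', 'paid',       0),
        ('zeta', 'paid',     200),
        ('zeta', 'refunded',  90)
)
SELECT
    customer,
    MIN(amount)                                AS min_amount,  -- sees every row that passed WHERE
    MIN(amount) FILTER (WHERE status = 'paid') AS min_paid     -- sees only paid rows
FROM orders
WHERE amount > 0            -- row filter for the whole query, applied before grouping
GROUP BY customer
HAVING MIN(amount) < 50;    -- group filter, applied after aggregation
-- acme | 15 | 80
```

The zeta group is dropped by HAVING because its smallest remaining amount is 90, while FILTER only changes what one aggregate sees.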

One of the main advantages of the FILTER clause is that it offers a more succinct and often more effective substitute for CASE WHEN expressions inside aggregate functions. With aggregates like array_agg, for instance, CASE WHEN can leave unwanted NULL values in the output; the FILTER clause avoids this and yields cleaner results. The FILTER clause works with all aggregate functions, including user-defined ones, and applies only to the aggregate it is attached to: only rows that satisfy its condition contribute to that aggregate, while other aggregates in the same query can operate on a different filtered or unfiltered set of rows.
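The difference between the two styles shows up most clearly with array_agg. This is a sketch over made-up orders rows.

```sql
WITH orders(customer, status, amount) AS (
    VALUES
        ('acme', 'paid',     120),
        ('acme', 'refunded',  15),
        ('acme', 'paid',      80)
)
SELECT
    customer,
    -- For MIN both forms agree, because MIN ignores the NULLs that CASE produces.
    MIN(amount) FILTER (WHERE status = 'paid')            AS min_paid_filter,  -- 80
    MIN(CASE WHEN status = 'paid' THEN amount END)        AS min_paid_case,    -- 80
    -- array_agg keeps NULLs, so the CASE form leaves a NULL element
    -- for the refunded row while the FILTER form omits that row.
    array_agg(amount) FILTER (WHERE status = 'paid')      AS paid_filter,
    array_agg(CASE WHEN status = 'paid' THEN amount END)  AS paid_case
FROM orders
GROUP BY customer;
```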

Performance Considerations

PostgreSQL performance is a complex topic that depends on database design, query optimisation, and ongoing maintenance. At its core, PostgreSQL uses a cost-based query planner that estimates CPU and I/O costs and chooses the execution plan with the lowest estimated cost. The EXPLAIN command lets developers examine these plans and understand the planner’s choices; EXPLAIN ANALYZE additionally executes the query and reports actual run times and row counts.

Indexing is a basic optimisation method, and B-tree indexes (the default index type) work well for equality and range queries on a variety of data types. Indexes on frequently searched columns and foreign keys can greatly speed up data retrieval by enabling index scans instead of slower sequential scans, although a sequential scan may still be more efficient for small datasets because of its lower overhead.
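One way to see how an index affects a MIN query is to compare plans on a throwaway table. The table and index names below are hypothetical, and the plan actually chosen depends on the PostgreSQL version, table size, and statistics, so no plan output is shown.

```sql
-- Temporary table with 100,000 random readings (dropped at session end).
CREATE TEMP TABLE measurements AS
SELECT g AS id, (random() * 1000)::int AS reading
FROM generate_series(1, 100000) AS g;

-- Plan without an index on reading: expect an aggregate over a sequential scan.
EXPLAIN SELECT MIN(reading) FROM measurements;

-- B-tree index (the default type) on the aggregated column.
CREATE INDEX measurements_reading_idx ON measurements (reading);
ANALYZE measurements;

-- With the index, the planner can consider reading just the first
-- non-NULL index entry instead of scanning the whole table.
EXPLAIN SELECT MIN(reading) FROM measurements;

-- EXPLAIN ANALYZE runs the query and adds actual timings and row counts.
EXPLAIN ANALYZE SELECT MIN(reading) FROM measurements;
```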

Performance can be further improved for particular query patterns with advanced indexing options such as partial indexes, which index a subset of rows, and indexes on expressions, which index computed values. Covering indexes and index-only scans can significantly increase query speed by minimising expensive access to the main table data (the “heap”). Every additional index, however, adds overhead to INSERT and UPDATE operations. Current PostgreSQL versions also provide JIT compilation, which speeds up expression evaluation and tuple deforming and is especially useful for long-running CPU-bound analytical queries, and Parallel Query, which lets a query use multiple CPUs to process large datasets faster.
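The index variants mentioned above might be declared as in the sketch below. The orders table, its columns, and the index names are hypothetical; the INCLUDE clause requires PostgreSQL 11 or later.

```sql
CREATE TEMP TABLE orders (
    id       bigint,
    customer text,
    status   text,
    amount   numeric
);

-- Partial index: only rows with status = 'paid' are indexed.
CREATE INDEX orders_paid_amount_idx ON orders (amount) WHERE status = 'paid';

-- Expression index: indexes the computed value lower(customer).
CREATE INDEX orders_lower_customer_idx ON orders (lower(customer));

-- Covering index: amount is stored in the index so that some queries
-- can be answered by an index-only scan.
CREATE INDEX orders_customer_incl_idx ON orders (customer) INCLUDE (amount);

-- A query whose WHERE clause matches the partial index predicate is a
-- candidate for using it; EXPLAIN shows what the planner actually picks.
EXPLAIN SELECT MIN(amount) FROM orders WHERE status = 'paid';
```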

Optimising SQL queries involves careful use of the WHERE, ORDER BY, and LIMIT clauses, as well as the FILTER clause with aggregate functions, which is often a more effective option than CASE WHEN. It can also be important to understand the performance implications of Common Table Expressions (CTEs) and to rewrite subqueries (for example, NOT IN as LEFT JOIN or NOT EXISTS). From the standpoint of database design, table partitioning, especially when combined with constraint exclusion, can significantly increase query speed for very large tables by restricting scans to the relevant partitions. Selecting the right data type also affects overall efficiency by striking a balance between extensibility and storage.
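The NOT IN rewrite mentioned above matters for correctness as well as speed, because NOT IN behaves surprisingly when the subquery returns a NULL. The customers and orders rows in this sketch are made up.

```sql
-- NOT IN: the NULL in orders makes the predicate unknown for every
-- customer without a match, so this returns no rows.
WITH customers(id) AS (VALUES (1), (2), (3)),
     orders(customer_id) AS (VALUES (1), (NULL))
SELECT c.id
FROM customers AS c
WHERE c.id NOT IN (SELECT customer_id FROM orders);

-- NOT EXISTS: returns customers 2 and 3, as usually intended.
WITH customers(id) AS (VALUES (1), (2), (3)),
     orders(customer_id) AS (VALUES (1), (NULL))
SELECT c.id
FROM customers AS c
WHERE NOT EXISTS (
    SELECT 1 FROM orders AS o WHERE o.customer_id = c.id
);

-- LEFT JOIN form: also returns customers 2 and 3.
WITH customers(id) AS (VALUES (1), (2), (3)),
     orders(customer_id) AS (VALUES (1), (NULL))
SELECT c.id
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
WHERE o.customer_id IS NULL;
```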

User-Defined Aggregates

In addition to the usual built-in aggregates like MIN, MAX, SUM, and AVG, PostgreSQL lets users define their own aggregate functions. These user-defined aggregates are built around a “state value” that is updated with each input row. A new aggregate is defined with the CREATE AGGREGATE command, which specifies its essential elements, such as the state data type and the state transition function.
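The state-value idea can be sketched with a MIN-like aggregate that ignores zero and negative inputs. The names min_positive and min_positive_step are hypothetical, and the objects are created in the current schema, so the example ends by dropping them.

```sql
-- State transition function: receives the current state and the next
-- input value, and returns the new state. It is not STRICT, so it also
-- sees the initial NULL state and any NULL inputs.
CREATE FUNCTION min_positive_step(state numeric, next_value numeric)
RETURNS numeric
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT CASE
        WHEN next_value IS NULL OR next_value <= 0 THEN state       -- skip NULL and non-positive input
        WHEN state IS NULL OR next_value < state   THEN next_value  -- new minimum
        ELSE state
    END
$$;

-- The aggregate ties the state type to the transition function.
-- With no INITCOND, the state starts out as NULL.
CREATE AGGREGATE min_positive(numeric) (
    SFUNC = min_positive_step,
    STYPE = numeric
);

SELECT min_positive(x::numeric) AS smallest_positive
FROM (VALUES (5), (-2), (0), (3), (NULL)) AS t(x);
-- 3

-- Clean up the example objects.
DROP AGGREGATE min_positive(numeric);
DROP FUNCTION min_positive_step(numeric, numeric);
```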

The support functions of user-defined aggregates can be written in SQL, PL/pgSQL, PL/Python, C, or PL/V8. Aggregates support function overloading and polymorphism, allowing one aggregate definition to serve different input data types. Adding an OVER clause to any user-defined aggregate turns it into a window function, which enables more advanced analytical computations. For computationally expensive mathematical procedures, PL/V8 aggregates can be 10 to 20 times quicker than SQL aggregates.
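Polymorphism and window use can be sketched with a single generic aggregate. The names generic_min and generic_min_step are hypothetical. Because the transition function is STRICT and there is no initial condition, the first non-NULL input becomes the starting state and NULL inputs are skipped.

```sql
-- Polymorphic transition function: works for any type that LEAST can compare.
CREATE FUNCTION generic_min_step(state anyelement, next_value anyelement)
RETURNS anyelement
LANGUAGE sql
IMMUTABLE
STRICT
AS $$ SELECT LEAST(state, next_value) $$;

CREATE AGGREGATE generic_min(anyelement) (
    SFUNC = generic_min_step,
    STYPE = anyelement
);

-- One definition serves both integer and text input.
SELECT generic_min(n) AS min_int, generic_min(label) AS min_text
FROM (VALUES (3, 'pear'), (1, 'apple'), (2, NULL)) AS t(n, label);
-- 1 | apple

-- The same aggregate as a window function: a running minimum ordered by pos.
SELECT pos, n, generic_min(n) OVER (ORDER BY pos) AS running_min
FROM (VALUES (1, 3), (2, 1), (3, 2)) AS t(pos, n)
ORDER BY pos;
-- 1 | 3 | 3
-- 2 | 1 | 1
-- 3 | 2 | 1

-- Clean up the example objects.
DROP AGGREGATE generic_min(anyelement);
DROP FUNCTION generic_min_step(anyelement, anyelement);
```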

Conclusion

MIN is a versatile aggregate and window function in PostgreSQL that efficiently computes the smallest value across a dataset while ignoring NULLs. It is flexible because it supports numeric, string, date/time, and enum types, as well as types such as inet and interval and arrays of these types. MIN can be combined with the FILTER clause for precise row selection, with an OVER clause for running minimums, and with indexing techniques such as B-tree and partial indexes to speed up queries. Query planning, indexing, JIT compilation, and parallel execution can further optimise performance. PostgreSQL’s extensibility lets developers create user-defined aggregates that work like MIN but handle custom logic and data types. These features make MIN a key tool for aggregation, advanced analytics, and efficient query design.
