Aggregate Functions
When calculating many values and returning a single value, aggregate functions are utilized. They work across one or more groups, analyzing data for each. Typical aggregate functions consist of:
- COUNT(): The number of rows that satisfy a certain criterion is returned by the COUNT() function.
- SUM(): Provides a numeric column’s total sum.
- AVG(): Provides a numeric column’s average value.
- MAX(): The maximum value of the chosen column is returned by MAX().
- MIN(): The smallest number in the chosen column is returned by the MIN() function.
With the exception of COUNT(*), these functions typically disregard NULL values.
The CREATE TABLE command for employee_tbl would be:
CREATE TABLE employee_tbl (
id INT NOT NULL,
name VARCHAR(50) NOT NULL,
work_date DATE NOT NULL,
daily_typing_pages INT NOT NULL,
PRIMARY KEY (id, work_date)
);
Here are the INSERT statements for each row shown in your query result:
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (1, 'John', '2007-01-24', 250);
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (2, 'Ram', '2007-05-27', 220);
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (3, 'Jack', '2007-05-06', 170);
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (3, 'Jack', '2007-04-06', 100);
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (4, 'Jill', '2007-04-06', 220);
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (5, 'Zara', '2007-06-06', 300);
INSERT INTO employee_tbl (id, name, work_date, daily_typing_pages) VALUES (5, 'Zara', '2007-02-06', 350);
Let’s examine an example utilizing the name and daily_typing_pages columns of the employee_tbl table:
mysql> SELECT * FROM employee_tbl;
+------+------+------------+--------------------+
| id | name | work_date | daily_typing_pages |
+------+------+------------+--------------------+
| 1 | John | 2007-01-24 | 250 |
| 2 | Ram | 2007-05-27 | 220 |
| 3 | Jack | 2007-05-06 | 170 |
| 3 | Jack | 2007-04-06 | 100 |
| 4 | Jill | 2007-04-06 | 220 |
| 5 | Zara | 2007-06-06 | 300 |
| 5 | Zara | 2007-02-06 | 350 |
+------+------+------------+--------------------+
7 rows in set (0.00 sec)
- COUNT(): Use COUNT(*) to determine the total number of rows in the table.
- Use COUNT(name) to determine how many non-NULL values there are in a given column, such as name. All rows that meet the given criterion are counted by COUNT(*).
- SUM(): To determine how many pages each employee typed overall.
- AVG(): To determine how many pages on average each employee types.
- MAX(): To determine how many pages can be entered in a single entry.
- MIN(): To determine how few pages are entered in a single entry.
Arithmetic Operators
Arithmetic operators are used to perform mathematical calculations on numeric values. Common arithmetic operators include:
- + (Addition)
- – (Subtraction)
- * (Multiplication)
- / (Division)
- % or MOD (Modulo): Returns the remainder of a division.
- DIV (Integer Division): Returns the integer part of a division.
- – (arg): Changes the sign of the argument.
- ** (Exponentiation): Raises a number to a power.
Example of basic arithmetic:
SELECT 60 * 60 * 24 * 7; -- result: 604800 (number of seconds in a week)
Example using MOD:
SELECT MOD(13, 2); -- result: 1
Grouping Data (GROUP BY Clause)
Rows with identical values are grouped into summary rows using the GROUP BY clause. The result-set is frequently grouped by one or more columns using aggregate methods. Grouping enables the separation of data into distinct groups according to a shared characteristic. The FROM and WHERE clauses must come before the GROUP BY clause.
For instance, to determine how many pages each employee typed overall:
mysql> SELECT name, SUM(daily_typing_pages) FROM employee_tbl GROUP BY name;
+------+-------------------------+
| name | SUM(daily_typing_pages) |
+------+-------------------------+
| Jack | 270 |
| Jill | 220 |
| John | 250 |
| Ram | 220 |
| Zara | 650 |
+------+-------------------------+
5 rows in set (0.17 sec)
All records for the same name are grouped in this query using GROUP BY name, and the daily_typing_pages within each group are subjected to the SUM() function.
You would create the pet table using the CREATE TABLE statement:
CREATE TABLE pet (
name VARCHAR(20),
owner VARCHAR(20),
species VARCHAR(20),
sex CHAR(1),
birth DATE,
death DATE
);
Here are the INSERT statements for the sample data, combined into one statement:
INSERT INTO pet (name, owner, species, sex, birth, death) VALUES
('Fluffy', 'Harold', 'cat', 'f', '1993-02-04', NULL),
('Claws', 'Gwen', 'cat', 'm', '1994-03-17', NULL),
('Buffy', 'Harold', 'dog', 'f', '1989-05-13', NULL),
('Fang', 'Benny', 'dog', 'm', '1990-08-27', NULL),
('Bowser', 'Diane', 'dog', 'm', '1989-08-31', '1995-07-29'), -- Note: Corrected birth year to 1989 based on [38]
('Chirpy', 'Gwen', 'bird', 'f', '1998-09-11', NULL),
('Whistler', 'Gwen', 'bird', NULL, '1997-12-09', NULL), -- Note: Sex is NULL
('Slim', 'Benny', 'snake', 'm', '1996-04-29', NULL),
('Puffball', 'Diane', 'hamster', 'f', '1999-03-30', NULL);
Grouping by more than one column is possible. To count the number of animals in the pet table by species and sex, for example:
mysql> SELECT species, sex, COUNT(*) FROM pet GROUP BY species, sex;
+---------+------+----------+
| species | sex | COUNT(*) |
+---------+------+----------+
| bird | NULL | 1 |
| bird | f | 1 |
| cat | f | 1 |
| cat | m | 1 |
| dog | f | 1 |
| dog | m | 2 |
| hamster | f | 1 |
| snake | m | 1 |
+---------+------+----------+
The connection between the GROUP BY clause and the columns in the SELECT list should be noted. In general, the GROUP BY clause should contain the same columns that you name in the SELECT list together with an aggregate function. Failure to enable the ONLY_FULL_GROUP_BY SQL mode will result in an error.
Filtering Groups (HAVING Clause)
Data can be filtered using the HAVING clause according to aggregate values. Because aggregate functions cannot use the WHERE keyword, it was added to SQL. HAVING filters groups after grouping and aggregation, in contrast to WHERE, which filters rows prior to the aggregation.
To identify workers whose total number of typed pages exceeds 250, for instance:
mysql> SELECT name, SUM(daily_typing_pages) FROM employee_tbl GROUP BY name HAVING SUM(daily_typing_pages) > 250;
+------+-------------------------+
| name | SUM(daily_typing_pages) |
+------+-------------------------+
| Jack | 270 |
| Zara | 650 |
+------+-------------------------+
2 rows in set (0.17 sec)
Prior to the HAVING clause filtering these groups and retaining only those where the sum exceeds 250, the data is first grouped by name and the SUM() of daily_typing_pages is computed for each group.
WHERE, GROUP BY, and HAVING can all be combined. For example, using a hypothetical Sailors table, determine the average age of sailors with a rating of at least 8 for each rating level with at least two of these sailors:
SELECT S.rating, AVG ( S.age ) AS avgage
FROM Sailors S
WHERE S. age >= 8
GROUP BY S.rating
HAVING COUNT (*) > 1 ;
In order to preserve just those groups with more than one sailor (HAVING), this query first filters entries where age is less than 8 (WHERE), then groups by rating (GROUP BY).
To sum up, aggregate functions enable computations on data sets, GROUP BY enables you to specify those sets according to column values, and HAVING enables you to filter the outcome groups according to the aggregate functions’ output.