
MySQL in Practice: Grouping and Filtering Data

2016-06-28 23:29

Grouping and filtering data

MySQL can group retrieved rows, compute aggregates per group, and filter the resulting groups. The SELECT clauses involved are GROUP BY and HAVING.

Source data
SELECT * FROM products;


+---------+---------+----------------+------------+----------------------------------------------------------------+
| prod_id | vend_id | prod_name      | prod_price | prod_desc                                                      |
+---------+---------+----------------+------------+----------------------------------------------------------------+
| ANV01   |    1001 | .5 ton anvil   |       5.99 | .5 ton anvil, black, complete with handy hook                  |
| ANV02   |    1001 | 1 ton anvil    |       9.99 | 1 ton anvil, black, complete with handy hook and carrying case |
| ANV03   |    1001 | 2 ton anvil    |      14.99 | 2 ton anvil, black, complete with handy hook and carrying case |
| DTNTR   |    1003 | Detonator      |      13.00 | Detonator (plunger powered), fuses not included                |
| FB      |    1003 | Bird seed      |      10.00 | Large bag (suitable for road runners)                          |
| FC      |    1003 | Carrots        |       2.50 | Carrots (rabbit hunting season only)                           |
| FU1     |    1002 | Fuses          |       3.42 | 1 dozen, extra long                                            |
| JP1000  |    1005 | JetPack 1000   |      35.00 | JetPack 1000, intended for single use                          |
| JP2000  |    1005 | JetPack 2000   |      55.00 | JetPack 2000, multi-use                                        |
| OL1     |    1002 | Oil can        |       8.99 | Oil can, red                                                   |
| SAFE    |    1003 | Safe           |      50.00 | Safe with combination lock                                     |
| SLING   |    1003 | Sling          |       4.49 | Sling, one size fits all                                       |
| TNT1    |    1003 | TNT (1 stick)  |       2.50 | TNT, red, single stick                                         |
| TNT2    |    1003 | TNT (5 sticks) |      10.00 | TNT, red, pack of 10 sticks                                    |
+---------+---------+----------------+------------+----------------------------------------------------------------+
14 rows in set (0.00 sec)


Creating groups

Count the number of products each vendor supplies in the products table:

SELECT vend_id, COUNT(*) AS num_prods FROM products GROUP BY vend_id;


+---------+-----------+
| vend_id | num_prods |
+---------+-----------+
|    1001 |         3 |
|    1002 |         2 |
|    1003 |         7 |
|    1005 |         2 |
+---------+-----------+
4 rows in set (0.00 sec)


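The GROUP BY clause tells MySQL to divide the rows into groups and then compute the aggregate once per group, rather than once over the whole result set.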
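The same grouping works with other aggregate functions. As a minimal sketch, assuming the products table shown above (the aliases min_price and max_price are only illustrative names), each aggregate is evaluated separately for every vend_id group:

```sql
-- One output row per vendor; every aggregate is computed within that vendor's group.
SELECT vend_id,
       COUNT(*)        AS num_prods,   -- rows in the group
       MIN(prod_price) AS min_price,   -- cheapest product in the group
       MAX(prod_price) AS max_price    -- most expensive product in the group
FROM products
GROUP BY vend_id;
```

With the source data above, vendor 1001 would come back with 3 products priced between 5.99 and 14.99.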

Rules for GROUP BY:

- A GROUP BY clause can contain any number of columns. This lets you nest groups and gives finer control over how the data is grouped.

For example:
SELECT vend_id, prod_price, COUNT(*) AS num_prods FROM products GROUP BY vend_id, prod_price;


+---------+------------+-----------+
| vend_id | prod_price | num_prods |
+---------+------------+-----------+
|    1001 |       5.99 |         1 |
|    1001 |       9.99 |         1 |
|    1001 |      14.99 |         1 |
|    1002 |       3.42 |         1 |
|    1002 |       8.99 |         1 |
|    1003 |       2.50 |         2 |
|    1003 |       4.49 |         1 |
|    1003 |      10.00 |         2 |
|    1003 |      13.00 |         1 |
|    1003 |      50.00 |         1 |
|    1005 |      35.00 |         1 |
|    1005 |      55.00 |         1 |
+---------+------------+-----------+
12 rows in set (0.00 sec)


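- If groups are nested in the GROUP BY clause, the data is summarized at the last group specified. In the query above, each count is per (vend_id, prod_price) pair, not per vendor.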

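- Every item in the GROUP BY clause must be a retrieved column or a valid expression, but not an aggregate function. If the SELECT list uses an expression, specify the same expression in GROUP BY. Standard SQL does not allow a column alias here; MySQL accepts one as an extension, but repeating the expression is the portable choice.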
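Grouping by an expression could look like the sketch below. It assumes the products table shown above; the price_band alias and the choice of 10-unit bands are made up for illustration. The expression is repeated verbatim in GROUP BY rather than referenced through its alias:

```sql
-- Bucket products into price bands of width 10 (0-9.99, 10-19.99, ...).
SELECT FLOOR(prod_price / 10) * 10 AS price_band,
       COUNT(*)                    AS num_prods
FROM products
-- Same expression as in the SELECT list, not the alias price_band.
GROUP BY FLOOR(prod_price / 10) * 10;
```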

- Apart from aggregate expressions, every column in the SELECT list must appear in the GROUP BY clause.

- If the grouping column contains NULL values, NULL is returned as a group. Multiple NULL rows are placed in the same group.
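The products table has no NULL values, so the sketch below builds a small throwaway row set inline to show the behavior. The names region and sample_rows are hypothetical and exist only for this illustration:

```sql
-- Four sample rows: two with region 'east', two with region NULL.
SELECT region, COUNT(*) AS num_rows
FROM (
    SELECT 'east' AS region
    UNION ALL SELECT NULL
    UNION ALL SELECT 'east'
    UNION ALL SELECT NULL
) AS sample_rows
GROUP BY region;
```

The result has two groups, one for 'east' and one for NULL, each counting 2 rows; the two NULL rows are not split into separate groups.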

- The GROUP BY clause must come after any WHERE clause and before any ORDER BY clause.
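The clause order can be sketched as follows, assuming the products table shown above. The price threshold of 5 is an arbitrary value chosen for the example:

```sql
SELECT vend_id, COUNT(*) AS num_prods
FROM products
WHERE prod_price >= 5       -- 1. keep only the rows that should be grouped
GROUP BY vend_id            -- 2. group the remaining rows by vendor
ORDER BY num_prods DESC;    -- 3. sort the grouped result
```

Placing GROUP BY before WHERE, or after ORDER BY, is a syntax error.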

Filtering groups

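MySQL uses the HAVING clause to filter groups. HAVING closely resembles WHERE, but WHERE filters rows (records) while HAVING filters groups, so grouping happens first and filtering second. For example, to find every customer with at least two orders in the orders table, first group the rows by customer id, then keep only the groups with two or more rows and read off their customer ids. Filtering on groups requires HAVING, as follows: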

mysql>  SELECT cust_id,COUNT(*) AS orders FROM orders GROUP BY cust_id HAVING COUNT(*) >= 2;
+---------+--------+
| cust_id | orders |
+---------+--------+
|   10001 |      2 |
+---------+--------+
1 row in set (0.01 sec)


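HAVING and WHERE are similar, but each has its own role:

- WHERE filters before the data is grouped and applies to individual rows; HAVING filters after the data is grouped and applies to groups.

- WHERE filters before grouping and aggregation and controls which rows enter a group or aggregate, so a WHERE condition cannot contain an aggregate function. A HAVING condition usually does contain one, because deciding which groups qualify typically depends on a per-group aggregate.

- Most WHERE conditions can be rewritten with HAVING, but not the reverse. Doing so is discouraged because it obscures the intent of the query.

- WHERE and HAVING can be used together: WHERE controls which rows take part in grouping, and HAVING decides which of the resulting groups qualify, as follows: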

mysql> SELECT vend_id,COUNT(*) AS num_prods FROM products WHERE prod_price >= 10 GROUP BY vend_id HAVING COUNT(*) >= 2;
+---------+-----------+
| vend_id | num_prods |
+---------+-----------+
|    1003 |         4 |
|    1005 |         2 |
+---------+-----------+
2 rows in set (0.00 sec)


Rows that satisfy prod_price >= 10 are grouped first, and then only the groups with at least 2 rows are kept.

For comparison, here is the result without the WHERE filter. All rows are grouped, and then the groups with at least 2 rows are kept:

mysql> SELECT vend_id,COUNT(*) AS num_prods FROM products  GROUP BY vend_id HAVING COUNT(*) >= 2;
+---------+-----------+
| vend_id | num_prods |
+---------+-----------+
|    1001 |         3 |
|    1002 |         2 |
|    1003 |         7 |
|    1005 |         2 |
+---------+-----------+
4 rows in set (0.00 sec)
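
The earlier point that a row condition can often be written with HAVING instead of WHERE can be sketched as below, assuming the same products table. Excluding vendor 1003 is an arbitrary condition picked for the example. Both queries return vendors 1001, 1002 and 1005 with counts 3, 2 and 2, but the first states the intent more clearly because the condition concerns rows, not aggregates:

```sql
-- Preferred: a condition on rows belongs in WHERE, applied before grouping.
SELECT vend_id, COUNT(*) AS num_prods
FROM products
WHERE vend_id <> 1003
GROUP BY vend_id;

-- Discouraged: the same condition in HAVING, applied after all rows are grouped.
SELECT vend_id, COUNT(*) AS num_prods
FROM products
GROUP BY vend_id
HAVING vend_id <> 1003;
```

The reverse rewrite is not possible for a condition such as COUNT(*) >= 2, since WHERE cannot reference an aggregate.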