Window functions in AnalyticDB for MySQL, the Alibaba Cloud cloud-native data warehouse
AnalyticDB for MySQL supports the following window functions.
Ranking functions
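Value functions
Overview
Window functions compute their results over the rows of a query result. They are evaluated after the HAVING clause and before the ORDER BY clause. Invoking a window function requires the OVER clause, which specifies the window.
AnalyticDB for MySQL supports three types of window functions: aggregate functions, ranking functions, and value functions.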
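Syntax
function over (partition by a order by b RANGE|ROWS BETWEEN start AND end)
A window function consists of the following three parts.
Partition specification: distributes the input rows into separate partitions, in much the same way as the GROUP BY clause.
Ordering specification: determines the order in which the window function processes the input rows.
Window frame: specifies the boundaries of the rows included in the calculation.
The window frame supports two modes, RANGE and ROWS:
RANGE defines the frame by a range of values of the ordering column.
ROWS defines the frame by a number of rows.
In both modes, BETWEEN start AND end specifies the frame boundaries. start and end can take the following values:
CURRENT ROW: the current row.
N PRECEDING: n rows before the current row.
UNBOUNDED PRECEDING: back to the first row of the partition.
N FOLLOWING: n rows after the current row.
UNBOUNDED FOLLOWING: through to the last row of the partition.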
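To make the ROWS boundaries concrete, here is a minimal Python sketch of how a frame could be resolved to row positions inside one partition. It is an illustration only, not AnalyticDB's implementation; the tuple encoding of a bound and the names `bound_to_index` and `rows_frame` are assumptions made for this example.

```python
# Hypothetical encoding of a bound: ("UNBOUNDED PRECEDING",), ("N PRECEDING", n),
# ("CURRENT ROW",), ("N FOLLOWING", n) or ("UNBOUNDED FOLLOWING",).

def bound_to_index(bound, current, size):
    """Translate one frame bound into a 0-based row index (may fall outside the partition)."""
    kind = bound[0]
    if kind == "UNBOUNDED PRECEDING":
        return 0
    if kind == "N PRECEDING":
        return current - bound[1]
    if kind == "CURRENT ROW":
        return current
    if kind == "N FOLLOWING":
        return current + bound[1]
    if kind == "UNBOUNDED FOLLOWING":
        return size - 1
    raise ValueError(f"unknown bound: {kind}")


def rows_frame(start, end, current, size):
    """Return the row indexes covered by ROWS BETWEEN start AND end for the current row."""
    lo = max(bound_to_index(start, current, size), 0)        # clip to the first row
    hi = min(bound_to_index(end, current, size), size - 1)   # clip to the last row
    return list(range(lo, hi + 1))


# Running frame for the third row (index 2) of a five-row partition.
print(rows_frame(("UNBOUNDED PRECEDING",), ("CURRENT ROW",), current=2, size=5))
```

The frame is clipped at the partition edges, so `1 PRECEDING` on the first row simply yields a shorter frame.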
For example, the following query computes a running sum of profit within each window, from the first row of the partition up to the current row.
select year,country,profit,sum(profit) over (partition by country order by year ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) as slidewindow from testwindow;
+------+---------+--------+-------------+
| year | country | profit | slidewindow |
+------+---------+--------+-------------+
| 2001 | USA | 50 | 50 |
| 2001 | USA | 1500 | 1550 |
| 2000 | India | 75 | 75 |
| 2000 | India | 75 | 150 |
| 2001 | India | 79 | 229 |
| 2000 | Finland | 1500 | 1500 |
| 2001 | Finland | 10 | 1510 |
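By contrast, the following query, which specifies neither an ordering nor a frame, can only return the total profit of each partition.
select country,sum(profit) over (partition by country) from testwindow;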
+---------+-----------------------------------------+
| country | sum(profit) OVER (PARTITION BY country) |
+---------+-----------------------------------------+
| India | 229 |
| India | 229 |
| India | 229 |
| USA | 1550 |
| USA | 1550 |
| Finland | 1510 |
| Finland | 1510 |
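The difference between the two queries can be reproduced outside the database. The following Python sketch is a plain re-implementation of the semantics, not what AnalyticDB executes; the function name `running_and_total` is an assumption. It computes both the running sum and the partition total for the same seven rows.

```python
from itertools import groupby

# (year, country, profit): the rows of the testwindow table used in the queries.
rows = [
    (2000, "Finland", 1500), (2001, "Finland", 10),
    (2000, "India", 75), (2000, "India", 75), (2001, "India", 79),
    (2001, "USA", 50), (2001, "USA", 1500),
]


def running_and_total(rows):
    """Return (year, country, profit, running_sum, partition_total) for every row."""
    out = []
    # PARTITION BY country ORDER BY year; sorted() is stable, so ties keep input order.
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    for country, group in groupby(ordered, key=lambda r: r[1]):
        part = list(group)
        total = sum(r[2] for r in part)   # sum(profit) over (partition by country)
        running = 0
        for year, _, profit in part:
            running += profit             # ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
            out.append((year, country, profit, running, total))
    return out


for row in running_and_total(rows):
    print(row)
```

The running sum changes on every row, while the partition total repeats the same value for all rows of a country.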
Usage notes
The frame boundaries must meet the following requirements:
start cannot be UNBOUNDED FOLLOWING; otherwise the error Window frame start cannot be UNBOUNDED FOLLOWING is reported.
end cannot be UNBOUNDED PRECEDING; otherwise the error Window frame end cannot be UNBOUNDED PRECEDING is reported.
If start is CURRENT ROW and end is N PRECEDING, the error Window frame starting from CURRENT ROW cannot end with PRECEDING is reported.
If start is N FOLLOWING and end is N PRECEDING, the error Window frame starting from FOLLOWING cannot end with PRECEDING is reported.
If start is N FOLLOWING and end is CURRENT ROW, the error Window frame starting from FOLLOWING cannot end with CURRENT ROW is reported.
When the mode is RANGE:
If start or end is N PRECEDING, the error Window frame RANGE PRECEDING is only supported with UNBOUNDED is reported.
If start or end is N FOLLOWING, the error Window frame RANGE FOLLOWING is only supported with UNBOUNDED is reported.
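The rules above can be collected into a small checker. The Python sketch below only restates the listed rules; the function name `check_frame`, the string encoding of the bounds, and the order in which the rules are tested are assumptions, and the database may evaluate them differently.

```python
def check_frame(mode, start, end):
    """Return the documented error message for an invalid frame, or None if no rule applies.

    mode is "ROWS" or "RANGE"; start and end are one of "UNBOUNDED PRECEDING",
    "N PRECEDING", "CURRENT ROW", "N FOLLOWING", "UNBOUNDED FOLLOWING".
    """
    if start == "UNBOUNDED FOLLOWING":
        return "Window frame start cannot be UNBOUNDED FOLLOWING"
    if end == "UNBOUNDED PRECEDING":
        return "Window frame end cannot be UNBOUNDED PRECEDING"
    if start == "CURRENT ROW" and end == "N PRECEDING":
        return "Window frame starting from CURRENT ROW cannot end with PRECEDING"
    if start == "N FOLLOWING" and end == "N PRECEDING":
        return "Window frame starting from FOLLOWING cannot end with PRECEDING"
    if start == "N FOLLOWING" and end == "CURRENT ROW":
        return "Window frame starting from FOLLOWING cannot end with CURRENT ROW"
    if mode == "RANGE":
        # In RANGE mode only the UNBOUNDED forms of PRECEDING and FOLLOWING are accepted.
        if "N PRECEDING" in (start, end):
            return "Window frame RANGE PRECEDING is only supported with UNBOUNDED"
        if "N FOLLOWING" in (start, end):
            return "Window frame RANGE FOLLOWING is only supported with UNBOUNDED"
    return None


print(check_frame("ROWS", "UNBOUNDED PRECEDING", "CURRENT ROW"))
print(check_frame("RANGE", "N PRECEDING", "CURRENT ROW"))
```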
Preparation
All window function examples in this topic use the testwindow table as test data.
create table testwindow(year int, country varchar(20), product varchar(20), profit int) distributed by hash(year);
insert into testwindow values (2000,'Finland','Computer',1500);
insert into testwindow values (2001,'Finland','Phone',10);
insert into testwindow values (2000,'India','Calculator',75);
insert into testwindow values (2000,'India','Calculator',75);
insert into testwindow values (2001,'India','Calculator',79);
insert into testwindow values (2001,'USA','Calculator',50);
insert into testwindow values (2001,'USA','Computer',1500);
SELECT * FROM testwindow;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2001 | Finland | Phone | 10 |
| 2000 | India | Calculator | 75 |
| 2000 | India | Calculator | 75 |
| 2001 | India | Calculator | 79 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 1500 |
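Aggregate functions
Any aggregate function can be used as a window function by adding an OVER clause. The aggregate is then computed for each row over the rows in the current sliding window.
For example, the following query returns a rolling sum of order totals by order date for each clerk.
SELECT clerk, orderdate, orderkey, totalprice,sum(totalprice) OVER (PARTITION BY clerk ORDER BY orderdate) AS rolling_sum FROM orders ORDER BY clerk, orderdate, orderkey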
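Ranking functions
CUME_DIST
CUME_DIST()
Description: returns the cumulative distribution of each value in a group of values.
Result: after the window partition is ordered, the number of rows up to and including the current row (together with any rows tied with it), divided by the total number of rows in the partition. Tied values in the ordering therefore all receive the same distribution value.
Return type: DOUBLE.
Example:
select year,country,product,profit,cume_dist() over (partition by country order by profit) as cume_dist from testwindow;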
+------+---------+------------+--------+--------------------+
| year | country | product | profit | cume_dist |
+------+---------+------------+--------+--------------------+
| 2001 | USA | Calculator | 50 | 0.5 |
| 2001 | USA | Computer | 1500 | 1.0 |
| 2001 | Finland | Phone | 10 | 0.5 |
| 2000 | Finland | Computer | 1500 | 1.0 |
| 2000 | India | Calculator | 75 | 0.6666666666666666 |
| 2000 | India | Calculator | 75 | 0.6666666666666666 |
| 2001 | India | Calculator | 79 | 1.0 |
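The values in this table can be checked by hand. The following Python sketch uses the standard definition of CUME_DIST (rows at or before the current value, divided by the partition size); it illustrates the semantics and is not AnalyticDB's code. The function name `cume_dist` and the list-of-values input are assumptions.

```python
def cume_dist(values):
    """values: the ORDER BY keys of one partition.

    For each value, count the rows whose key is less than or equal to it
    (so ties are counted together) and divide by the partition size.
    """
    n = len(values)
    return [sum(1 for v in values if v <= x) / n for x in values]


print(cume_dist([75, 75, 79]))   # the India partition, ordered by profit
print(cume_dist([50, 1500]))     # the USA partition
```

For India, the two rows with profit 75 each see 2 of 3 rows at or below them, which gives the repeated 0.6666666666666666 in the table.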
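RANK
RANK()
Description: returns the rank of each value in a data set.
The rank is 1 plus the number of rows that precede the current row, not counting the current row or rows tied with it. Tied values in the ordering can therefore leave gaps in the sequence. The rank is computed separately for each window partition.
Return type: BIGINT.
Example:
select year,country,product,profit,rank() over (partition by country order by profit) as rank from testwindow;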
+------+---------+------------+--------+------+
| year | country | product | profit | rank |
+------+---------+------------+--------+------+
| 2001 | Finland | Phone | 10 | 1 |
| 2000 | Finland | Computer | 1500 | 2 |
| 2001 | USA | Calculator | 50 | 1 |
| 2001 | USA | Computer | 1500 | 2 |
| 2000 | India | Calculator | 75 | 1 |
| 2000 | India | Calculator | 75 | 1 |
| 2001 | India | Calculator | 79 | 3 |
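The gap after tied rows follows directly from the definition. Here is a minimal Python sketch of that rule, intended as an illustration rather than the engine's implementation; the function name `rank` and the list-of-values input are assumptions.

```python
def rank(values):
    """values: the ORDER BY keys of one partition.

    The rank of a row is 1 plus the number of rows with a strictly smaller key,
    so tied rows share a rank and the next distinct key skips ahead.
    """
    return [1 + sum(1 for v in values if v < x) for x in values]


print(rank([75, 75, 79]))   # the India partition, ordered by profit
```

The two rows with profit 75 both get rank 1, and the row with profit 79 gets rank 3 because two rows precede it.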
DENSE_RANK
DENSE_RANK()
Description: returns the rank of each value in a group of values.
DENSE_RANK() works like RANK(), except that tied values do not leave gaps in the sequence.
Return type: BIGINT.
Example:
select year,country,product,profit,dense_rank() over (partition by country order by profit) as dense_rank from testwindow;
+------+---------+------------+--------+------------+
| year | country | product | profit | dense_rank |
+------+---------+------------+--------+------------+
| 2001 | Finland | Phone | 10 | 1 |
| 2000 | Finland | Computer | 1500 | 2 |
| 2001 | USA | Calculator | 50 | 1 |
| 2001 | USA | Computer | 1500 | 2 |
| 2000 | India | Calculator | 75 | 1 |
| 2000 | India | Calculator | 75 | 1 |
| 2001 | India | Calculator | 79 | 2 |
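A dense rank can be thought of as the position of a value among the distinct values of its partition. The Python sketch below shows that idea; it is an illustration only, and the function name `dense_rank` and the list-of-values input are assumptions.

```python
def dense_rank(values):
    """values: the ORDER BY keys of one partition.

    Number the distinct keys 1, 2, 3, ... in ascending order and give every
    row the number of its key, so ties never leave a gap.
    """
    position = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [position[x] for x in values]


print(dense_rank([75, 75, 79]))   # the India partition, ordered by profit
```

Compared with RANK(), the row with profit 79 gets 2 rather than 3, because only one distinct value precedes it.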
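NTILE
NTILE(n)
Description: distributes the rows of each window partition into n buckets numbered from 1 to n.
The number of rows in any two buckets differs by at most 1. If the rows of the partition cannot be divided evenly among the buckets, the remaining rows are assigned one per bucket, starting from bucket 1. For example, with 6 rows and 4 buckets, the resulting bucket numbers are 1 1 2 2 3 4.
Return type: BIGINT.
Example:
select year,country,product,profit,ntile(2) over (partition by country order by profit) as ntile2 from testwindow;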
+------+---------+------------+--------+--------+
| year | country | product | profit | ntile2 |
+------+---------+------------+--------+--------+
| 2001 | USA | Calculator | 50 | 1 |
| 2001 | USA | Computer | 1500 | 2 |
| 2001 | Finland | Phone | 10 | 1 |
| 2000 | Finland | Computer | 1500 | 2 |
| 2000 | India | Calculator | 75 | 1 |
| 2000 | India | Calculator | 75 | 1 |
| 2001 | India | Calculator | 79 | 2 |
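The bucket assignment described above is a small piece of arithmetic. The following Python sketch reproduces it for a partition of a given size; it illustrates the rule and is not taken from AnalyticDB. The function name `ntile` and its parameters are assumptions.

```python
def ntile(row_count, n):
    """Return the bucket number of each row, in order, for a partition of row_count rows."""
    base, extra = divmod(row_count, n)   # every bucket gets base rows; extra rows are left over
    buckets = []
    for bucket in range(1, n + 1):
        # The first `extra` buckets receive one additional row each.
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets


print(ntile(6, 4))   # the 6-rows, 4-buckets case from the description
print(ntile(3, 2))   # the India partition with ntile(2)
```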
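ROW_NUMBER
ROW_NUMBER()
Description: returns a unique, sequential row number for each row according to its order within the window partition. Row numbers start from 1.
Return type: BIGINT.
Example:
SELECT year, country, product, profit, ROW_NUMBER() OVER(PARTITION BY country) AS row_num1 FROM testwindow;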
+------+---------+------------+--------+----------+
| year | country | product | profit | row_num1 |
+------+---------+------------+--------+----------+
| 2001 | USA | Calculator | 50 | 1 |
| 2001 | USA | Computer | 1500 | 2 |
| 2000 | India | Calculator | 75 | 1 |
| 2000 | India | Calculator | 75 | 2 |
| 2001 | India | Calculator | 79 | 3 |
| 2000 | Finland | Computer | 1500 | 1 |
| 2001 | Finland | Phone | 10 | 2 |
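PERCENT_RANK
PERCENT_RANK()
Description: returns the percentage rank of each value in a data set, computed as (r - 1) / (n - 1), where r is the rank of the current row as computed by RANK() and n is the total number of rows in the current window partition.
Return type: DOUBLE.
Example:
select year,country,product,profit,PERCENT_RANK() over (partition by country order by profit) as ntile3 from testwindow;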
+------+---------+------------+--------+--------+
| year | country | product | profit | ntile3 |
+------+---------+------------+--------+--------+
| 2001 | Finland | Phone | 10 | 0.0 |
| 2000 | Finland | Computer | 1500 | 1.0 |
| 2001 | USA | Calculator | 50 | 0.0 |
| 2001 | USA | Computer | 1500 | 1.0 |
| 2000 | India | Calculator | 75 | 0.0 |
| 2000 | India | Calculator | 75 | 0.0 |
| 2001 | India | Calculator | 79 | 1.0 |
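The formula (r - 1) / (n - 1) can be applied directly to one partition. Here is a short Python sketch of that calculation, offered as an illustration rather than the engine's code. The function name `percent_rank` is an assumption, and returning 0.0 for a single-row partition is also an assumption, since the source does not say how n = 1 is handled.

```python
def percent_rank(values):
    """values: the ORDER BY keys of one partition."""
    n = len(values)
    if n == 1:
        # Assumption: avoid dividing by zero when the partition has a single row.
        return [0.0]
    # r is the RANK() of each row: 1 plus the number of strictly smaller keys.
    ranks = [1 + sum(1 for v in values if v < x) for x in values]
    return [(r - 1) / (n - 1) for r in ranks]


print(percent_rank([75, 75, 79]))   # the India partition, ordered by profit
```

For India the ranks are 1, 1, 3 with n = 3, giving 0/2, 0/2 and 2/2.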
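Value functions
FIRST_VALUE
FIRST_VALUE(x)
Description: returns the value of the first row of the window partition.
Return type: same as the type of the input argument.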
