OVER(PARTITION BY)函數用法

1861338.html

OVER(PARTITION BY)函數介紹

開窗函數
Oracle從8.1.6開始提供分析函數，分析函數用于計算基于組的某種聚合值，它和聚合函數的不同之處是：對于每個組返回多行，而聚合函數對于每個組只返回一行。

      開窗函數指定了分析函數工作的數據窗口大小，這個數據窗口大小可能會隨著行的變化而變化，舉例如下：
1：over后的寫法：
   over（order by salary）按照salary排序進行累計，order by是個默認的開窗函數
   over（partition by deptno）按照部門分區

over（partition by deptno order by salary）

2：開窗的窗口范圍：
over（order by salary range between 5 preceding and 5 following）：窗口范圍為當前行數據幅度減5加5后的范圍內的。

舉例：

--sum(s)over(order by s range between 2 preceding and 2 following) 表示加2或2的范圍內的求和

select name,class,s, sum(s)over(order by s range between 2 preceding and 2 following) mm from t2
adf        3        45        45 --45加2減2即43到47，但是s在這個范圍內只有45
asdf       3        55        55
cfe        2        74        74
3dd        3        78        158 --78在76到80范圍內有78，80，求和得158
fda        1        80        158
gds        2        92        92
ffd        1        95        190
dss        1        95        190
ddd        3        99        198

gf 3 99 198

over（order by salary rows between 5 preceding and 5 following）：窗口范圍為當前行前后各移動5行。

舉例：

--sum(s)over(order by s rows between 2 preceding and 2 following)表示在上下兩行之間的范圍內
select name,class,s, sum(s)over(order by s rows between 2 preceding and 2 following) mm from t2
adf        3        45        174 （45+55+74=174）
asdf       3        55        252   （45+55+74+78=252）
cfe        2        74        332    （74+55+45+78+80=332）
3dd        3        78        379    （78+74+55+80+92=379）
fda        1        80        419
gds        2        92        440
ffd        1        95        461
dss        1        95        480
ddd        3        99        388
gf         3        99        293

over（order by salary range between unbounded preceding and unbounded following）或者

over（order by salary rows between unbounded preceding and unbounded following）：窗口不做限制

3、與over函數結合的幾個函數介紹

row_number()over()、rank()over()和dense_rank()over()函數的使用

下面以班級成績表t2來說明其應用

t2表信息如下：
cfe        2        74
dss        1        95
ffd        1        95
fda        1        80
gds        2        92
gf         3        99
ddd        3        99
adf        3        45
asdf       3        55
3dd        3        78

select * from
    (
    select name,class,s,rank()over(partition by class order by s desc) mm from t2
    )
    where mm=1；
得到的結果是:
dss        1        95        1
ffd        1        95        1
gds        2        92        1
gf         3        99        1
ddd        3        99        1

注意：
    1.在求第一名成績的時候，不能用row_number()，因為如果同班有兩個并列第一，row_number()只返回一個結果;
select * from
    (
    select name,class,s,row_number()over(partition by class order by s desc) mm from t2
    )
    where mm=1；
1        95        1 --95有兩名但是只顯示一個
2        92        1
3        99        1 --99有兩名但也只顯示一個

    2.rank()和dense_rank()可以將所有的都查找出來：
如上可以看到采用rank可以將并列第一名的都查找出來；
     rank()和dense_rank()區別：
     --rank()是跳躍排序，有兩個第二名時接下來就是第四名；
select name,class,s,rank()over(partition by class order by s desc) mm from t2
dss        1        95        1
ffd        1        95        1
fda        1        80        3 --直接就跳到了第三
gds        2        92        1
cfe        2        74        2
gf         3        99        1
ddd        3        99        1
3dd        3        78        3
asdf       3        55        4
adf        3        45        5
     --dense_rank()l是連續排序，有兩個第二名時仍然跟著第三名
select name,class,s,dense_rank()over(partition by class order by s desc) mm from t2
dss        1        95        1
ffd        1        95        1
fda        1        80        2 --連續排序（仍為2）
gds        2        92        1
cfe        2        74        2
gf         3        99        1
ddd        3        99        1
3dd        3        78        2
asdf       3        55        3
adf        3        45        4

--sum()over（）的使用
select name,class,s, sum(s)over(partition by class order by s desc) mm from t2 --根據班級進行分數求和
dss        1        95        190 --由于兩個95都是第一名，所以累加時是兩個第一名的相加
ffd        1        95        190
fda        1        80        270 --第一名加上第二名的
gds        2        92        92
cfe        2        74        166
gf         3        99        198
ddd        3        99        198
3dd        3        78        276
asdf       3        55        331
adf        3        45        376

first_value() over()和last_value() over()的使用

--找出這三條電路每條電路的第一條記錄類型和最后一條記錄類型

SELECT opr_id,res_type,
first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type rows BETWEEN unbounded preceding AND unbounded following) high
FROM rm_circuit_route
WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
ORDER BY opr_id;

注：rows BETWEEN unbounded preceding AND unbounded following 的使用

--取last_value時不使用rows BETWEEN unbounded preceding AND unbounded following的結果

SELECT opr_id,res_type,
first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) high
FROM rm_circuit_route
WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
ORDER BY opr_id;

如下圖可以看到，如果不使用

rows BETWEEN unbounded preceding AND unbounded following，取出的last_value由于與res_type進行進行排列，因此取出的電路的最后一行記錄的類型就不是按照電路的范圍提取了，而是以res_type為范圍進行提取了。

在first_value和last_value中ignore nulls的使用

數據如下：

取出該電路的第一條記錄，加上ignore nulls后，如果第一條是判斷的那個字段是空的，則默認取下一條，結果如下所示：

--lag() over()函數用法（取出前n行數據）
lag(expresstion,<offset>,<default>)
with a as
(select 1 id,'a' name from dual
union
select 2 id,'b' name from dual
union
select 3 id,'c' name from dual
union
select 4 id,'d' name from dual
union
select 5 id,'e' name from dual
)
select id,name,lag(id,1,'')over(order by name) from a;

--lead() over()函數用法（取出后N行數據）

lead(expresstion,<offset>,<default>)
with a as
(select 1 id,'a' name from dual
union
select 2 id,'b' name from dual
union
select 3 id,'c' name from dual
union
select 4 id,'d' name from dual
union
select 5 id,'e' name from dual
)
select id,name,lead(id,1,'')over(order by name) from a;

--ratio_to_report(a)函數用法 Ratio_to_report() 括號中就是分子，over() 括號中就是分母
with a as (select 1 a from dual
           union all
select 1 a from dual
           union all
select 1 a from dual
           union all
select 2 a from dual
           union all
select 3 a from dual
           union all
select 4 a from dual
           union all
select 4 a from dual
           union all
select 5 a from dual
           )
select a, ratio_to_report(a)over(partition by a) b from a
order by a;

with a as (select 1 a from dual
           union all
select 1 a from dual
           union all
select 1 a from dual
           union all
select 2 a from dual
           union all
select 3 a from dual
           union all
select 4 a from dual
           union all
select 4 a from dual
           union all
select 5 a from dual
           )
select a, ratio_to_report(a)over() b from a --分母缺省就是整個占比
order by a;

with a as (select 1 a from dual
           union all
select 1 a from dual
           union all
select 1 a from dual
           union all
select 2 a from dual
           union all
select 3 a from dual
           union all
select 4 a from dual
           union all
select 4 a from dual
           union all
select 5 a from dual
           )
select a, ratio_to_report(a)over() b from a
group by a order by a;--分組后的占比

percent_rank用法

計算方法：所在組排名序號-1除以該組所有的行數-1，如下所示自己計算的pr1與通過percent_rank函數得到的值是一樣的：
SELECT a.deptno,
       a.ename,
       a.sal,
       a.r,
       b.n,
       (a.r-1)/(n-1) pr1,
       percent_rank() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
FROM (SELECT deptno,
               ename,
               sal,
               rank() over(PARTITION BY deptno ORDER BY sal) r --計算出在組中的排名序號
          FROM emp
         ORDER BY deptno, sal) a,
       (SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b --按部門計算每個部門的所有成員數
WHERE a.deptno = b.deptno;

cume_dist函數

計算方法：所在組排名序號除以該組所有的行數，但是如果存在并列情況，則需加上并列的個數-1，
          如下所示自己計算的pr1與通過percent_rank函數得到的值是一樣的：
SELECT a.deptno,
       a.ename,
       a.sal,
       a.r,
       b.n,
       c.rn,
       (a.r + c.rn - 1) / n pr1,
       cume_dist() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
FROM (SELECT deptno,
               ename,
               sal,
               rank() over(PARTITION BY deptno ORDER BY sal) r
          FROM emp
         ORDER BY deptno, sal) a,
       (SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b,
       (SELECT deptno, r, COUNT(1) rn,sal
          FROM (SELECT deptno,sal,
                       rank() over(PARTITION BY deptno ORDER BY sal) r
                  FROM emp)
         GROUP BY deptno, r,sal
         ORDER BY deptno) c --c表就是為了得到每個部門員工工資的一樣的個數
WHERE a.deptno = b.deptno
   AND a.deptno = c.deptno(+)
   AND a.sal = c.sal;

percentile_cont函數

含義：輸入一個百分比（該百分比就是按照percent_rank函數計算的值），返回該百分比位置的平均值
如下，輸入百分比為0.7，因為0.7介于0.6和0.8之間，因此返回的結果就是0.6對應的sal的1500加上0.8對應的sal的1600平均
SELECT ename,
       sal,
       deptno,
       percentile_cont(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
       percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
FROM emp
WHERE deptno IN (30, 60);

若輸入的百分比為0.6，則直接0.6對應的sal值，即1500
SELECT ename,
       sal,
       deptno,
       percentile_cont(0.6) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
       percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
FROM emp
WHERE deptno IN (30, 60);

PERCENTILE_DISC函數

功能描述：返回一個與輸入的分布百分比值相對應的數據值，分布百分比的計算方法見函數CUME_DIST，如果沒有正好對應的數據值，就取大于該分布值的下一個值。
注意：本函數與PERCENTILE_CONT的區別在找不到對應的分布值時返回的替代值的計算方法不同

SAMPLE：下例中0.7的分布值在部門30中沒有對應的Cume_Dist值，所以就取下一個分布值0.83333333所對應的SALARY來替代

SELECT ename,
       sal,
       deptno,
       percentile_disc(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Disc",
       cume_dist() over(PARTITION BY deptno ORDER BY sal) "Cume_Dist"
FROM emp
WHERE deptno IN (30, 60);

posted on 2013-07-31 12:23 心硯閱讀(324) 評論(0) 編輯收藏所屬分類: DB

OVER(PARTITION BY)函數用法

轉自：

http://www.cnblogs.com/lanzi/archive/2010/10/26/1861338.html

OVER(PARTITION BY)函數介紹

導航

統計

常用鏈接

留言簿(2)

隨筆分類

文章分類

文章檔案

Forum

搜索

最新評論

閱讀排行榜

評論排行榜

OVER(PARTITION BY)函數用法

轉自： http://www.cnblogs.com/lanzi/archive/2010/10/26/1861338.htmlOVER(PARTITION BY)函數介紹

導航

統計

常用鏈接

留言簿(2)

隨筆分類

文章分類

文章檔案

Forum

搜索

最新評論

閱讀排行榜

評論排行榜

轉自：

http://www.cnblogs.com/lanzi/archive/2010/10/26/1861338.html

OVER(PARTITION BY)函數介紹