OVER(PARTITION BY)函數用法

          轉自:

          http://www.cnblogs.com/lanzi/archive/2010/10/26/1861338.html



          OVER(PARTITION BY)函數介紹

          開窗函數          
               Oracle從8.1.6開始提供分析函數,分析函數用于計算基于組的某種聚合值,它和聚合函數的不同之處是:對于每個組返回多行,而聚合函數對于每個組只返回一行。

                開窗函數指定了分析函數工作的數據窗口大小,這個數據窗口大小可能會隨著行的變化而變化,舉例如下:
          1:over后的寫法:    
             over(order by salary) 按照salary排序進行累計,order by是個默認的開窗函數
             over(partition by deptno)按照部門分區

           

             over(partition by deptno order by salary)

           

          2:開窗的窗口范圍
          over(order by salary range between 5 preceding and 5 following):窗口范圍為當前行數據幅度減5加5后的范圍內的。

          舉例:

           

          --sum(s)over(order by s range between 2 preceding and 2 following) 表示加2或2的范圍內的求和

           select name,class,s, sum(s)over(order by s range between 2 preceding and 2 following) mm from t2
          adf        3        45        45  --45加2減2即43到47,但是s在這個范圍內只有45
          asdf       3        55        55
          cfe        2        74        74
          3dd        3        78        158 --78在76到80范圍內有78,80,求和得158
          fda        1        80        158
          gds        2        92        92
          ffd        1        95        190
          dss        1        95        190
          ddd        3        99        198

          gf         3        99        198

           

           

           

          over(order by salary rows between 5 preceding and 5 following):窗口范圍為當前行前后各移動5行。

          舉例:

           

          --sum(s)over(order by s rows between 2 preceding and 2 following)表示在上下兩行之間的范圍內
          select name,class,s, sum(s)over(order by s rows between 2 preceding and 2 following) mm from t2
          adf        3        45        174  (45+55+74=174)
          asdf       3        55        252   (45+55+74+78=252)
          cfe        2        74        332    (74+55+45+78+80=332)
          3dd        3        78        379    (78+74+55+80+92=379)
          fda        1        80        419
          gds        2        92        440
          ffd        1        95        461
          dss        1        95        480
          ddd        3        99        388
          gf         3        99        293

           

           

          over(order by salary range between unbounded preceding and unbounded following)或者
          over(order by salary rows between unbounded preceding and unbounded following):窗口不做限制

           

          3、與over函數結合的幾個函數介紹

          row_number()over()、rank()over()和dense_rank()over()函數的使用

          下面以班級成績表t2來說明其應用

          t2表信息如下:
          cfe        2        74
          dss        1        95
          ffd        1        95
          fda        1        80
          gds        2        92
          gf         3        99
          ddd        3        99
          adf        3        45
          asdf       3        55
          3dd        3        78

          select * from                                                                      
              (                                                                           
              select name,class,s,rank()over(partition by class order by s desc) mm from t2
              )                                                                           
              where mm=1;
          得到的結果是:
          dss        1        95        1
          ffd        1        95        1
          gds        2        92        1
          gf         3        99        1
          ddd        3        99        1 

          注意:
              1.在求第一名成績的時候,不能用row_number(),因為如果同班有兩個并列第一,row_number()只返回一個結果;
          select * from                                                                      
              (                                                                           
              select name,class,s,row_number()over(partition by class order by s desc) mm from t2
              )                                                                           
              where mm=1;
          1        95        1  --95有兩名但是只顯示一個
          2        92        1
          3        99        1 --99有兩名但也只顯示一個

              2.rank()和dense_rank()可以將所有的都查找出來:
          如上可以看到采用rank可以將并列第一名的都查找出來;
               rank()和dense_rank()區別:
               --rank()是跳躍排序,有兩個第二名時接下來就是第四名;
          select name,class,s,rank()over(partition by class order by s desc) mm from t2
          dss        1        95        1
          ffd        1        95        1
          fda        1        80        3 --直接就跳到了第三
          gds        2        92        1
          cfe        2        74        2
          gf         3        99        1
          ddd        3        99        1
          3dd        3        78        3
          asdf       3        55        4
          adf        3        45        5
               --dense_rank()l是連續排序,有兩個第二名時仍然跟著第三名
          select name,class,s,dense_rank()over(partition by class order by s desc) mm from t2
          dss        1        95        1
          ffd        1        95        1
          fda        1        80        2 --連續排序(仍為2)
          gds        2        92        1
          cfe        2        74        2
          gf         3        99        1
          ddd        3        99        1
          3dd        3        78        2
          asdf       3        55        3
          adf        3        45        4

          --sum()over()的使用
          select name,class,s, sum(s)over(partition by class order by s desc) mm from t2 --根據班級進行分數求和
          dss        1        95        190  --由于兩個95都是第一名,所以累加時是兩個第一名的相加
          ffd        1        95        190 
          fda        1        80        270  --第一名加上第二名的
          gds        2        92        92
          cfe        2        74        166
          gf         3        99        198
          ddd        3        99        198
          3dd        3        78        276
          asdf       3        55        331
          adf        3        45        376

          first_value() over()和last_value() over()的使用  



          --找出這三條電路每條電路的第一條記錄類型和最后一條記錄類型

          SELECT opr_id,res_type,
                 first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
                 last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type rows BETWEEN unbounded preceding AND unbounded following) high
            FROM rm_circuit_route
          WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
           ORDER BY opr_id;

           

          注:rows BETWEEN unbounded preceding AND unbounded following 的使用

          --取last_value時不使用rows BETWEEN unbounded preceding AND unbounded following的結果

           

          SELECT opr_id,res_type,
                 first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
                 last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) high
            FROM rm_circuit_route
           WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
           ORDER BY opr_id;

          如下圖可以看到,如果不使用

          rows BETWEEN unbounded preceding AND unbounded following,取出的last_value由于與res_type進行進行排列,因此取出的電路的最后一行記錄的類型就不是按照電路的范圍提取了,而是以res_type為范圍進行提取了。

           

           

           

           

           

          在first_value和last_value中ignore nulls的使用
          數據如下:

           

           

          取出該電路的第一條記錄,加上ignore nulls后,如果第一條是判斷的那個字段是空的,則默認取下一條,結果如下所示:

           

           

          --lag() over()函數用法(取出前n行數據)
          lag(expresstion,<offset>,<default>)
          with a as 
          (select 1 id,'a' name from dual
           union
           select 2 id,'b' name from dual
           union
           select 3 id,'c' name from dual
           union
           select 4 id,'d' name from dual
           union
           select 5 id,'e' name from dual
          ) 
          select id,name,lag(id,1,'')over(order by name) from a;

          --lead() over()函數用法(取出后N行數據)

          lead(expresstion,<offset>,<default>)
          with a as 
          (select 1 id,'a' name from dual
           union
           select 2 id,'b' name from dual
           union
           select 3 id,'c' name from dual
           union
           select 4 id,'d' name from dual
           union
           select 5 id,'e' name from dual
          ) 
          select id,name,lead(id,1,'')over(order by name) from a;

          --ratio_to_report(a)函數用法 Ratio_to_report() 括號中就是分子,over() 括號中就是分母
          with a as (select 1 a from dual
                     union all
          select 1 a from dual
                     union  all
          select 1 a from dual
                     union all
          select 2 a from dual
                     union all 
          select 3 a from dual
                     union all
          select 4 a from dual
                     union all
          select 4 a from dual
                     union all
          select 5 a from dual
                     )
          select a, ratio_to_report(a)over(partition by a) b from a 
          order by a; 

          with a as (select 1 a from dual
                     union all
          select 1 a from dual
                     union  all
          select 1 a from dual
                     union all
          select 2 a from dual
                     union all 
          select 3 a from dual
                     union all
          select 4 a from dual
                     union all
          select 4 a from dual
                     union all
          select 5 a from dual
                     )
          select a, ratio_to_report(a)over() b from a --分母缺省就是整個占比
          order by a; 

          with a as (select 1 a from dual
                     union all
          select 1 a from dual
                     union  all
          select 1 a from dual
                     union all
          select 2 a from dual
                     union all 
          select 3 a from dual
                     union all
          select 4 a from dual
                     union all
          select 4 a from dual
                     union all
          select 5 a from dual
                     )
          select a, ratio_to_report(a)over() b from a
          group by a order by a;--分組后的占比

           

          percent_rank用法
          計算方法:所在組排名序號-1除以該組所有的行數-1,如下所示自己計算的pr1與通過percent_rank函數得到的值是一樣的:
          SELECT a.deptno,
                 a.ename,
                 a.sal,
                 a.r,
                 b.n,
                 (a.r-1)/(n-1) pr1,
                 percent_rank() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
            FROM (SELECT deptno,
                         ename,
                         sal,
                         rank() over(PARTITION BY deptno ORDER BY sal) r --計算出在組中的排名序號
                    FROM emp
                   ORDER BY deptno, sal) a,
                 (SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b --按部門計算每個部門的所有成員數
           WHERE a.deptno = b.deptno;


           

          cume_dist函數

          計算方法:所在組排名序號除以該組所有的行數,但是如果存在并列情況,則需加上并列的個數-1,
                    如下所示自己計算的pr1與通過percent_rank函數得到的值是一樣的:
          SELECT a.deptno,
                 a.ename,
                 a.sal,
                 a.r,
                 b.n,
                 c.rn,
                 (a.r + c.rn - 1) / n pr1,
                 cume_dist() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
            FROM (SELECT deptno,
                         ename,
                         sal,
                         rank() over(PARTITION BY deptno ORDER BY sal) r
                    FROM emp
                   ORDER BY deptno, sal) a,
                 (SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b,
                 (SELECT deptno, r, COUNT(1) rn,sal
                    FROM (SELECT deptno,sal,
                                 rank() over(PARTITION BY deptno ORDER BY sal) r
                            FROM emp)
                   GROUP BY deptno, r,sal
                   ORDER BY deptno) c --c表就是為了得到每個部門員工工資的一樣的個數
           WHERE a.deptno = b.deptno
             AND a.deptno = c.deptno(+)
             AND a.sal = c.sal;
           

           
          percentile_cont函數

          含義:輸入一個百分比(該百分比就是按照percent_rank函數計算的值),返回該百分比位置的平均值
          如下,輸入百分比為0.7,因為0.7介于0.6和0.8之間,因此返回的結果就是0.6對應的sal的1500加上0.8對應的sal的1600平均
          SELECT ename,
                 sal,
                 deptno,
                 percentile_cont(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
                 percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
            FROM emp
           WHERE deptno IN (30, 60);

           

          若輸入的百分比為0.6,則直接0.6對應的sal值,即1500
          SELECT ename,
                 sal,
                 deptno,
                 percentile_cont(0.6) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
                 percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
            FROM emp
           WHERE deptno IN (30, 60);
           

          PERCENTILE_DISC函數

          功能描述:返回一個與輸入的分布百分比值相對應的數據值,分布百分比的計算方法見函數CUME_DIST,如果沒有正好對應的數據值,就取大于該分布值的下一個值。
          注意:本函數與PERCENTILE_CONT的區別在找不到對應的分布值時返回的替代值的計算方法不同

          SAMPLE:下例中0.7的分布值在部門30中沒有對應的Cume_Dist值,所以就取下一個分布值0.83333333所對應的SALARY來替代

          SELECT ename,
                 sal,
                 deptno,
                 percentile_disc(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Disc",
                 cume_dist() over(PARTITION BY deptno ORDER BY sal) "Cume_Dist"
            FROM emp
           WHERE deptno IN (30, 60);
           

           

          posted on 2013-07-31 12:23 心硯 閱讀(324) 評論(0)  編輯  收藏 所屬分類: DB

          <2025年7月>
          293012345
          6789101112
          13141516171819
          20212223242526
          272829303112
          3456789

          導航

          統計

          常用鏈接

          留言簿(2)

          隨筆分類

          文章分類

          文章檔案

          Forum

          搜索

          最新評論

          閱讀排行榜

          評論排行榜

          主站蜘蛛池模板: 邛崃市| 荥阳市| 浙江省| 建瓯市| 无锡市| 安岳县| 连平县| 东乌珠穆沁旗| 南华县| 连江县| 木兰县| 宝山区| 乐陵市| 凤台县| 伊金霍洛旗| 苍山县| 东港市| 开封市| 抚顺市| 特克斯县| 册亨县| 郴州市| 汝州市| 东光县| 翁源县| 和林格尔县| 五峰| 玉环县| 玛多县| 陈巴尔虎旗| 沾益县| 延津县| 平昌县| 景宁| 古田县| 深圳市| 华坪县| 布拖县| 汉阴县| 南江县| 信宜市|