
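          In day-to-day work you may try to create a unique index on one or more columns of a table and have the system report ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found.

          Below is a summary of several ways to find and delete duplicate records (using table CZ as the example):
          The structure of table CZ is as follows: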
          SQL> desc cz
          Name Null? Type
          ----------------------------------------- -------- ------------------

          C1 NUMBER(10)
          C10 NUMBER(5)
          C20 VARCHAR2(3)

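          How deleting duplicate records works:
          (1). In Oracle, every record has a rowid. The rowid is unique across the whole database and identifies the data file, block, and row in which Oracle stores the record.

          (2). Duplicate records may have identical values in every column, but their rowids always differ. So it is enough to identify, within each group of duplicates, the record with the largest rowid, keep it, and delete all the others.

          The criterion for a duplicate is:
          records count as duplicates only when the values of all three columns C1, C10, and C20 are the same.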
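
To make point (1) concrete, here is a minimal sketch that shows each row's rowid next to its data and splits the rowid into its file, block, and row components with the standard DBMS_ROWID package. The column aliases (file_no, block_no, row_no) are illustrative names only, and the values returned depend entirely on how the table happens to be stored.

```sql
-- Show every row of cz together with its rowid and the pieces the rowid encodes.
select rowid,
       dbms_rowid.rowid_relative_fno(rowid) as file_no,   -- relative data file number
       dbms_rowid.rowid_block_number(rowid) as block_no,  -- block within that file
       dbms_rowid.rowid_row_number(rowid)   as row_no,    -- row slot within the block
       c1, c10, c20
  from cz
 order by c1, c10, c20, rowid;
```

Rows with identical C1, C10, and C20 values sort next to each other but carry different rowids, which is the property the delete statements rely on.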

          Table CZ contains 20 records in total, 16 of which belong to groups of duplicates:
          SQL>set pagesize 100
          SQL>select * from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          2 3 che
          2 3 che
          2 3 che
          3 4 dff
          3 4 dff
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          20 rows selected.

          1. Several ways to find duplicate records:
          (1).SQL>select * from cz group by c1,c10,c20 having count(*) >1;
          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff

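          (2).SQL>select distinct * from cz;
          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          (Unlike query (1), this returns every distinct combination, including values that occur only once, so it shows the de-duplicated data rather than only the duplicated values. Row order may differ.)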

          (3).SQL>select * from cz a where rowid=(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20);
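          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          (This query returns the one row per key that has the largest rowid, so records that occur only once are listed as well. Row order may differ.)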
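
Before deleting anything it helps to know how many copies each key has and exactly which rows would go. The two queries below are a sketch of that check; the alias copies is an illustrative name, not something from the original session.

```sql
-- How many copies exist for each duplicated key?
select c1, c10, c20, count(*) as copies
  from cz
 group by c1, c10, c20
having count(*) > 1
 order by c1, c10, c20;

-- Which rows are surplus, i.e. would be removed if the copy with the
-- largest rowid in each group is the one that is kept?
select a.rowid, a.c1, a.c10, a.c20
  from cz a
 where a.rowid < (select max(b.rowid)
                    from cz b
                   where b.c1 = a.c1
                     and b.c10 = a.c10
                     and b.c20 = a.c20);
```

Tallying the 20 sample rows listed earlier, the first query should report 8 copies of (1, 2, dsf), 5 of (2, 3, che), and 3 of (3, 4, dff), and the second should list the 13 surplus rows (16 duplicated rows minus the 3 that are kept).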

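          2. Several ways to delete duplicate records:
          (1). Suited to cases with a large number of duplicate records (when columns C1, C10, and C20 are indexed, either of the following statements is very efficient):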

          SQL>delete cz where (c1,c10,c20) in (select c1,c10,c20 from cz group by c1,c10,c20 having count(*)>1) and rowid not in
          (select min(rowid) from cz group by c1,c10,c20 having count(*)>1);

          SQL>delete cz where rowid not in(select min(rowid) from cz group by c1,c10,c20);

           

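          (2). Suited to cases with a small number of duplicate records (note that with a large number of duplicates, any of the following statements is very slow):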

          SQL>delete from cz a where a.rowid!=(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20);

          SQL>delete from cz a where a.rowid<(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20);

          SQL>delete from cz a where rowid <(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20);
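
One caveat with the three statements above: desc cz shows no NOT NULL constraints, and a comparison such as a.c1=b.c1 is never true when either side is NULL, so duplicate rows with a NULL in a key column would be left untouched. The sample data contains no NULLs, so the result there is the same. Below are two sketches of NULL-safe variants, offered as assumptions about what one might want rather than as the author's method; the names rid and rn are illustrative.

```sql
-- Run only one of the two variants.

-- Variant A: correlated subquery with explicit NULL handling;
-- keeps the row with the largest rowid for each key.
delete from cz a
 where a.rowid < (select max(b.rowid)
                    from cz b
                   where (b.c1  = a.c1  or (b.c1  is null and a.c1  is null))
                     and (b.c10 = a.c10 or (b.c10 is null and a.c10 is null))
                     and (b.c20 = a.c20 or (b.c20 is null and a.c20 is null)));

-- Variant B: analytic ROW_NUMBER(); PARTITION BY groups NULLs together,
-- the same way GROUP BY does. rn = 1 is the row with the largest rowid
-- in each group, and every other row in the group is deleted.
delete from cz
 where rowid in (select rid
                   from (select rowid as rid,
                                row_number() over (partition by c1, c10, c20
                                                   order by rowid desc) as rn
                           from cz)
                  where rn > 1);
```

Variant B reads the table once to number the rows instead of running a subquery per row, though how it performs on a given table is something to measure rather than assume.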

           

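          (3). Suited to cases with a small number of duplicate records (temporary table method) -- a very crude approach
          SQL>create table test as select distinct * from cz; (create a temporary table test to hold the de-duplicated records)

          SQL>truncate table cz; (empty table cz but keep its structure)

          SQL>insert into cz select * from test; (insert the contents of the temporary table test back into cz)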

           

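          (4). Suited to cases with a large number of duplicate records (EXCEPTIONS INTO clause method): -- quite an interesting approach
          The EXCEPTIONS INTO clause of the alter table command can also identify the duplicate records in a table. This method is a little more involved: to use the "exceptions into" clause, you must first create the EXCEPTIONS table. The SQL script that creates it is utlexcpt.sql. Oracle stores this file in slightly different places on win2000 and UNIX systems: on win2000 the script is in the $ORACLE_HOME\Ora90\rdbms\admin directory, and on UNIX it is in the $ORACLE_HOME/rdbms/admin directory.

          The steps are as follows: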
          SQL>@?/rdbms/admin/utlexcpt.sql

          Table created.

          SQL>desc exceptions
          Name Null? Type
          ----------------------------------------- -------- --------------

          ROW_ID ROWID
          OWNER VARCHAR2(30)
          TABLE_NAME VARCHAR2(30)
          CONSTRAINT VARCHAR2(30)

          SQL>alter table cz add constraint cz_unique unique(c1,c10,c20) exceptions into exceptions;
          *
          ERROR at line 1:
          ORA-02299: cannot validate (TEST.CZ_UNIQUE) - duplicate keys found

          SQL>create table dups as select * from cz where rowid in (select row_id from exceptions);

          Table created.

          SQL>select * from dups;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          2 3 che
          2 3 che
          2 3 che
          3 4 dff
          3 4 dff
          3 4 dff

          16 rows selected.

          SQL>select row_id from exceptions;

          ROW_ID
          ------------------
          AAAHD/AAIAAAADSAAA
          AAAHD/AAIAAAADSAAB
          AAAHD/AAIAAAADSAAC
          AAAHD/AAIAAAADSAAF
          AAAHD/AAIAAAADSAAH
          AAAHD/AAIAAAADSAAI
          AAAHD/AAIAAAADSAAG
          AAAHD/AAIAAAADSAAD
          AAAHD/AAIAAAADSAAE
          AAAHD/AAIAAAADSAAJ
          AAAHD/AAIAAAADSAAK
          AAAHD/AAIAAAADSAAL
          AAAHD/AAIAAAADSAAM
          AAAHD/AAIAAAADSAAN
          AAAHD/AAIAAAADSAAO
          AAAHD/AAIAAAADSAAP

          16 rows selected.

          SQL>delete from cz where rowid in ( select row_id from exceptions);

          16 rows deleted.

          SQL>insert into cz select distinct * from dups;

          3 rows created.

          SQL>select *from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          7 rows selected.

          The result shows that the duplicate records have been deleted.
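
With the duplicates gone, the unique index that originally failed with ORA-01452 can be created. The following is a short sketch of that last step; cz_uk is a hypothetical index name chosen for this example.

```sql
-- Make the deletes and inserts permanent; until this point they are
-- ordinary DML and could still be rolled back.
commit;

-- Should return no rows once every (c1, c10, c20) key is unique.
select c1, c10, c20, count(*) as copies
  from cz
 group by c1, c10, c20
having count(*) > 1;

-- cz_uk is a hypothetical name; any unused index name works.
create unique index cz_uk on cz (c1, c10, c20);
```

If the check query still returns rows, the create unique index statement will fail again with ORA-01452, so it is worth running the check first.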

          posted on 2010-03-06 12:03 by xzc, category: Oracle

          Comments:
          # re: Deleting duplicate records with rowid!!! 2010-03-06 12:49 | xzc
          delete rpt_index_inst_anly_reports a
          where rowid <> (select max(rowid)
          from rpt_index_inst_anly_reports b
          where a.data_date = b.data_date
          and a.report_id = b.report_id
          and a.index_id = b.index_id
          and a.latn_id = b.latn_id
          and a.rowno = b.rowno
          and a.colno = b.colno
          and a.data_date = 200903
          and a.report_id = 9
          and a.latn_id = 1202);
            
          # re: Deleting duplicate records with rowid!!! 2010-03-06 12:49 | xzc
          select *
          from rpt_index_inst_mon
          where (index_id, acct_month, latn_id, business_id, dimm1, dimm2, dimm3, dimm4, dimm5, index_value) in
          (select index_id, acct_month, latn_id, business_id, dimm1, dimm2, dimm3, dimm4, dimm5, index_value
          from rpt_index_inst_mon
          where acct_month = 201002
          --and latn_id = 1200
          and index_id in (select index_id from TSM_CALC_GROUP_INDEX_MAP where calc_group_id in (1300, 1301, 1302))
          group by index_id, acct_month, latn_id, business_id, dimm1, dimm2, dimm3, dimm4, dimm5, index_value
          having count(*) > 1)
          and rowid not in
          (select min(rowid)
          from rpt_index_inst_mon
          where acct_month = 201002
          --and latn_id = 1200
          and index_id in (select index_id from TSM_CALC_GROUP_INDEX_MAP where calc_group_id in (1300, 1301, 1302))
          group by index_id, acct_month, latn_id, business_id, dimm1, dimm2, dimm3, dimm4, dimm5, index_value
          having count(*) > 1);
            
          # re: Deleting duplicate records with rowid!!! 2010-03-06 12:49 | xzc
          select *
          from rpt_index_inst_mon a
          where rowid not in
          (select min(rowid)
          from rpt_index_inst_mon
          where acct_month = 201002
          and latn_id = 1200
          and index_id in (select index_id from TSM_CALC_GROUP_INDEX_MAP where calc_group_id in (1300, 1301, 1302))
          group by index_id, acct_month, latn_id, business_id, dimm1, dimm2, dimm3, dimm4, dimm5, index_value)
          and a.acct_month = 201002
          and a.latn_id = 1200
          and a.index_id in (select index_id from TSM_CALC_GROUP_INDEX_MAP where calc_group_id in (1300, 1301, 1302))
            
          # re: Deleting duplicate records with rowid!!! 2010-03-06 12:50 | xzc
          select *
          from rpt_index_inst_mon a
          where rowid = (select max(rowid)
          from rpt_index_inst_mon b
          where a.index_id = b.index_id
          and a.acct_month = b.acct_month
          and a.latn_id = b.latn_id
          and a.business_id = b.business_id
          and a.dimm1 = b.dimm1
          and a.dimm2 = b.dimm2
          and a.dimm3 = b.dimm3
          and a.dimm4 = b.dimm4
          and a.dimm5 = b.dimm5
          and a.index_value = b.index_value)
          and a.index_id in
          (select index_id
          from rpt_index_inst_mon
          where acct_month = 201002
          --and latn_id = 1200
          and index_id in (select index_id from TSM_CALC_GROUP_INDEX_MAP where calc_group_id in (1300, 1301, 1302))
          group by index_id, acct_month, latn_id, business_id, dimm1, dimm2, dimm3, dimm4, dimm5, index_value
          having count(*) > 1)
          and a.acct_month = 201002
          --and a.latn_id = 1200
            
          # re: Deleting duplicate records with rowid!!! 2010-05-13 15:39 | xzc
          delete from rpt_index_inst_anly_group a
          where index_inst_id <> (select min(index_inst_id)
          from rpt_index_inst_anly_group b
          where a.index_id = b.index_id
          and a.data_date = b.data_date
          and a.latn_id = b.latn_id
          and a.business_id = b.business_id
          and a.cust_group_id = b.cust_group_id
          and a.data_date = 201001
          and a.latn_id = 1100
          and a.index_id in (SELECT a.index_id
          FROM tsm_index_value a, tsm_report_index_map b
          WHERE b.index_id = a.index_id
          AND b.calc_mode = '0'
          AND b.report_id = 9));
            
          # re: Deleting duplicate records with rowid!!! 2011-01-20 17:27 | xzc
          -- Take the latest record from the imported network elements [remove duplicates].sql
          -- Method 1
          select *
          from infuser.inf_cc_ne a
          where cc_ne_id = (select max(cc_ne_id) from infuser.inf_cc_ne b where b.serv_id = a.serv_id);
          -- Method 2
          select * from infuser.inf_cc_ne a where cc_ne_id in (select max(cc_ne_id) from infuser.inf_cc_ne b group by b.serv_id);
          -- Method 3 (not necessarily accurate, since the largest rowid is not guaranteed to be the newest row)
          select *
          from infuser.inf_cc_ne a
          where rowid = (select max(rowid) from infuser.inf_cc_ne b where b.serv_id = a.serv_id);
          -- Method 4 (not necessarily accurate; this is probably the most effective one)
          select * from infuser.inf_cc_ne a where rowid in (select max(rowid) from infuser.inf_cc_ne b group by b.serv_id);
            
          # re: Deleting duplicate records with rowid!!! 2011-04-25 15:33 | xzc
          Remove duplicate records
          delete from oth_quality_check_result_list
          where list_id not in (select min(a.list_id)
          from oth_quality_check_result_list a
          where a.task_id = @FWFNO@
          and a.rule_id = @RULEID@
          and a.lan_id = @LANID@
          group by a.column_1)
            
          # re: Deleting duplicate records with rowid!!! 2011-05-31 10:05 | xzc
          -- Group by acc_nbr and take the record with the latest update time.
          select count(*)
          from infocs.subs a, infocs.prod b
          where (a.acc_nbr, a.update_date) in (select acc_nbr, max(update_date) from infocs.subs group by acc_nbr)
          and a.subs_id = b.prod_id
          and b.prod_state = 'B';
            