海鷗航際

          JAVA站
          posts - 11, comments - 53, trackbacks - 1, articles - 102
          在Oracle中如何利用Rowid查找和刪除表中的重復記錄

          平時工作中可能會遇到當試圖對庫表中的某一列或幾列創建唯一索引時,系統提示 ORA-01452 :不能創建唯一索引,發現重復記錄。

          下面總結一下幾種查找和刪除重復記錄的方法(以表CZ為例):
          表CZ的結構如下:
          SQL> desc cz
          Name Null? Type
          ----------------------------------------- -------- ------------------

          C1 NUMBER(10)
          C10 NUMBER(5)
          C20 VARCHAR2(3)

          刪除重復記錄的方法原理:
          (1).在Oracle中,每一條記錄都有一個rowid,rowid在整個數據庫中是唯一的,rowid確定了每條記錄是在Oracle中的哪一個數據文件、塊、行上。

          (2).在重復的記錄中,可能所有列的內容都相同,但rowid不會相同,所以只要確定出重復記錄中那些具有最大rowid的就可以了,其余全部刪除。

          重復記錄判斷的標準是:
          C1,C10和C20這三列的值都相同才算是重復記錄。

          經查看表CZ總共有16條記錄:
          SQL>set pagesize 100
          SQL>select * from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          2 3 che
          2 3 che
          2 3 che
          3 4 dff
          3 4 dff
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          20 rows selected.

          1.查找重復記錄的幾種方法:
          (1).SQL>select * from cz group by c1,c10,c20 having count(*) >1;
          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff

          (2).SQL>select distinct * from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff

          (3).SQL>select * from cz a where rowid=(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20);
          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff

          2.刪除重復記錄的幾種方法:
          (1).適用于有大量重復記錄的情況(在C1,C10和C20列上建有索引的時候,用以下語句效率會很高):
          SQL>delete cz where (c1,c10,c20) in (select c1,c10,c20 from cz group by c1,c10,c20 having count(*)>1) and rowid not in
          (select min(rowid) from cz group by c1,c10,c20 having count(*)>1);

          SQL>delete cz where rowid not in(select min(rowid) from cz group by c1,c10,c20);

          (2).適用于有少量重復記錄的情況(注意,對于有大量重復記錄的情況,用以下語句效率會很低):
          SQL>delete from cz a where a.rowid!=(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20);

          SQL>delete from cz a where a.rowid<(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20);

          SQL>delete from cz a where rowid <(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20);

          (3).適用于有少量重復記錄的情況(臨時表法):
          SQL>create table test as select distinct * from cz; (建一個臨時表test用來存放重復的記錄)

          SQL>truncate table cz; (清空cz表的數據,但保留cz表的結構)

          SQL>insert into cz select * from test; (再將臨時表test里的內容反插回來)

          (4).適用于有大量重復記錄的情況(Exception into 子句法):
          采用alter table 命令中的 Exception into 子句也可以確定出庫表中重復的記錄。這種方法稍微麻煩一些,為了使用“excepeion into ”子句,必須首先創建 EXCEPTIONS 表。創建該表的 SQL 腳本文件為 utlexcpt.sql 。對于win2000系統和 UNIX 系統, Oracle 存放該文件的位置稍有不同,在win2000系統下,該腳本文件存放在$ORACLE_HOMEOra90rdbmsadmin 目錄下;而對于 UNIX 系統,該腳本文件存放在$ORACLE_HOME/rdbms/admin 目錄下。

          具體步驟如下:
          SQL>@?/rdbms/admin/utlexcpt.sql

          Table created.

          SQL>desc exceptions
          Name Null? Type
          ----------------------------------------- -------- --------------

          ROW_ID ROWID
          OWNER VARCHAR2(30)
          TABLE_NAME VARCHAR2(30)
          CONSTRAINT VARCHAR2(30)

          SQL>alter table cz add constraint cz_unique unique(c1,c10,c20) exceptions into exceptions;
          *
          ERROR at line 1:
          ORA-02299: cannot validate (TEST.CZ_UNIQUE) - duplicate keys found

          SQL>create table dups as select * from cz where rowid in (select row_id from exceptions);

          Table created.

          SQL>select * from dups;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          2 3 che
          2 3 che
          2 3 che
          3 4 dff
          3 4 dff
          3 4 dff

          16 rows selected.

          SQL>select row_id from exceptions;

          ROW_ID
          ------------------
          AAAHD/AAIAAAADSAAA
          AAAHD/AAIAAAADSAAB
          AAAHD/AAIAAAADSAAC
          AAAHD/AAIAAAADSAAF
          AAAHD/AAIAAAADSAAH
          AAAHD/AAIAAAADSAAI
          AAAHD/AAIAAAADSAAG
          AAAHD/AAIAAAADSAAD
          AAAHD/AAIAAAADSAAE
          AAAHD/AAIAAAADSAAJ
          AAAHD/AAIAAAADSAAK
          AAAHD/AAIAAAADSAAL
          AAAHD/AAIAAAADSAAM
          AAAHD/AAIAAAADSAAN
          AAAHD/AAIAAAADSAAO
          AAAHD/AAIAAAADSAAP

          16 rows selected.

          SQL>delete from cz where rowid in ( select row_id from exceptions);

          16 rows deleted.

          SQL>insert into cz select distinct * from dups;

          3 rows created.

          SQL>select *from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          7 rows selected.

          從結果里可以看到重復記錄已經刪除。
          主站蜘蛛池模板: 玉田县| 民乐县| 民勤县| 化隆| 乐安县| 临海市| 沙坪坝区| 岐山县| 丰顺县| 河池市| 阜宁县| 冀州市| 论坛| 赫章县| 邵阳市| 平湖市| 咸阳市| 繁峙县| 天等县| 西丰县| 尼玛县| 黎城县| 仪陇县| 洛川县| 左权县| 濉溪县| 福安市| 云梦县| 靖边县| 尖扎县| 维西| 印江| 紫云| 宕昌县| 临猗县| 会东县| 新蔡县| 成都市| 永安市| 江城| 乐昌市|