海鷗航際

          JAVA站
          How to Use Rowid to Find and Delete Duplicate Records in an Oracle Table

          In day-to-day work you may try to create a unique index on one or more columns of a table and have the system report ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found.

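          Below is a summary of several ways to find and delete duplicate records, using table CZ as the example.
          Table CZ has the following structure: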
          SQL> desc cz
          Name                                      Null?    Type
          ----------------------------------------- -------- ------------------
          C1                                                 NUMBER(10)
          C10                                                NUMBER(5)
          C20                                                VARCHAR2(3)

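          How the deletion methods work:
          (1). In Oracle every record has a rowid, which uniquely identifies the row within its table. The rowid encodes which data file, block, and row slot the record occupies.

          (2). Duplicate records may match in every column, but their rowids always differ. So it is enough to identify the record with the largest rowid in each group of duplicates, keep it, and delete all the others. Keeping the smallest rowid instead works equally well, and some statements below do that.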
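
To make the principle concrete, the queries below show the rowid next to each row and break it into its parts. This is a minimal sketch, assuming CZ is an ordinary heap table and that the DBMS_ROWID package is accessible to the current user. The column aliases are illustrative only.

```sql
-- List every row with its rowid. Rows with identical column values
-- still show different rowids.
select rowid, c1, c10, c20
  from cz
 order by c1, c10, c20, rowid;

-- Decompose each rowid into the components it encodes.
select rowid,
       dbms_rowid.rowid_object(rowid)       as data_object_id,
       dbms_rowid.rowid_relative_fno(rowid) as relative_file_no,
       dbms_rowid.rowid_block_number(rowid) as block_no,
       dbms_rowid.rowid_row_number(rowid)   as row_no
  from cz;
```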

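          The criterion for a duplicate record is:
          two records are duplicates only when the values of all three columns C1, C10, and C20 are the same.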

          Table CZ contains 20 records in total:
          SQL>set pagesize 100
          SQL>select * from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          2 3 che
          2 3 che
          2 3 che
          3 4 dff
          3 4 dff
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          20 rows selected.

          1. Several ways to find duplicate records:
          (1).SQL>select c1,c10,c20 from cz group by c1,c10,c20 having count(*) >1;
          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff

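          (2).SQL>select distinct * from cz;
          (This returns the deduplicated contents of the table, so records that occur only once are listed too. Without an ORDER BY the row order may differ.)

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc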

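          (3).SQL>select * from cz a where rowid=(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20);
          (This returns the one record that will be kept from each group, the one with the largest rowid, so records that occur only once are listed too.)

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc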
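
Before deleting anything it helps to see how many copies each combination has and exactly which rows are surplus. The two queries below are a sketch built only from the table and columns used in this post. The aliases `copies`, `a`, and `b` are illustrative. Note that the correlated comparison uses `=`, so rows with a NULL in any of the three columns never match each other and would need separate handling.

```sql
-- How many copies does each duplicated combination have?
select c1, c10, c20, count(*) as copies
  from cz
 group by c1, c10, c20
having count(*) > 1
 order by c1, c10, c20;

-- Which physical rows are surplus, i.e. every copy except the one
-- with the largest rowid in its group?
select a.rowid, a.c1, a.c10, a.c20
  from cz a
 where a.rowid < (select max(b.rowid)
                    from cz b
                   where b.c1  = a.c1
                     and b.c10 = a.c10
                     and b.c20 = a.c20);
```

On the 20 sample rows shown earlier, the first query should report 8, 5, and 3 copies, and the second should list 13 rows.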

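          2. Several ways to delete duplicate records:
          (1). Suited to tables with many duplicate records (when columns C1, C10, and C20 are indexed, the statements below are very efficient). Either statement can be used; both keep the record with the smallest rowid in each group: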
          SQL>delete cz where (c1,c10,c20) in (select c1,c10,c20 from cz group by c1,c10,c20 having count(*)>1) and rowid not in
          (select min(rowid) from cz group by c1,c10,c20 having count(*)>1);

          SQL>delete cz where rowid not in(select min(rowid) from cz group by c1,c10,c20);

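          (2). Suited to tables with few duplicate records (note that with many duplicate records the statements below are very slow). The three statements are equivalent; run any one of them: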
          SQL>delete from cz a where a.rowid!=(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20);

          SQL>delete from cz a where a.rowid<(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20);

          SQL>delete from cz a where rowid <(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20);
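
Because these deletes are ordinary DML, they can be checked before they are made permanent. The sequence below is one possible workflow, not something the original post prescribes: count the surplus rows, delete them, confirm that no duplicates remain, then commit. The alias `surplus_rows` is illustrative only.

```sql
-- 1. Count the rows the delete is expected to remove.
select count(*) as surplus_rows
  from cz a
 where a.rowid < (select max(b.rowid)
                    from cz b
                   where b.c1 = a.c1 and b.c10 = a.c10 and b.c20 = a.c20);

-- 2. Delete every copy except the one with the largest rowid.
delete from cz a
 where a.rowid < (select max(b.rowid)
                    from cz b
                   where b.c1 = a.c1 and b.c10 = a.c10 and b.c20 = a.c20);

-- 3. Verify: this query should now return no rows.
select c1, c10, c20
  from cz
 group by c1, c10, c20
having count(*) > 1;

-- 4. Make the change permanent, or issue ROLLBACK instead if step 3
--    still returned rows or the deleted count differs from step 1.
commit;
```

With the sample data in this post, step 1 should report 13 rows, and the delete should report the same number.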

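          (3). Suited to tables with few duplicate records (temporary table method):
          SQL>create table test as select distinct * from cz; (create a temporary table test that holds one copy of each distinct record)

          SQL>truncate table cz; (empty table cz but keep its structure)

          SQL>insert into cz select * from test; (insert the contents of temporary table test back into cz)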
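
The temporary table method can be written with a safety check and a cleanup step, as sketched below. The table name `cz_distinct` is a hypothetical stand-in for the `test` table used above. The sketch assumes that no enabled foreign keys reference CZ, since TRUNCATE would fail in that case.

```sql
-- Copy one instance of each distinct record into a work table.
create table cz_distinct as
  select distinct * from cz;

-- Check the copy before destroying the original data.
select count(*) from cz_distinct;

-- TRUNCATE is DDL: it commits implicitly and cannot be rolled back.
truncate table cz;

-- Load the deduplicated rows back and make the insert permanent.
insert into cz select * from cz_distinct;
commit;

-- Remove the work table once cz has been verified.
drop table cz_distinct;
```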

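          (4). Suited to tables with many duplicate records (EXCEPTIONS INTO clause method):
          The EXCEPTIONS INTO clause of the ALTER TABLE command can also identify the duplicate records in a table. This method is slightly more involved: before the EXCEPTIONS INTO clause can be used, the EXCEPTIONS table must be created. The SQL script that creates it is utlexcpt.sql. Its location differs slightly between Windows 2000 and UNIX systems: on Windows 2000 the script is in the $ORACLE_HOME\Ora90\rdbms\admin directory, and on UNIX it is in the $ORACLE_HOME/rdbms/admin directory.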

          The steps are as follows:
          SQL>@?/rdbms/admin/utlexcpt.sql

          Table created.

          SQL>desc exceptions
          Name                                      Null?    Type
          ----------------------------------------- -------- --------------
          ROW_ID                                             ROWID
          OWNER                                              VARCHAR2(30)
          TABLE_NAME                                         VARCHAR2(30)
          CONSTRAINT                                         VARCHAR2(30)

          SQL>alter table cz add constraint cz_unique unique(c1,c10,c20) exceptions into exceptions;
          *
          ERROR at line 1:
          ORA-02299: cannot validate (TEST.CZ_UNIQUE) - duplicate keys found

          SQL>create table dups as select * from cz where rowid in (select row_id from exceptions);

          Table created.

          SQL>select * from dups;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          1 2 dsf
          1 2 dsf
          1 2 dsf
          1 2 dsf
          2 3 che
          2 3 che
          2 3 che
          2 3 che
          3 4 dff
          3 4 dff
          3 4 dff

          16 rows selected.

          SQL>select row_id from exceptions;

          ROW_ID
          ------------------
          AAAHD/AAIAAAADSAAA
          AAAHD/AAIAAAADSAAB
          AAAHD/AAIAAAADSAAC
          AAAHD/AAIAAAADSAAF
          AAAHD/AAIAAAADSAAH
          AAAHD/AAIAAAADSAAI
          AAAHD/AAIAAAADSAAG
          AAAHD/AAIAAAADSAAD
          AAAHD/AAIAAAADSAAE
          AAAHD/AAIAAAADSAAJ
          AAAHD/AAIAAAADSAAK
          AAAHD/AAIAAAADSAAL
          AAAHD/AAIAAAADSAAM
          AAAHD/AAIAAAADSAAN
          AAAHD/AAIAAAADSAAO
          AAAHD/AAIAAAADSAAP

          16 rows selected.

          SQL>delete from cz where rowid in ( select row_id from exceptions);

          16 rows deleted.

          SQL>insert into cz select distinct * from dups;

          3 rows created.

          SQL>select * from cz;

          C1 C10 C20
          ---------- ---------- ---
          1 2 dsf
          2 3 che
          3 4 dff
          4 5 err
          5 3 dar
          6 1 wee
          7 2 zxc

          7 rows selected.

          The result shows that the duplicate records have been deleted.
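
Once the duplicates are gone, the unique constraint that failed earlier can be added again, and the helper data can be cleaned up. The statements below are a sketch of that follow-up, assuming the failed ALTER TABLE left no constraint named CZ_UNIQUE behind and that the DUPS table is no longer needed.

```sql
-- The same constraint should now validate without ORA-02299.
alter table cz add constraint cz_unique unique (c1, c10, c20);

-- Confirm that the constraint exists and is enabled.
select constraint_name, constraint_type, status
  from user_constraints
 where table_name = 'CZ';

-- Clear the rowids recorded for CZ and drop the helper table.
delete from exceptions where table_name = 'CZ';
commit;
drop table dups;
```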