tbwshc

          RAC Nodes Restart Frequently with ORA-29702

          An Oracle 10.2.0.4 RAC for Windows database was experiencing frequent node restarts.

           

           

          According to the alert log, the restarts of the current node generally happen just as the node is starting up or shutting down:

          Thu May 03 17:22:45 2012
          cluster interconnect IPC version:Oracle 9i Winsock2 TCP/IP IPC
          IPC Vendor 0 proto 0
          Version 0.0
          PMON started with pid=2, OS id=1616
          DIAG started with pid=3, OS id=120
          PSP0 started with pid=4, OS id=6104
          LMON started with pid=5, OS id=3844
          LMD0 started with pid=6, OS id=6120
          LMS0 started with pid=7, OS id=3548
          LMS1 started with pid=8, OS id=5688
          LMS2 started with pid=9, OS id=3636
          LMS3 started with pid=10, OS id=3588
          MMAN started with pid=11, OS id=3168
          DBW0 started with pid=12, OS id=3208
          DBW1 started with pid=13, OS id=5784
          LGWR started with pid=14, OS id=6208
          CKPT started with pid=15, OS id=3100
          SMON started with pid=16, OS id=5948
          RECO started with pid=17, OS id=3748
          CJQ0 started with pid=18, OS id=7152
          MMON started with pid=19, OS id=4552
          MMNL started with pid=20, OS id=6940
          Thu May 03 17:22:46 2012
          lmon registered with NM - instance id 1 (internal mem no 0)
          Thu May 03 17:22:46 2012
          Reconfiguration started (old inc 0, new inc 8)
          List of nodes:
          0 1
          Global Resource Directory frozen
          * allocate domain 0, invalid = TRUE
          Communication channels reestablished
          Error: KGXGN aborts the instance (6)
          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_lmon_3844.trc:
          ORA-29702: ???????????

          LMON: terminating instance due to error 29702
          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_pmon_1616.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_psp0_6104.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_dbw0_3208.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_mman_3168.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_dbw1_5784.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_ckpt_3100.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:51 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_lgwr_6208.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:52 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_reco_3748.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:52 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_smon_5948.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:52 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_lms1_5688.trc:
          ORA-29702: ???????????

          Thu May 03 17:22:52 2012
          Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_lms0_3548.trc:
          ORA-29702: ???????????

          Instance terminated by LMON, pid = 3844
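
An alert log like the one above can run to thousands of lines, so it can help to pull out only the termination events and the trace files they point to. The following is a minimal sketch, not an Oracle tool: the function name `summarize_alert_log` and both regular expressions are assumptions based solely on the line formats visible in this excerpt.

```python
import re
from collections import Counter

# Matches lines such as "LMON: terminating instance due to error 29702".
TERMINATION_RE = re.compile(r"^\s*(\w+): terminating instance due to error (\d+)")
# Matches lines such as "Errors in file d:\...\orcl1_lmon_3844.trc:".
# Assumes the trace path contains no spaces, as in the excerpt above.
TRACE_RE = re.compile(r"^\s*Errors in file (\S+\.trc):")


def summarize_alert_log(lines):
    """Return (terminations, trace_files) found in an iterable of alert-log lines.

    terminations counts (process, error_number) pairs; trace_files lists
    the trace file paths in the order they appear.
    """
    terminations = Counter()
    trace_files = []
    for line in lines:
        match = TERMINATION_RE.match(line)
        if match:
            terminations[(match.group(1), int(match.group(2)))] += 1
            continue
        match = TRACE_RE.match(line)
        if match:
            trace_files.append(match.group(1))
    return terminations, trace_files


# A few lines copied from the log above, used as sample input.
sample = [
    "Error: KGXGN aborts the instance (6)",
    "Thu May 03 17:22:51 2012",
    r"Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl1_lmon_3844.trc:",
    "ORA-29702: ???????????",
    "LMON: terminating instance due to error 29702",
]
terminations, trace_files = summarize_alert_log(sample)
print(terminations)
print(trace_files)
```

To scan a real file, pass an open file handle instead of the sample list, for example `summarize_alert_log(open(path, errors="replace"))`, where `path` is wherever the alert log lives on your system.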

          The CSSD log file, meanwhile, contains the following:

          [ CSSD]2012-04-29 16:26:07.953 [7112] >TRACE: clssgmReconfigThread: completed for reconfig(13), with status(1)
          2012-04-30 09:07:04.718: [ OCROSD]utgdv:11:could not read reg value ocrmirrorconfig_loc os error=
          The system could not find the environment option that was entered.

          2012-04-30 09:07:04.718: [ OCROSD]utgdv:11:could not read reg value ocrmirrorconfig_loc os error=The system could not find the environment option that was entered.

          [ CSSD]2012-04-30 09:07:04.765 >USER: Copyright 2012, Oracle version 10.2.0.4.0
          [ CSSD]2012-04-30 09:07:04.765 >USER: CSS daemon log for node crct-oadb, number 1, in cluster crs
          [ CSSD]2012-04-30 09:07:04.765 [3780] >TRACE: clssscmain: local-only set to false
          [ clsdmt]Listening to (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=61180))
          [ CSSD]2012-04-30 09:07:04.781 [3780] >TRACE: clssnmReadNodeInfo: added node 1 (crct-oadb) to cluster
          [ CSSD]2012-04-30 09:07:04.781 [3780] >TRACE: clssnmReadNodeInfo: added node 2 (crct-oapt) to cluster
          [ CSSD]2012-04-30 09:07:04.828 [3724] >TRACE: clssnm_skgxninit: Compatible vendor clusterware not in use
          [ CSSD]2012-04-30 09:07:04.828 [3724] >TRACE: clssnm_skgxnmon: skgxn init failed
          [ CSSD]2012-04-30 09:07:04.843 [3780] >TRACE: clssnmNMInitialize: misscount set to (60)
          [ CSSD]2012-04-30 09:07:04.843 [3780] >TRACE: clssnmNMInitialize: Network heartbeat thresholds are: impending reconfig 30000 ms, reconfig start (misscount) 60000 ms
          [ CSSD]2012-04-30 09:07:04.843 [3780] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0/\\.\votedsk1)
          [ CSSD]2012-04-30 09:07:04.843 [3112] >TRACE: clssnmvDPT: spawned for disk 0 (\\.\votedsk1)
          [ CSSD]2012-04-30 09:07:06.843 [3112] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0/\\.\votedsk1)
          [ CSSD]2012-04-30 09:07:06.843 [4492] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (\\.\votedsk1) initial sleep interval (1000)ms

          Searching on these messages shows that this is a known bug on 10.2.0.4: 10gR2/11gR1: Instances Abort With ORA-29702 When The Server is rebooted or shut down [ID 752399.1]. The bug affects versions 10.2.0.1 through 10.2.0.4, as well as 11.1.0.6 and 11.1.0.7.

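          The fix Oracle gives is to replace the K96 link in the operating system's rc (init) scripts with a K19 link. That is a Unix/Linux mechanism, however, and this system runs on Windows, so the fix clearly does not apply here. Short of upgrading to a later version, there is probably no good alternative.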
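
For readers unfamiliar with the K96/K19 detail: on SysV-style Unix systems, the K links in an rc directory are run in ascending order of their two-digit number, so renumbering a link from 96 to 19 moves that script much earlier in the sequence. The sketch below only illustrates that ordering rule. The link names `K96init.crs`, `K19init.crs`, and `K90network` are hypothetical examples chosen for illustration, not taken from the note cited above, and the idea that the goal is to stop the clusterware before the network goes down is an assumption.

```python
import re


def kill_order(link_names):
    """Sort SysV-style K links in the order rc would run them.

    Links are ordered by their two-digit number first, then by name.
    """
    def sort_key(name):
        match = re.match(r"K(\d+)(.*)", name)
        if match is None:
            raise ValueError(f"not a K link: {name!r}")
        return (int(match.group(1)), match.group(2))

    return sorted(link_names, key=sort_key)


# Hypothetical link names, for illustration only.
before_fix = kill_order(["K96init.crs", "K90network"])
after_fix = kill_order(["K19init.crs", "K90network"])

print(before_fix)  # clusterware link sorts after the network link
print(after_fix)   # clusterware link sorts before the network link
```

None of this applies to Windows, which has no rc directories, which is why the workaround is of no use for the system described here.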

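          Production systems deployed on Windows are rare, and RAC on Windows is rarer still. Most teams that deploy this way are not only unfamiliar with Oracle but also unclear about the stability differences between Windows and Linux, so the odds of running into all kinds of problems are naturally much higher.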

           


          posted on 2012-09-10 14:30 chen11-1 Views (2567) Comments (0)
