Crs-5802 Unable To Start The Agent Process | How To Fix (Error-1053) The Service Did Not Respond To The Start Or Control Request || Windows 10/8/7: Top 288 Best Answers

Are you looking for the topic “crs-5802 unable to start the agent process – How To Fix (Error-1053) The Service Did Not Respond To The Start Or Control Request || Windows 10/8/7”? The website Chewathai27.com/you answers all your questions in the following category: Chewathai27.com/you/blog. You can find the answer right below. The article by MK TECH has 110,874 views and 165613 likes.


See details on the topic How To Fix (Error-1053) The Service Did Not Respond To The Start Or Control Request || Windows 10/8/7 – crs-5802 unable to start the agent process here

Hi friends, welcome back to my channel, MK Tech.

In this tutorial: How To Fix (Error-1053) The Service Did Not Respond To The Start Or Control Request || Windows 10/8/7.

Follow my tutorial steps properly.

Thanks for watching.

See here for more on the topic crs-5802 unable to start the agent process.

SRVCTL: CRS-2678, CRS-0267, CRS-5802: Unable to start …

CRS-5802: Unable to start the agent process. But during that time we were able to startup database using sqlplus:.


Source: dba010.com

Date Published: 2/13/2022

View: 2761

How to solve CRS-5802: Unable to start the agent process

Today, We faced above issue where srvctl start database was failing on one node. It was failing with “CRS-5802: Unable to start the agent process” .


Source: www.oracledbworld.com

Date Published: 3/15/2021

View: 1994

While starting Oracle Database using srvctl, database startup …

CRS-5802: Unable to start the agent process. CRS-2680: Clean of ‘ora.prod.db’ on ‘node1’ failed CRS-5802: Unable to start the agent process.


Source: askmedawaa.wordpress.com

Date Published: 8/27/2022

View: 4397

CRS-5802: Unable to start the agent process – 墨天轮

CRS-5802: Unable to start the agent process `ora.crm.db` on member `crmdb1` has experienced an unrecoverable failure.


Source: www.modb.pro

Date Published: 4/1/2022

View: 5439

Starting Database Instance Using srvctl Fails With Errors …

CRS-0267: Human intervention required to resume its availability. CRS-5802: Unable to start the agent process oracle.ops.opsctl.StopAction.


Source: www.askmlabs.com

Date Published: 10/22/2022

View: 1861

Crs issue commands – SlideShare

Fails with errors. PRCR-1013 : Failed to start resources ora.racdb1. … srvctl start database’ Fails With ‘CRS-5802: Unable to start the agent process’ as …


Source: www.slideshare.net

Date Published: 8/3/2021

View: 4655

‘srvctl start database’ Fails With PRCR-1013 PRCR-1064 CRS …

.db’ on ‘dbadm01’ failed. CRS-5802: Unable to start the agent process. CRS_alert.log will show errors like the following:.


Source: 27.125.37.29

Date Published: 12/22/2022

View: 4909

How to Troubleshoot Grid Infrastructure Startup Issues

In a nutshell, the operating system starts ohasd, ohasd starts agents to start up … “CRS-5802: Unable to start the agent process” could show up as well.


Source: blog.itpub.net

Date Published: 8/19/2021

View: 3687


Article rating for the topic crs-5802 unable to start the agent process

  • Author: MK TECH
  • Views: 110,874
  • Likes: 165613
  • Date Published: 2019. 12. 9.
  • Video Url link: https://www.youtube.com/watch?v=3AsnGlyfbiw

SRVCTL: CRS-2678, CRS-0267, CRS-5802: Unable to start the agent process


September 6, 2018

We had the following problem with a customer:

srvctl start database -db dbname was failing on one of the cluster nodes with the following error:

[oracle@node1 ~]$ srvctl start database -db dbname

PRCR-1079 : Failed to start resource ora.dbname.db

CRS-2674: Start of ‘ora.dbname.db’ on ‘rac1’ failed

CRS-2678: ‘ora.dbname.db’ on ‘rac1’ has experienced an unrecoverable failure

CRS-0267: Human intervention required to resume its availability.

CRS-5802: Unable to start the agent process

But during that time we were able to start the database using sqlplus:

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> startup

ORACLE instance started.

Total System Global Area 1577058304 bytes

Fixed Size 8621136 bytes

Variable Size 805307312 bytes

Database Buffers 754974720 bytes

Redo Buffers 8155136 bytes

Database mounted.

Database opened.

It was strange, and it took me a long time to troubleshoot.

I tried many things:

* removed the srvctl configuration using srvctl remove database -db orcl

* re-added it using srvctl add database -db orcl

* re-added the instances

* restarted CRS and even the servers

but with no luck.

Then I found Doc ID 1957360.1 on the Oracle support site and managed to reproduce the same problem on my lab servers.

On one node of my test cluster, I changed the ownership of the following two files to root:

[root@rac1 ~]# ll /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc

-rw-r--r-- 1 oracle oinstall 1085 Sep 5 20:17 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc

[root@rac1 ~]# ll /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid

-rw-r--r-- 1 oracle oinstall 6 Sep 5 20:17 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid

[root@rac1 ~]# chown root:root /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid

[root@rac1 ~]# chown root:root /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc

I tried to start the instance using sqlplus and it was successful:

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> startup

ORACLE instance started

Database mounted.

Database opened.

I then stopped the database and tried again with srvctl. After a long wait it failed:

[oracle@rac1 ~]$ srvctl start database -db orcl

PRCR-1079 : Failed to start resource ora.orcl.db

CRS-2674: Start of ‘ora.orcl.db’ on ‘rac1’ failed

CRS-2678: ‘ora.orcl.db’ on ‘rac1’ has experienced an unrecoverable failure

CRS-0267: Human intervention required to resume its availability.

CRS-5802: Unable to start the agent process

I also checked the customer's logs and found that the files crsd_oraagent_oracle.pid and crsd_oraagent_oracleOUT.trc had not been updated for a long time; they were older than the other files.

So, to solve this problem, assign the correct owner, group and access permissions to the two files above; you will then be able to start the database using srvctl.

[root@rac1 ~]# chown oracle:oinstall /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid

[root@rac1 ~]# chown oracle:oinstall /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc

[root@rac1 ~]# chmod 644 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracle.pid

[root@rac1 ~]# chmod 644 /u01/app/grid/crsdata/rac1/output/crsd_oraagent_oracleOUT.trc

You may never hit this error, but if you do, you now know how to solve it.
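The ownership check described above can be scripted so it is easy to repeat on every node. This is only a sketch: the helper names file_owner_group and check_owner are made up for illustration, and the paths and the oracle:oinstall owner in the usage comment are the ones from the lab cluster in this post, so adjust them to your environment.

```shell
# Print "owner:group" for a path, using only ls and awk.
file_owner_group() {
    ls -ld "$1" | awk '{ print $3 ":" $4 }'
}

# Compare a file's owner:group with the expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
check_owner() {
    if [ ! -e "$1" ]; then
        echo "MISSING $1"
        return 2
    fi
    actual=$(file_owner_group "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK      $1 ($actual)"
    else
        echo "WRONG   $1 is $actual, expected $2"
        return 1
    fi
}

# Example (paths and owner taken from the lab cluster above):
# out=/u01/app/grid/crsdata/rac1/output
# check_owner "$out/crsd_oraagent_oracle.pid"    oracle:oinstall
# check_owner "$out/crsd_oraagent_oracleOUT.trc" oracle:oinstall
```

The script only reports; it does not run chown or chmod, so it is safe to run before deciding on a fix.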

How to solve CRS-5802: Unable to start the agent process

Today we faced the above issue: srvctl start database was failing on one node with “CRS-5802: Unable to start the agent process”. There was no clue in the database alert log about what was causing the problem.

The issue was reported on Sun Solaris with 19.11 binaries.

oracle@oracle_test:~$ srvctl start database -d oracledb
PRCR-1079 : Failed to start resource ora.oracledb.db
CRS-2674: Start of ‘ora.oracledb.db’ on ‘oracle_test2’ failed
CRS-2678: ‘ora.oracledb.db’ on ‘oracle_test2’ has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5802: Unable to start the agent process
oracle@oracle_test:~$

If you check the database’s status in the cluster, it shows STARTING, but it never started for me.

oracle@oracle_test2:~$ /GRIDHOME/oracle/app/product/grid/19.3.0/bin/crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       oracle_test              STABLE
               ONLINE  ONLINE       oracle_test2             STABLE
ora.helper
               OFFLINE OFFLINE      oracle_test              STABLE
               OFFLINE OFFLINE      oracle_test2             IDLE,STABLE
ora.net1.network
               ONLINE  ONLINE       oracle_test              STABLE
               ONLINE  ONLINE       oracle_test2             STABLE
ora.ons
               ONLINE  ONLINE       oracle_test              STABLE
               ONLINE  ONLINE       oracle_test2             STABLE
ora.proxy_advm
               OFFLINE OFFLINE      oracle_test              STABLE
               OFFLINE OFFLINE      oracle_test2             STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ARCH.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       oracle_test2             STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.MGMTDB.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       oracle_test              169.254.25.89 10.254.64.168,STABLE
ora.OCR_VOTE.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.REDO.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             Started,STABLE
      2        ONLINE  ONLINE       oracle_test              Started,STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       oracle_test2             STABLE
      2        ONLINE  ONLINE       oracle_test              STABLE
ora.cvu
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.oracle_test.vip
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.oracle_test2.vip
      1        ONLINE  ONLINE       oracle_test2             STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       oracle_test              Open,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.rhpserver
      1        OFFLINE OFFLINE                               STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       oracle_test2             STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       oracle_test              STABLE
ora.oracledb.db
      1        ONLINE  ONLINE       oracle_test              Open,Readonly,HOME=/DBHOME/oracle/app/product/19.3.0/dbhome_1,STABLE
      2        ONLINE  OFFLINE      oracle_test2             STARTING
--------------------------------------------------------------------------------
oracle@oracle_test2:~$
-bash-5.1#

After searching on MOS, I came across ‘srvctl start database’ Fails With ‘CRS-5802: Unable to start the agent process’ as the Agent Log is Owned by Wrong User (Doc ID 1957360.1), which suggests checking the ownership of the following files:

$(orabase)/crsdata//output/crsd_oraagent_OUT.trc
$(orabase)/crsdata//output/crsd_oraagent_.pid

On node 2, the ownership of these files was different from that on node 1.

-bash-5.1# ls -ltr
total 1295
-rw-r--r-- 1 grid oinstall 6 Apr 5 12:58 crsd_oraagent_oracle.pid
-rw-r--r-- 1 grid oinstall 1872 Apr 9 19:49 crsd_oraagent_oracleOUT.trc
-rw-r--r-- 1 grid oinstall 2637 Apr 12 14:09 crsd_scriptagent_gridOUT.trc
-rw-r--r-- 1 grid oinstall 2147 Apr 12 14:09 crsd_jagent_gridOUT.trc
-rw-r--r-- 1 grid oinstall 6 Apr 12 14:09 crsd_scriptagent_grid.pid
-rw-r--r-- 1 grid oinstall 6 Apr 12 14:09 crsd_jagent_grid.pid
-rw-r--r-- 1 grid oinstall 1695 Apr 12 14:13 ologgerdOUT.trc
-rw-r--r-- 1 grid oinstall 6 Apr 12 14:13 ologgerd.pid
-rw-r--r-- 1 grid oinstall 335 Apr 12 17:29 crswrapexece.log
-rw-r--r-- 1 grid oinstall 3758 Apr 12 17:29 ohasdOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 ohasd.pid
-rw-r--r-- 1 grid oinstall 4629 Apr 12 17:29 ohasd_orarootagent_rootOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 ohasd_orarootagent_root.pid
-rw-r--r-- 1 grid oinstall 7779 Apr 12 17:29 ohasd_oraagent_gridOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 ohasd_oraagent_grid.pid
-rw-r--r-- 1 grid oinstall 184295 Apr 12 17:29 mdnsdOUT.trc
-rw-r--r-- 1 grid oinstall 3741 Apr 12 17:29 evmdOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 mdnsd.pid
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 evmd.pid
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 gpnpd.pid
-rw-r--r-- 1 grid oinstall 4476 Apr 12 17:29 gpnpdOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 evmlogger.pid
-rw-r--r-- 1 grid oinstall 5310 Apr 12 17:29 evmloggerOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 gipcd.pid
-rw-r--r-- 1 grid oinstall 5005 Apr 12 17:29 gipcdOUT.trc
-rw-r--r-- 1 grid oinstall 14034 Apr 12 17:29 ohasd_cssdmonitor_rootOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 ohasd_cssdmonitor_root.pid
-rw-r--r-- 1 grid oinstall 13974 Apr 12 17:29 ohasd_cssdagent_rootOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 osysmond.pid
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 ohasd_cssdagent_root.pid
-rw-r--r-- 1 grid oinstall 10858 Apr 12 17:29 osysmondOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:29 ocssd.pid
-rw-r--r-- 1 grid oinstall 233292 Apr 12 17:29 ocssdOUT.trc
-rw-r--r-- 1 grid oinstall 3357 Apr 12 17:30 octssdOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:30 octssd.pid
-rw-r--r-- 1 grid oinstall 2909 Apr 12 17:30 crsdOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:30 crsd.pid
-rw-r--r-- 1 grid oinstall 3533 Apr 12 17:30 crsd_orarootagent_rootOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:30 crsd_orarootagent_root.pid
-rw-r--r-- 1 grid oinstall 5997 Apr 12 17:33 crsd_oraagent_gridOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:33 crsd_oraagent_grid.pid
-bash-5.1# ps -ef | grep pmon
grid 6650 1 0 17:34:19 ? 0:02 asm_pmon_+ASM2
root 41393 40748 0 09:34:57 pts/5 0:00 grep pmon

As a result, we changed the ownership of the following two files on the problematic node.

-bash-5.1# ls -ltr crsd_oraagent_oracleOUT.trc crsd_oraagent_oracle.pid
-rw-r--r-- 1 grid oinstall 6 Apr 5 12:58 crsd_oraagent_oracle.pid
-rw-r--r-- 1 grid oinstall 1872 Apr 9 19:49 crsd_oraagent_oracleOUT.trc
-bash-5.1# chown oracle:oinstall crsd_oraagent_oracleOUT.trc crsd_oraagent_oracle.pid

-bash-5.1# ls -ltr crsd_oraagent*
-rw-r--r-- 1 oracle oinstall 6 Apr 5 12:58 crsd_oraagent_oracle.pid
-rw-r--r-- 1 oracle oinstall 1872 Apr 9 19:49 crsd_oraagent_oracleOUT.trc
-rw-r--r-- 1 grid oinstall 5997 Apr 12 17:33 crsd_oraagent_gridOUT.trc
-rw-r--r-- 1 grid oinstall 5 Apr 12 17:33 crsd_oraagent_grid.pid
-bash-5.1#

After the change, the database instance came up on node 2 without any issues.

srvctl start instance -d oracledb -i oracledb1

While starting Oracle Database using srvctl, database startup fails with errors PRCR-1079, CRS-2680, CRS-5802

While starting Oracle Database using srvctl, database startup fails with the error below.

[oracle@node1 ~]$ srvctl start database -d prod

PRCR-1079 : Failed to start resource ora.prod.db

CRS-2680: Clean of ‘ora.prod.db’ on ‘node2’ failed

CRS-5802: Unable to start the agent process

CRS-2680: Clean of ‘ora.prod.db’ on ‘node1’ failed

CRS-5802: Unable to start the agent process

The issue was caused by incorrect file ownership:

cd /u101/app/grid/crsdata/node1/output

[root@node1 output]# ls -lthr crsd_oraagent*

-rw-r--r--. 1 oragrid oinstall 1.2K Feb 16 01:54 crsd_oraagent_oracleOUT.trc

-rw-r--r--. 1 oragrid oinstall 5 Feb 16 01:54 crsd_oraagent_oracle.pid

-rw-r--r--. 1 oragrid oinstall 4.1K Feb 17 01:27 crsd_oraagent_oragridOUT.trc

-rw-r--r--. 1 oragrid oinstall 6 Feb 17 01:27 crsd_oraagent_oragrid.pid

[root@node1 output]#

[root@node1 output]#

[root@node1 output]# chown oracle:oinstall crsd_oraagent_oracleOUT.trc crsd_oraagent_oracle.pid

[root@node1 output]#

[root@node1 output]# ls -lthr crsd_oraagent*

-rw-r--r--. 1 oracle oinstall 1.2K Feb 16 01:54 crsd_oraagent_oracleOUT.trc

-rw-r--r--. 1 oracle oinstall 5 Feb 16 01:54 crsd_oraagent_oracle.pid

-rw-r--r--. 1 oragrid oinstall 4.1K Feb 17 01:27 crsd_oraagent_oragridOUT.trc

-rw-r--r--. 1 oragrid oinstall 6 Feb 17 01:27 crsd_oraagent_oragrid.pid

Note :-

The agent process for the database runs with the OS credentials of the database user; when it starts, it needs to update its PID file.

Make the change on both nodes and then start your database.

Happy Learning !!!!
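In the listings above, the user name embedded in the agent file name (oracle, oragrid) appears to match the OS user that should own the file. That is an inference from these examples, not a documented rule, and the helper name agent_user_from_name below is made up for illustration. Under that assumption, the expected owner can be derived from the file name:

```shell
# Derive the presumed owning OS user from a crsd oraagent file name,
# e.g. crsd_oraagent_oracle.pid -> oracle.
# ASSUMPTION: the name suffix is the owning user, as in the listings above.
# Returns 1 for file names that do not follow the crsd_oraagent_ pattern.
agent_user_from_name() {
    name=${1##*/}                      # strip any directory part
    case $name in
        crsd_oraagent_*OUT.trc)
            name=${name#crsd_oraagent_}
            echo "${name%OUT.trc}"
            ;;
        crsd_oraagent_*.pid)
            name=${name#crsd_oraagent_}
            echo "${name%.pid}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Example: compare the derived user with column 3 of ls -l for each file.
# for f in crsd_oraagent_*; do
#     echo "$f: expected owner $(agent_user_from_name "$f")"
# done
```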

CRS-5802: Unable to start the agent process

The racdb resource status shows offline, and starting the database with crs_start fails with the following errors:

ora.crm.db
1 OFFLINE OFFLINE crmdb1
2 OFFLINE OFFLINE crmdb2
root@crmdb1[/oracle/grid/app/product/11.2.0/gi/bin] # ./crs_start ora.crm.db
Attempting to start `ora.crm.db` on member `crmdb2`
Attempting to start `ora.crm.db` on member `crmdb1`
Start of `ora.crm.db` on member `crmdb2` failed.
Attempting to stop `ora.crm.db` on member `crmdb2`
Start of `ora.crm.db` on member `crmdb1` failed.
Attempting to stop `ora.crm.db` on member `crmdb1`
`ora.crm.db` on member `crmdb2` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
CRS-5802: Unable to start the agent process
`ora.crm.db` on member `crmdb1` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
CRS-5802: Unable to start the agent process
CRS-0215: Could not start resource ‘ora.crm.db 1 1’.
CRS-0215: Could not start resource ‘ora.crm.db 2 1’.
ora.hxcrm.db
1 ONLINE UNKNOWN crmdb1
2 ONLINE UNKNOWN crmdb2

Starting Database Instance Using srvctl Fails With Errors PRCR-1013 CRS-2674 CRS-2678 CRS-5802

Recently we had an issue with one of the Exadata compute nodes where the database instances are not controlled by srvctl. When we use srvct…

Crs issue commands


‘srvctl start database’ Fails With PRCR-1013 PRCR-1064 CRS-2680 CRS-5802

Asset ID: 1-72-2309987.1 Update Date: 2017-09-29 Keywords:

Solution Type

Solution 2309987.1

Related Items Exadata X3-2 Hardware Related Categories PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST

Applies to:

Symptoms

Created from Exadata X3-2 Hardware – Version All Versions to All Versions [Release All Releases]

Information in this document applies to any platform.

SRVCTL cannot STOP or START database on one node after upgrading db from 11.2.0.4 to 12c.

SQLPLUS can start the database.

Status shows not running:

$ srvctl status database -d

Instance is not running on node node01

Instance is running on node node02

Instance is running on node node03

Attempts to start result in errors similar to the following:

$ srvctl start instance -d ints -i INTS1

PRCR-1013 : Failed to start resource ora..db

PRCR-1064 : Failed to start resource ora..db on node dbadm01

CRS-2680: Clean of ‘ora..db’ on ‘dbadm01’ failed

CRS-5802: Unable to start the agent process

CRS_alert.log will show errors like the following:

2017-09-13 11:09:05.223 [CRSD(280048)]CRS-5828: Could not start agent ‘/u01/app/12.1.0.2/grid/bin/oraagent’. Details at (:CRSAGF00126:) {1:19801:39244} in /u01/app/oracle/diag/crs/mydbadm01/crs/trace/crsd.trc.

2017-09-13 11:09:20.322 [CRSD(280048)]CRS-5828: Could not start agent ‘/u01/app/12.1.0.2/grid/bin/oraagent’. Details at (:CRSAGF00123:) {1:19801:39244} in /u01/app/oracle/diag/crs/mydbadm01/crs/trace/crsd.trc.

2017-09-13 11:09:20.322 [CRSD(280048)]CRS-5828: Could not start agent ‘/u01/app/12.1.0.2/grid/bin/oraagent’. Details at (:CRSAGF00126:) {1:19801:39244} in /u01/app/oracle/diag/crs/mydbadm01/crs/trace/crsd.trc.

2017-09-13 11:09:20.324 [CRSD(280048)]CRS-2758: Resource ‘ora.hcmprl.db’ is in an unknown state.

crsd.trc will show the following errors:

2017-09-13 11:09:20.322683 : AGFW:484337408: {1:19801:39244} Agfw Proxy Server received the message: RESOURCE_MODIFY_ATTR[ora.hcmprl.db 3 1] ID 4355:22953678

2017-09-13 11:09:20.322715 : AGFW:484337408: {1:19801:39244} Agfw Proxy Server rejecting message RESOURCE_MODIFY_ATTR[ora.hcmprl.db 3 1] ID 4355:22953678

2017-09-13 11:09:20.322722 : AGFW:484337408: {1:19801:39244} X_AGFW_RejectMsg : Could not find the resource: ora.hcmprl.db 3 1 (File: clsAgfwSrvResource.cpp, line: 1574

2017-09-13 11:09:20.323143 : CRSPE:473831168: {1:19801:39244} CRS-2678: ‘ora.hcmprl.db’ on ‘mydbadm01’ has experienced an unrecoverable failure

2017-09-13 11:09:20.323577 : CRSPE:473831168: {1:19801:39244} CRS-0267: Human intervention required to resume its availability.

2017-09-13 11:09:20.323607 :UiServer:4026529536: {1:19801:39244} Container [ Name: ORDER

MESSAGE:

TextMessage[CRS-2678: ‘ora.hcmprl.db’ on ‘mydbadm01’ has experienced an unrecoverable failure]

MSGTYPE:

TextMessage[1]

OBJID:

TextMessage[ora.hcmprl.db 1 1]

WAIT:

TextMessage[0]

]

2017-09-13 11:09:20.324109 : CRSD:473831168: {1:19801:39244} {1:19801:39244} Resourceora.hcmprl.db has failed into unknown state!

2017-09-13 11:09:20.324125 :UiServer:4026529536: {1:19801:39244} Container [ Name: ORDER

MESSAGE:

TextMessage[CRS-0267: Human intervention required to resume its availability.]

MSGTYPE:

TextMessage[1]

OBJID:

TextMessage[ora.hcmprl.db 1 1]

WAIT:

TextMessage[0]

]

2017-09-13 11:09:20.324135 : CRSPE:473831168: {1:19801:39244} Sequencer for [ora.hcmprl.db 1 1] has completed with error: CRS-5802: Unable to start the agent process

2017-09-13 11:09:20.324224 : CRSPE:473831168: {1:19801:39244} Starting resource state restoration for: START of [ora.hcmprl.db 1 1] on [mydbadm01] : Op:0x7f1cac532a80, Cmd:0x7f1cac398750, SeqId:16940 restart: , state change= 0, local restart=0

The status of the database resource will show one or more instances in the UNKNOWN state.

ora..db

1 ONLINE UNKNOWN dbadm01 STABLE

2 ONLINE ONLINE dbadm02 Open,STABLE

3 ONLINE ONLINE dbadm03 Open,STABLE

Changes

Customer upgraded database from 11.2.0.4 to 12c.

Cause

The database could be started using SQL*Plus, but not with the srvctl command. The issue was caused by incorrect file ownership after the upgrade. The sticky bit on the agent folder was missing.

Solution

Solution 1:

./srvctl stop instance -d racdb -i racdb1
./srvctl remove instance -d racdb -i racdb1
./srvctl add instance -d racdb -i racdb1 -n racnd01
./srvctl enable instance -d racdb -i racdb1
./srvctl start instance -d racdb -i racdb1

Solution 2:

Setting the sticky bit fixed the problem:

$ chmod 1777 /407/apps/u01/oragi/11.2.0/grid/log/rac-node1/agent/crsd/oraagent_oracle

$ ls -ld /407/apps/u01/oragi/11.2.0/grid/log/rac-node1/agent/crsd/oraagent_oracle

drwxrwxrwt. 2 oracle oinstall 4096 Aug 22 10:52 /407/apps/u01/oragi/11.2.0/grid/log/rac-node1/agent/crsd/oraagent_oracle

If this fails to resolve the issue, please collect a TFA covering a two hour period around the problem (see Doc ID 1513912.1) then open a Service Request with Oracle Support.
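Before running chmod as in Solution 2, it can help to confirm whether the sticky bit is actually missing. The following is a minimal sketch that reads the mode string from ls -ld; the function name has_sticky_bit is made up for illustration, and the directory in the usage comment is the one from the example above, which will differ per environment.

```shell
# Return 0 if the sticky bit is set on the given path, 1 otherwise.
# In the ls -l mode string (e.g. drwxrwxrwt), the 10th character is the
# "others execute" slot; t or T there means the sticky bit is set.
has_sticky_bit() {
    mode=$(ls -ld "$1" | awk '{ print $1 }')
    case $(printf '%s' "$mode" | cut -c10) in
        t|T) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example (directory taken from Solution 2 above):
# dir=/407/apps/u01/oragi/11.2.0/grid/log/rac-node1/agent/crsd/oraagent_oracle
# if has_sticky_bit "$dir"; then echo "sticky bit set"; else echo "sticky bit missing"; fi
```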

Attachments

This solution has no attachment

How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]

Modified 14-JUL-2011 Type BULLETIN Status PUBLISHED

Applies to:

Oracle Server – Enterprise Edition – Version: 11.2.0.1 and later [Release: 11.2 and later]

Information in this document applies to any platform.

Purpose

This note is to provide reference to troubleshoot 11gR2 Grid Infrastructure clusterware startup issues. It applies to issues in both new environments (during root.sh or rootupgrade.sh) and unhealthy existing environments. To look specifically at root.sh issues, see for more information.

Scope and Application

This document is intended for Clusterware/RAC Database Administrators and Oracle support engineers.

How to Troubleshoot Grid Infrastructure Startup Issues

Start up sequence:

In a nutshell, the operating system starts ohasd, ohasd starts agents to start up daemons (gipcd, mdnsd, gpnpd, ctssd, ocssd, crsd, evmd, asm etc.), and crsd starts agents that start user resources (database, SCAN, listener etc.).

For detailed Grid Infrastructure clusterware startup sequence, please refer to

Cluster status

To find out cluster and daemon status:

$GRID_HOME/bin/crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4537: Cluster Ready Services is online

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online
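The crsctl check crs output above can be screened automatically. This is a rough sketch: it assumes a healthy stack prints only CRS- lines ending in "is online", as in the sample, and flags every other CRS- line. The function name offline_services is made up for illustration, and the wording of failure messages is not taken from this note.

```shell
# Read `crsctl check crs` output on stdin.
# Print every CRS- line that does not end in "is online".
# Exit status: 0 if all CRS- lines report online, 1 otherwise.
offline_services() {
    awk '
        /^CRS-[0-9]+:/ {
            if ($0 !~ /is online[ \t\r]*$/) { print; bad = 1 }
        }
        END { exit bad }
    '
}

# Example:
# $GRID_HOME/bin/crsctl check crs | offline_services || echo "some services are not online"
```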

$GRID_HOME/bin/crsctl stat res -t -init

——————————————————————————–

NAME TARGET STATE SERVER STATE_DETAILS

——————————————————————————–

Cluster Resources

——————————————————————————–

ora.asm

1 ONLINE ONLINE rac1 Started

ora.crsd

1 ONLINE ONLINE rac1

ora.cssd

1 ONLINE ONLINE rac1

ora.cssdmonitor

1 ONLINE ONLINE rac1

ora.ctssd

1 ONLINE ONLINE rac1 OBSERVER

ora.diskmon

1 ONLINE ONLINE rac1

ora.drivers.acfs

1 ONLINE ONLINE rac1

ora.evmd

1 ONLINE ONLINE rac1

ora.gipcd

1 ONLINE ONLINE rac1

ora.gpnpd

1 ONLINE ONLINE rac1

ora.mdnsd

1 ONLINE ONLINE rac1

For 11.2.0.2 and above, there will be two more processes:

ora.cluster_interconnect.haip

1 ONLINE ONLINE rac1

ora.crf

1 ONLINE ONLINE rac1

$GRID_HOME/bin/crsctl start res ora.crsd -init

Case 1: OHASD does not start

cat /etc/inittab|grep init.ohasd

h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1

The above example shows CRS is supposed to run at run levels 3 and 5; note that, depending on the platform, CRS comes up at different run levels.

To find out current run level:

who -r
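The run levels in an inittab entry sit in the second colon-separated field, so the check above can be sketched as follows. The function names are made up for illustration, and the current run level still has to be read from who -r by hand, since its output format varies by platform.

```shell
# Print the run-level field (second colon-separated field) of an inittab entry.
inittab_runlevels() {
    printf '%s\n' "$1" | cut -d: -f2
}

# Return 0 if the entry is active at the given single-digit run level.
# Pass a non-empty level; an empty level would match any entry.
runs_at_level() {
    case $(inittab_runlevels "$1") in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example:
# entry=$(grep init.ohasd /etc/inittab)
# runs_at_level "$entry" 3 && echo "init.ohasd is configured for run level 3"
```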

2. “init.ohasd run” is up

On Linux/UNIX, as “init.ohasd run” is configured in /etc/inittab, process init (pid 1, /sbin/init on Linux, Solaris and hp-ux, /usr/sbin/init on AIX) will start and respawn “init.ohasd run” if it fails. Without “init.ohasd run” up and running, ohasd.bin will not start:

ps -ef|grep init.ohasd|grep -v grep

root 2279 1 0 18:14 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run

If any rc Snncommand script (located in rcn.d, for example S98gcstartup) is stuck, the init process may not start “/etc/init.d/init.ohasd run”; please engage the OS vendor to find out why the relevant Snncommand script is stuck.

3. Clusterware auto start is enabled (it is enabled by default)

By default CRS is enabled for auto start upon node reboot. To enable it:

$GRID_HOME/bin/crsctl enable crs

To verify whether it is currently enabled:

cat $SCRBASE/$HOSTNAME/root/ohasdstr

enable

SCRBASE is /etc/oracle/scls_scr on Linux and AIX, /var/opt/oracle/scls_scr on hp-ux and Solaris

Note: NEVER EDIT THE FILE MANUALLY, use “crsctl enable/disable crs” command instead.
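A read-only check of the autostart file could look like the sketch below. It only reads the file, in line with the note above. The function name autostart_state is made up for illustration, and the assumption that a disabled stack stores a value starting with "disable" is a guess based on the crsctl enable/disable crs command names, not something shown in this note.

```shell
# Report the autostart state stored in $SCRBASE/$HOSTNAME/root/ohasdstr.
# Usage: autostart_state <scrbase> <hostname>
# Prints "enabled", "disabled" or "unknown: <content>"; never writes to the file.
# ASSUMPTION: a disabled stack stores a value starting with "disable".
autostart_state() {
    file="$1/$2/root/ohasdstr"
    if [ ! -r "$file" ]; then
        echo "unreadable: $file" >&2
        return 2
    fi
    state=$(cat "$file")
    case $state in
        enable*)  echo enabled ;;
        disable*) echo disabled ;;
        *)        echo "unknown: $state"; return 1 ;;
    esac
}

# Example on Linux or AIX (SCRBASE as given above):
# autostart_state /etc/oracle/scls_scr "$(hostname)"
```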

4. syslogd is up and OS is able to execute init script S96ohasd

The OS may get stuck on some other Snn script while the node is coming up, and thus never get a chance to execute S96ohasd; if that’s the case, the following message will not be in the OS messages:

Jan 20 20:46:51 rac1 logger: Oracle HA daemon is enabled for autostart.

If you don’t see the above message, the other possibility is that syslogd (/usr/sbin/syslogd) is not fully up. Grid may fail to come up in that case as well. This may not apply to AIX.

To find out whether the OS is able to execute S96ohasd while the node is coming up, modify ohasd:

From:

case `$CAT $AUTOSTARTFILE` in

enable*)

$LOGERR “Oracle HA daemon is enabled for autostart.”

To:

case `$CAT $AUTOSTARTFILE` in

enable*)

/bin/touch /tmp/ohasd.start.”`date`”

$LOGERR “Oracle HA daemon is enabled for autostart.”

After a node reboot, if you don’t see /tmp/ohasd.start.timestamp created, it means the OS is stuck on some other Snn script. If you do see /tmp/ohasd.start.timestamp but not “Oracle HA daemon is enabled for autostart” in messages, syslogd is likely not fully up. In both cases, you will need to engage the System Administrator to find the issue at the OS level. For the latter case, the workaround is to “sleep” for about 2 minutes; modify ohasd:

From:

case `$CAT $AUTOSTARTFILE` in

enable*)

$LOGERR “Oracle HA daemon is enabled for autostart.”

To:

case `$CAT $AUTOSTARTFILE` in

enable*)

/bin/sleep 120

$LOGERR “Oracle HA daemon is enabled for autostart.”

5. The file system on which GRID_HOME resides is online when init script S96ohasd is executed; once S96ohasd is executed, the following messages should be in the OS messages file:

Jan 20 20:46:51 rac1 logger: Oracle HA daemon is enabled for autostart.

..

Jan 20 20:46:57 rac1 logger: exec /ocw/grid/perl/bin/perl -I/ocw/grid/perl/lib /ocw/grid/bin/crswrapexece.pl /ocw/grid/crs/install/s_crsconfig_rac1_env.txt /ocw/grid/bin/ohasd.bin “reboot”

If you see the first line but not the last line, the filesystem containing GRID_HOME was likely not online when S96ohasd was executed.
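The reasoning in checks 4 and 5 can be condensed into a small classifier over the OS messages file. This is a sketch only: it matches the two message texts shown above, the function name ohasd_boot_stage is made up for illustration, and the location of the messages file (for example /var/log/messages) is an assumption that differs by platform.

```shell
# Read OS messages on stdin and report how far the S96ohasd startup got,
# based on the two log lines shown above.
ohasd_boot_stage() {
    awk '
        /Oracle HA daemon is enabled for autostart/ { enabled = 1 }
        /exec .*ohasd\.bin/                          { execd = 1 }
        END {
            if (execd)
                print "ohasd.bin exec logged"
            else if (enabled)
                print "autostart logged, no exec: GRID_HOME filesystem may not have been online"
            else
                print "no autostart message: S96ohasd may not have run, or syslogd was not up"
        }
    '
}

# Example (messages file path is an assumption):
# grep logger /var/log/messages | ohasd_boot_stage
```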

6. Oracle Local Registry (OLR, $GRID_HOME/cdata/${HOSTNAME}.olr) is accessible and valid

ls -l $GRID_HOME/cdata/*.olr

-rw——- 1 root oinstall 272756736 Feb 2 18:20 rac1.olr

If the OLR is inaccessible or corrupted, ohasd.log will likely have messages similar to the following:

..

2010-01-24 22:59:10.470: [ default][1373676464] Initializing OLR

2010-01-24 22:59:10.472: [ OCROSD][1373676464]utopen:6m’:failed in stat OCR file/disk /ocw/grid/cdata/rac1.olr, errno=2, os err string=No such file or directory

2010-01-24 22:59:10.472: [ OCROSD][1373676464]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory

2010-01-24 22:59:10.473: [ OCRRAW][1373676464]proprinit: Could not open raw device

2010-01-24 22:59:10.473: [ OCRAPI][1373676464]a_init:16!: Backend init unsuccessful : [26]

2010-01-24 22:59:10.473: [ CRSOCR][1373676464] OCR context init failure. Error: PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]

2010-01-24 22:59:10.473: [ default][1373676464] OLR initalization failured, rc=26

2010-01-24 22:59:10.474: [ default][1373676464]Created alert : (:OHAS00106:) : Failed to initialize Oracle Local Registry

2010-01-24 22:59:10.474: [ default][1373676464][PANIC] OHASD exiting; Could not init OLR

OR

..

2010-01-24 23:01:46.275: [ OCROSD][1228334000]utread:3: Problem reading buffer 1907f000 buflen 4096 retval 0 phy_offset 102400 retry 5

2010-01-24 23:01:46.275: [ OCRRAW][1228334000]propriogid:1_1: Failed to read the whole bootblock. Assumes invalid format.

2010-01-24 23:01:46.275: [ OCRRAW][1228334000]proprioini: all disks are not OCR/OLR formatted

2010-01-24 23:01:46.275: [ OCRRAW][1228334000]proprinit: Could not open raw device

2010-01-24 23:01:46.275: [ OCRAPI][1228334000]a_init:16!: Backend init unsuccessful : [26]

2010-01-24 23:01:46.276: [ CRSOCR][1228334000] OCR context init failure. Error: PROCL-26: Error while accessing the physical storage

2010-01-24 23:01:46.276: [ default][1228334000] OLR initalization failured, rc=26

2010-01-24 23:01:46.276: [ default][1228334000]Created alert : (:OHAS00106:) : Failed to initialize Oracle Local Registry

2010-01-24 23:01:46.277: [ default][1228334000][PANIC] OHASD exiting; Could not init OLR

OR

..

2010-11-07 03:00:08.932: [ default][1] Created alert : (:OHAS00102:) : OHASD is not running as privileged user

2010-11-07 03:00:08.932: [ default][1][PANIC] OHASD exiting: must be run as privileged user

OR

ohasd.bin comes up, but the output of “crsctl stat res -t -init” shows no resources, and “ocrconfig -local -manualbackup” fails

OR

..

2010-08-04 13:13:11.102: [ CRSPE][35] Resources parsed

2010-08-04 13:13:11.103: [ CRSPE][35] Server [] has been registered with the PE data model

2010-08-04 13:13:11.103: [ CRSPE][35] STARTUPCMD_REQ = false:

2010-08-04 13:13:11.103: [ CRSPE][35] Server [] has changed state from [Invalid/unitialized] to [VISIBLE]

2010-08-04 13:13:11.103: [ CRSOCR][31] Multi Write Batch processing…

2010-08-04 13:13:11.103: [ default][35] Dump State Starting …

..

2010-08-04 13:13:11.112: [ CRSPE][35] SERVERS:

:VISIBLE:address{{Absolute|Node:0|Process:-1|Type:1}}; recovered state:VISIBLE. Assigned to no pool

————- SERVER POOLS:

Free [min:0][max:-1][importance:0] NO SERVERS ASSIGNED

2010-08-04 13:13:11.113: [ CRSPE][35] Dumping ICE contents…:ICE operation count: 0

2010-08-04 13:13:11.113: [ default][35] Dump State Done.

The solution is to restore a good backup of OLR with “ocrconfig -local -restore “.

By default, OLR will be backed up to $GRID_HOME/cdata/$HOST/backup_$TIME_STAMP.olr once installation is complete.

7. ohasd.bin is able to access network socket files:

2010-06-29 10:31:01.570: [ COMMCRS][1206901056]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROL))

2010-06-29 10:31:01.571: [ OCRSRV][1217390912]th_listen: CLSCLISTEN failed clsc_ret= 3, addr= [(ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROL))]

2010-06-29 10:31:01.571: [ OCRSRV][3267002960]th_init: Local listener did not reach valid state

In a Grid Infrastructure cluster environment, ohasd-related socket files should be owned by root; in an Oracle Restart environment, they should be owned by the grid user. Refer to the "Network Socket File Location, Ownership and Permission" section for example output.

8. ohasd.bin is able to access log file location:

CRS-4124: Oracle High Availability Services startup failed.

CRS-4000: Command Start failed, or completed with errors.

OS messages/syslog shows:

Feb 20 10:47:08 racnode1 OHASD[9566]: OHASD exiting; Directory /ocw/grid/log/racnode1/ohasd not found.

Refer to the "Log File Location, Ownership and Permission" section for example output. If the expected directory is missing, create it with proper ownership and permissions.
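The directory check can be scripted. The sketch below is illustrative only: check_log_dir is a hypothetical name, and the path in the usage comment is the one from the syslog message above.

```shell
# Hypothetical helper: report whether a daemon log directory exists.
check_log_dir() {
    dir=$1
    if [ -d "$dir" ]; then
        # Show owner, group and mode so they can be compared with a working node
        ls -ld "$dir"
        return 0
    fi
    echo "missing: $dir" >&2
    return 1
}

# Example usage:
#   check_log_dir /ocw/grid/log/racnode1/ohasd
```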

9. On SUSE Linux, ohasd may fail to start after a node reboot; refer to the note "OHASD not Starting After Reboot on SLES".

10. OHASD fails to start, “ps -ef| grep ohasd.bin” shows ohasd.bin is started, but nothing in $GRID_HOME/log/ /ohasd/ohasd.log for many minutes, truss shows it is looping to close non-opened file handles:

..

15058/1: 0.1995 close(2147483646) Err#9 EBADF

15058/1: 0.1996 close(2147483645) Err#9 EBADF

..

Call stack of ohasd.bin from pstack shows the following:

_close sclssutl_closefiledescriptors main ..

The cause is a bug which is fixed in 11.2.0.3 and above. Another symptom of the bug is that clusterware processes may fail to start with the same call stack and truss output (looping on the OS call "close"). If the bug happens when trying to start other resources, "CRS-5802: Unable to start the agent process" could show up as well.
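If the truss output has been saved to a file, the looping pattern can be confirmed by counting the failed close calls. This is a minimal sketch under that assumption; count_ebadf_close is a hypothetical name and the pattern is based only on the two sample truss lines above.

```shell
# Hypothetical helper: count close() calls that failed with EBADF in saved truss output.
count_ebadf_close() {
    # grep -c prints the number of matching lines (0 when nothing matches)
    grep -c 'close([0-9]*).*EBADF' "$1"
}

# Example usage, assuming the truss output was saved to /tmp/ohasd.truss:
#   count_ebadf_close /tmp/ohasd.truss
```

A very large count over a short capture is consistent with the looping behaviour described above; a small count is not conclusive either way.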

11. Other potential causes and solutions are listed in the note "OHASD Failed to Start: Inappropriate ioctl for device".

Case 2: OHASD Agents do not start

OHASD.BIN will spawn four agents/monitors to start resources:

oraagent: responsible for ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd etc

orarootagent: responsible for ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs etc

cssdagent / cssdmonitor: responsible for ora.cssd(for ocssd.bin) and ora.cssdmonitor(for cssdmonitor itself)

If ohasd.bin cannot start any of the above agents properly, clusterware will not come to a healthy state.

1. A common cause of agent failure is that the log file or log directory for the agents doesn't have proper ownership or permissions.

Refer to below section “Log File Location, Ownership and Permission” for general reference.

2. If an agent binary (oraagent.bin, orarootagent.bin, etc.) is corrupted or not executable, the agent will not start, and the related resources will not come up:

2011-05-03 11:11:13.189

[ohasd(25303)]CRS-5828:Could not start agent ‘/ocw/grid/bin/orarootagent_grid’. Details at (:CRSAGF00130:) {0:0:2} in /ocw/grid/log/racnode1/ohasd/ohasd.log.

2011-05-03 12:03:17.491: [ AGFW][1117866336] {0:0:184} Created alert : (:CRSAGF00130:) : Failed to start the agent /ocw/grid/bin/orarootagent_grid

2011-05-03 12:03:17.491: [ AGFW][1117866336] {0:0:184} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.diskmon 1 1] ID 4098:403

2011-05-03 12:03:17.491: [ AGFW][1117866336] {0:0:184} Can not stop the agent: /ocw/grid/bin/orarootagent_grid because pid is not initialized

..

2011-05-03 12:03:17.492: [ CRSPE][1128372576] {0:0:184} Fatal Error from AGFW Proxy: Unable to start the agent process

2011-05-03 12:03:17.492: [ CRSPE][1128372576] {0:0:184} CRS-2674: Start of ‘ora.diskmon’ on ‘racnode1’ failed

..

2011-06-27 22:34:57.805: [ AGFW][1131669824] {0:0:2} Created alert : (:CRSAGF00123:) : Failed to start the agent process: /ocw/grid/bin/cssdagent Category: -1 Operation: fail Loc: canexec2 OS error: 0 Other : no exe permission, file [/ocw/grid/bin/cssdagent]

2011-06-27 22:34:57.805: [ AGFW][1131669824] {0:0:2} Created alert : (:CRSAGF00126:) : Agent start failed

..

2011-06-27 22:34:57.806: [ AGFW][1131669824] {0:0:2} Created alert : (:CRSAGF00123:) : Failed to start the agent process: /ocw/grid/bin/cssdmonitor Category: -1 Operation: fail Loc: canexec2 OS error: 0 Other : no exe permission, file [/ocw/grid/bin/cssdmonitor]

The solution is to compare the agent binary with the one on a "good" node, and restore a good copy.
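The comparison can be sketched as follows once a reference copy from a working node of the same version has been placed on the local node. The function name compare_agent_binary and the file locations in the usage comment are assumptions for illustration only.

```shell
# Hypothetical helper: compare a local agent binary with a copy taken from a working node.
# Returns 0 only when the local file is executable and the checksums match.
compare_agent_binary() {
    local_bin=$1
    good_copy=$2
    if [ ! -x "$local_bin" ]; then
        echo "not executable: $local_bin"
        return 1
    fi
    # Reading from stdin makes cksum print only the checksum and the size
    a=$(cksum < "$local_bin")
    b=$(cksum < "$good_copy")
    if [ "$a" = "$b" ]; then
        echo "match"
        return 0
    fi
    echo "differs from reference copy"
    return 1
}

# Example usage (the /tmp path is a placeholder for wherever the reference copy was put):
#   compare_agent_binary /ocw/grid/bin/cssdagent /tmp/cssdagent.from_good_node
```

The executable check corresponds to the "no exe permission" message in the log above; the checksum check covers the corrupted-binary case.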

3. An agent may fail to start with the error "CRS-5802: Unable to start the agent process"; refer to item 10 in the section "OHASD does not start" for details.

Case 3: CSSD.BIN does not start

Successful cssd.bin startup depends on the following:

1. GPnP profile is accessible – gpnpd needs to be fully up to serve profile

If ocssd.bin is able to get the profile successfully, ocssd.log will likely have messages similar to the following:

2010-02-02 18:00:16.251: [ GPnP][408926240]clsgpnpm_exchange: [at clsgpnpm.c:1175] Calling “ipc://GPNPD_rac1”, try 4 of 500…

2010-02-02 18:00:16.263: [ GPnP][408926240]clsgpnp_profileVerifyForCall: [at clsgpnp.c:1867] Result: (87) CLSGPNP_SIG_VALPEER. Profile verified. prf=0x165160d0

2010-02-02 18:00:16.263: [ GPnP][408926240]clsgpnp_profileGetSequenceRef: [at clsgpnp.c:841] Result: (0) CLSGPNP_OK. seq of p=0x165160d0 is ‘6’=6

2010-02-02 18:00:16.263: [ GPnP][408926240]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2186] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_rac1” disco “”

Otherwise, messages like the following will show in ocssd.log:

2010-02-03 22:26:17.057: [ GPnP][3852126240]clsgpnpm_connect: [at clsgpnpm.c:1100] GIPC gipcretConnectionRefused (29) gipcConnect(ipc-ipc://GPNPD_rac1)

2010-02-03 22:26:17.057: [ GPnP][3852126240]clsgpnpm_connect: [at clsgpnpm.c:1101] Result: (48) CLSGPNP_COMM_ERR. Failed to connect to call url “ipc://GPNPD_rac1”

2010-02-03 22:26:17.057: [ GPnP][3852126240]clsgpnp_getProfileEx: [at clsgpnp.c:546] Result: (13) CLSGPNP_NO_DAEMON. Can’t get GPnP service profile from local GPnP daemon

2010-02-03 22:26:17.057: [ default][3852126240]Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).

2010-02-03 22:26:17.057: [ CSSD][3852126240]clsgpnp_getProfile failed, rc(13)

2. Voting Disk is accessible

In 11gR2, ocssd.bin discovers voting disks using the setting from the GPnP profile. If not enough voting disks can be identified, ocssd.bin will abort itself.

2010-02-03 22:37:22.212: [ CSSD][2330355744]clssnmReadDiscoveryProfile: voting file discovery string(/share/storage/di*)

..

2010-02-03 22:37:22.227: [ CSSD][1145538880]clssnmvDiskVerify: Successful discovery of 0 disks

2010-02-03 22:37:22.227: [ CSSD][1145538880]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery

2010-02-03 22:37:22.227: [ CSSD][1145538880]clssnmvFindInitialConfigs: No voting files found

2010-02-03 22:37:22.228: [ CSSD][1145538880]###################################

2010-02-03 22:37:22.228: [ CSSD][1145538880]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread
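To see quickly how many voting disks each discovery attempt found, the count can be pulled out of ocssd.log. This is only a sketch: voting_disks_discovered is a hypothetical name, and the pattern is derived from the single clssnmvDiskVerify line shown above, so other releases may word the message differently.

```shell
# Hypothetical helper: print the disk count from each clssnmvDiskVerify line in ocssd.log.
voting_disks_discovered() {
    # -n with the p flag prints only the lines where the substitution matched
    sed -n 's/.*clssnmvDiskVerify: Successful discovery of \([0-9][0-9]*\) disk.*/\1/p' "$1"
}

# Example usage:
#   voting_disks_discovered "$GRID_HOME/log/$HOST/cssd/ocssd.log"
```

A value of 0, as in the excerpt above, means nothing matched the discovery string, so the discovery string and the disk ownership and permissions are the first things to check.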

ocssd.bin may fail to come up with the following error if all nodes failed while a voting file change was in progress:

2010-05-02 03:11:19.033: [ CSSD][1197668093]clssnmCompleteInitVFDiscovery: Detected voting file add in progress for CIN 0:1134513465:0, waiting for configuration to complete 0:1134513098:0

The solution is to start ocssd.bin in exclusive mode with

If the voting disk is located on a non-ASM device, ownership and permissions should be:

-rw-r----- 1 ogrid oinstall 21004288 Feb 4 09:13 votedisk1

3. Network is functional and name resolution is working:

If ocssd.bin can't bind to any network, ocssd.log will likely have messages like the following:

2010-02-03 23:26:25.804: [GIPCXCPT][1206540320]gipcmodGipcPassInitializeNetwork: failed to find any interfaces in clsinet, ret gipcretFail (1)

2010-02-03 23:26:25.804: [GIPCGMOD][1206540320]gipcmodGipcPassInitializeNetwork: EXCEPTION[ ret gipcretFail (1) ] failed to determine host from clsinet, using default

..

2010-02-03 23:26:25.810: [ CSSD][1206540320]clsssclsnrsetup: gipcEndpoint failed, rc 39

2010-02-03 23:26:25.811: [ CSSD][1206540320]clssnmOpenGIPCEndp: failed to listen on gipc addr gipc://rac1:nm_eotcs- ret 39

2010-02-03 23:26:25.811: [ CSSD][1206540320]clssscmain: failed to open gipc endp

If there is a connectivity issue on the private network (including multicast being off), ocssd.log will likely have messages like the following:

2010-09-20 11:52:54.014: [ CSSD][1103055168]clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 180441784, wrtcnt, 453, LATS 328297844, lastSeqNo 452, uniqueness 1284979488, timestamp 1284979973/329344894

2010-09-20 11:52:54.016: [ CSSD][1078421824]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0

.. >>>> after a long delay

2010-09-20 12:02:39.578: [ CSSD][1103055168]clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 180441784, wrtcnt, 1037, LATS 328883434, lastSeqNo 1036, uniqueness 1284979488, timestamp 1284980558/329930254

2010-09-20 12:02:39.895: [ CSSD][1107286336]clssgmExecuteClientRequest: MAINT recvd from proc 2 (0xe1ad870)

2010-09-20 12:02:39.895: [ CSSD][1107286336]clssgmShutDown: Received abortive shutdown request from client.

2010-09-20 12:02:39.895: [ CSSD][1107286336]###################################

2010-09-20 12:02:39.895: [ CSSD][1107286336]clssscExit: CSSD aborting from thread GMClientListener

2010-09-20 12:02:39.895: [ CSSD][1107286336]###################################

To validate network, please refer to

4. Vendor clusterware is up (if using vendor clusterware)

Grid Infrastructure provides full clusterware functionality and doesn't need vendor clusterware to be installed. If you do have Grid Infrastructure on top of vendor clusterware in your environment, the vendor clusterware needs to come up fully before CRS can be started. To verify, as the grid user:

$GRID_HOME/bin/lsnodes -n

racnode1 1

racnode1 0

If vendor clusterware is not fully up, ocssd.log will likely have messages similar to the following:

2010-08-30 18:28:13.207: [ CSSD][36]clssnm_skgxninit: skgxncin failed, will retry

2010-08-30 18:28:14.207: [ CSSD][36]clssnm_skgxnmon: skgxn init failed

2010-08-30 18:28:14.208: [ CSSD][36]###################################

2010-08-30 18:28:14.208: [ CSSD][36]clssscExit: CSSD signal 11 in thread skgxnmon

Before the clusterware is installed, execute the command below as grid user:

$INSTALL_SOURCE/install/lsnodes -v

Case 4: CRSD.BIN does not start

Successful crsd.bin startup depends on the following:

1. ocssd is fully up

If ocssd.bin is not fully up, crsd.log will show messages like the following:

2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clssscConnect: gipc request failed with 29 (0x16)

2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clsssInitNative: connect failed, rc 29

2010-02-03 22:37:51.639: [ CRSRTI][1548456880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2. OCR is accessible

If the OCR is located on ASM and it's unavailable, likely the crsd.log will show messages like:

2010-02-03 22:22:55.186: [ OCRASM][2603807664]proprasmo: Error in open/create file in dg [GI]

[ OCRASM][2603807664]SLOS : SLOS: cat=7, pn=kgfoAl06, dep=15077, loc=kgfokge

ORA-15077: could not locate ASM instance serving a required diskgroup

2010-02-03 22:22:55.189: [ OCRASM][2603807664]proprasmo: kgfoCheckMount returned [7]

2010-02-03 22:22:55.189: [ OCRASM][2603807664]proprasmo: The ASM instance is down

2010-02-03 22:22:55.190: [ OCRRAW][2603807664]proprioo: Failed to open [+GI]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.

2010-02-03 22:22:55.190: [ OCRRAW][2603807664]proprioo: No OCR/OLR devices are usable

2010-02-03 22:22:55.190: [ OCRASM][2603807664]proprasmcl: asmhandle is NULL

2010-02-03 22:22:55.190: [ OCRRAW][2603807664]proprinit: Could not open raw device

2010-02-03 22:22:55.190: [ OCRASM][2603807664]proprasmcl: asmhandle is NULL

2010-02-03 22:22:55.190: [ OCRAPI][2603807664]a_init:16!: Backend init unsuccessful : [26]

2010-02-03 22:22:55.190: [ CRSOCR][2603807664] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, pn=kgfoAl06, dep=15077, loc=kgfokge

ORA-15077: could not locate ASM instance serving a required diskgroup

] [7]

2010-02-03 22:22:55.190: [ CRSD][2603807664][PANIC] CRSD exiting: Could not init OCR, code: 26

Note: in 11.2 ASM starts before crsd.bin, and brings up the diskgroup automatically if it contains the OCR.

If the OCR is located on a non-ASM device, expected ownership and permissions are:

-rw-r----- 1 root oinstall 272756736 Feb 3 23:24 ocr

If the OCR is located on a non-ASM device and it's unavailable, crsd.log will likely show messages similar to the following:

2010-02-03 23:14:33.583: [ OCROSD][2346668976]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory

2010-02-03 23:14:33.583: [ OCRRAW][2346668976]proprinit: Could not open raw device

2010-02-03 23:14:33.583: [ default][2346668976]a_init:7!: Backend init unsuccessful : [26]

2010-02-03 23:14:34.587: [ OCROSD][2346668976]utopen:6m’:failed in stat OCR file/disk /share/storage/ocr, errno=2, os err string=No such file or directory

2010-02-03 23:14:34.587: [ OCROSD][2346668976]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory

2010-02-03 23:14:34.587: [ OCRRAW][2346668976]proprinit: Could not open raw device

2010-02-03 23:14:34.587: [ default][2346668976]a_init:7!: Backend init unsuccessful : [26]

2010-02-03 23:14:35.589: [ CRSD][2346668976][PANIC] CRSD exiting: OCR device cannot be initialized, error: 1:26

If the OCR is corrupted, likely crsd.log will show messages like the following:

2010-02-03 23:19:38.417: [ default][3360863152]a_init:7!: Backend init unsuccessful : [26]

2010-02-03 23:19:39.429: [ OCRRAW][3360863152]propriogid:1_2: INVALID FORMAT

2010-02-03 23:19:39.429: [ OCRRAW][3360863152]proprioini: all disks are not OCR/OLR formatted

2010-02-03 23:19:39.429: [ OCRRAW][3360863152]proprinit: Could not open raw device

2010-02-03 23:19:39.429: [ default][3360863152]a_init:7!: Backend init unsuccessful : [26]

2010-02-03 23:19:40.432: [ CRSD][3360863152][PANIC] CRSD exiting: OCR device cannot be initialized, error: 1:26

If the owner or group of the grid user has changed, then even if ASM is available, crsd.log will likely show the following:

2010-03-10 11:45:12.510: [ OCRASM][611467760]proprasmo: Error in open/create file in dg [SYSTEMDG]

[ OCRASM][611467760]SLOS : SLOS: cat=7, pn=kgfoAl06, dep=1031, loc=kgfokge

ORA-01031: insufficient privileges

2010-03-10 11:45:12.528: [ OCRASM][611467760]proprasmo: kgfoCheckMount returned [7]

2010-03-10 11:45:12.529: [ OCRASM][611467760]proprasmo: The ASM instance is down

2010-03-10 11:45:12.529: [ OCRRAW][611467760]proprioo: Failed to open [+SYSTEMDG]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.

2010-03-10 11:45:12.529: [ OCRRAW][611467760]proprioo: No OCR/OLR devices are usable

2010-03-10 11:45:12.529: [ OCRASM][611467760]proprasmcl: asmhandle is NULL

2010-03-10 11:45:12.529: [ OCRRAW][611467760]proprinit: Could not open raw device

2010-03-10 11:45:12.529: [ OCRASM][611467760]proprasmcl: asmhandle is NULL

2010-03-10 11:45:12.529: [ OCRAPI][611467760]a_init:16!: Backend init unsuccessful : [26]

2010-03-10 11:45:12.530: [ CRSOCR][611467760] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, pn=kgfoAl06, dep=1031, loc=kgfokge

ORA-01031: insufficient privileges

] [7]

If the OCR or its mirror is unavailable (for example, ASM is up but the diskgroup for the OCR or mirror is unmounted), crsd.log will likely show the following:

2010-05-11 11:16:38.578: [ OCRASM][18]proprasmo: Error in open/create file in dg [OCRMIR]

[ OCRASM][18]SLOS : SLOS: cat=8, pn=kgfoOpenFile01, dep=15056, loc=kgfokge

ORA-17503: ksfdopn:DGOpenFile05 Failed to open file +OCRMIR.255.4294967295

ORA-17503: ksfdopn:2 Failed to open file +OCRMIR.255.4294967295

ORA-15001: diskgroup “OCRMIR

..

2010-05-11 11:16:38.647: [ OCRASM][18]proprasmo: kgfoCheckMount returned [6]

2010-05-11 11:16:38.648: [ OCRASM][18]proprasmo: The ASM disk group OCRMIR is not found or not mounted

2010-05-11 11:16:38.648: [ OCRASM][18]proprasmdvch: Failed to open OCR location [+OCRMIR] error [26]

2010-05-11 11:16:38.648: [ OCRRAW][18]propriodvch: Error [8] returned device check for [+OCRMIR]

2010-05-11 11:16:38.648: [ OCRRAW][18]dev_replace: non-master could not verify the new disk (8)

[ OCRSRV][18]proath_invalidate_action: Failed to replace [+OCRMIR] [8]

[ OCRAPI][18]procr_ctx_set_invalid_no_abort: ctx set to invalid

..

2010-05-11 11:16:46.587: [ OCRMAS][19]th_master:91: Comparing device hash ids between local and master failed

2010-05-11 11:16:46.587: [ OCRMAS][19]th_master:91 Local dev (1862408427, 1028247821, 0, 0, 0)

2010-05-11 11:16:46.587: [ OCRMAS][19]th_master:91 Master dev (1862408427, 1859478705, 0, 0, 0)

2010-05-11 11:16:46.587: [ OCRMAS][19]th_master:9: Shutdown CacheLocal. my hash ids don’t match

[ OCRAPI][19]procr_ctx_set_invalid_no_abort: ctx set to invalid

[ OCRAPI][19]procr_ctx_set_invalid: aborting…

2010-05-11 11:16:46.587: [ CRSD][19] Dump State Starting …
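Several of the OCR failures above end with a [PANIC] line in crsd.log that carries the final error code. As a rough aid for finding those lines in a long log, a sketch like the following may help; panic_lines is a hypothetical name, and the match is a plain fixed-string search.

```shell
# Hypothetical helper: print every [PANIC] line from a clusterware daemon log.
panic_lines() {
    # -F treats the pattern as a fixed string, so the brackets are matched literally
    grep -F '[PANIC]' "$1"
}

# Example usage:
#   panic_lines "$GRID_HOME/log/$HOST/crsd/crsd.log"
```

The messages logged just before each PANIC line usually carry the underlying cause, as in the excerpts above.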

3. crsd.bin pid file exists and points to running crsd.bin process

If the pid file does not exist, $GRID_HOME/log/$HOST/agent/ohasd/orarootagent_root/orarootagent_root.log will have messages similar to the following:

2010-02-14 17:40:57.927: [ora.crsd][1243486528] [check] PID FILE doesn’t exist.

..

2010-02-14 17:41:57.927: [ clsdmt][1092499776]Creating PID [30269] file for home /ocw/grid host racnode1 bin crs to /ocw/grid/crs/init/

2010-02-14 17:41:57.927: [ clsdmt][1092499776]Error3 -2 writing PID [30269] to the file []

2010-02-14 17:41:57.927: [ clsdmt][1092499776]Failed to record pid for CRSD

2010-02-14 17:41:57.927: [ clsdmt][1092499776]Terminating process

2010-02-14 17:41:57.927: [ default][1092499776] CRSD exiting on stop request from clsdms_thdmai

The solution is to create a dummy pid file ($GRID_HOME/crs/init/$HOST.pid) manually as the grid user with the "touch" command and restart the resource ora.crsd.

If the pid file exists but does not point to a running crsd.bin process, $GRID_HOME/log/$HOST/agent/ohasd/orarootagent_root/orarootagent_root.log will have messages similar to the following:

2011-04-06 15:53:38.777: [ora.crsd][1160390976] [check] PID will be looked for in /ocw/grid/crs/init/racnode1.pid

2011-04-06 15:53:38.778: [ora.crsd][1160390976] [check] PID which will be monitored will be 1535 >> 1535 is output of “cat /ocw/grid/crs/init/racnode1.pid”

2011-04-06 15:53:38.965: [ COMMCRS][1191860544]clsc_connect: (0x2aaab400b0b0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=racnode1DBG_CRSD))

[ clsdmc][1160390976]Fail to connect (ADDRESS=(PROTOCOL=ipc)(KEY=racnode1DBG_CRSD)) with status 9

2011-04-06 15:53:38.966: [ora.crsd][1160390976] [check] Error = error 9 encountered when connecting to CRSD

2011-04-06 15:53:39.023: [ora.crsd][1160390976] [check] Calling PID check for daemon

2011-04-06 15:53:39.023: [ora.crsd][1160390976] [check] Trying to check PID = 1535

2011-04-06 15:53:39.203: [ora.crsd][1160390976] [check] PID check returned ONLINE CLSDM returned OFFLINE

2011-04-06 15:53:39.203: [ora.crsd][1160390976] [check] DaemonAgent::check returned 5

2011-04-06 15:53:39.203: [ AGFW][1160390976] check for resource: ora.crsd 1 1 completed with status: FAILED

2011-04-06 15:53:39.203: [ AGFW][1170880832] ora.crsd 1 1 state changed from: UNKNOWN to: FAILED

..

2011-04-06 15:54:10.511: [ AGFW][1167522112] ora.crsd 1 1 state changed from: UNKNOWN to: CLEANING

..

2011-04-06 15:54:10.513: [ora.crsd][1146542400] [clean] Trying to stop PID = 1535

..

2011-04-06 15:54:11.514: [ora.crsd][1146542400] [clean] Trying to check PID = 1535

To verify on OS level:

ls -l /ocw/grid/crs/init/*pid

-rwxr-xr-x 1 ogrid oinstall 5 Feb 17 11:00 /ocw/grid/crs/init/racnode1.pid

cat /ocw/grid/crs/init/*pid

1535

ps -ef| grep 1535

root 1535 1 0 Mar30 ? 00:00:00 iscsid >> Note process 1535 is not crsd.bin

The solution is to empty the pid file and restart the resource ora.crsd.
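The manual ls, cat and ps steps above can be combined into one check. The sketch below is only an illustration: pid_file_points_to_crsd is a hypothetical name, and it assumes the process shows up in ps with a command name containing crsd.bin.

```shell
# Hypothetical helper: check whether the PID recorded in a pid file belongs to crsd.bin.
# Returns 0 if it does, 1 if the file is empty or unreadable,
# and 2 if the PID belongs to another process or to no process at all.
pid_file_points_to_crsd() {
    pid=$(cat "$1" 2>/dev/null)
    if [ -z "$pid" ]; then
        echo "pid file is empty or unreadable"
        return 1
    fi
    # ps -o comm= prints only the command name of that PID, without a header
    name=$(ps -p "$pid" -o comm= 2>/dev/null)
    case $name in
        *crsd.bin*) echo "PID $pid is crsd.bin"; return 0 ;;
        '')         echo "PID $pid is not running"; return 2 ;;
        *)          echo "PID $pid belongs to $name, not crsd.bin"; return 2 ;;
    esac
}

# Example usage:
#   pid_file_points_to_crsd /ocw/grid/crs/init/racnode1.pid
```

In the example above, where PID 1535 turned out to be iscsid, this check would return 2.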

4. Network is functional and name resolution is working:

If the network is not fully functioning, ocssd.bin may still come up, but crsd.bin may fail and the crsd.log will show messages like:

2010-02-03 23:34:28.412: [ GPnP][2235814832]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=867, tl=3, f=0

2010-02-03 23:34:28.428: [ OCRAPI][2235814832]clsu_get_private_ip_addresses: no ip addresses found.

..

2010-02-03 23:34:28.434: [ OCRAPI][2235814832]a_init:13!: Clusterware init unsuccessful : [44]

2010-02-03 23:34:28.434: [ CRSOCR][2235814832] OCR context init failure. Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]

2010-02-03 23:34:28.434: [ CRSD][2235814832][PANIC] CRSD exiting: Could not init OCR, code: 44

Or:

2009-12-10 06:28:31.974: [ OCRMAS][20]proath_connect_master:1: could not connect to master clsc_ret1 = 9, clsc_ret2 = 9

2009-12-10 06:28:31.974: [ OCRMAS][20]th_master:11: Could not connect to the new master

2009-12-10 06:29:01.450: [ CRSMAIN][2] Policy Engine is not initialized yet!

2009-12-10 06:29:31.489: [ CRSMAIN][2] Policy Engine is not initialized yet!

Or:

2009-12-31 00:42:08.110: [ COMMCRS][10]clsc_receive: (102b03250) Error receiving, ns (12535, 12560), transport (505, 145, 0)

To validate the network, please refer to

5. To troubleshoot further, refer to the note "Troubleshooting CRSD Start up Issue".

Case 5: GPNPD.BIN does not start

1. Name Resolution is not working

gpnpd.bin fails with the following error in gpnpd.log:

2010-05-13 12:48:11.540: [ GPnP][1171126592]clsgpnpm_exchange: [at clsgpnpm.c:1175] Calling “tcp://node2:9393”, try 1 of 3…

2010-05-13 12:48:11.540: [ GPnP][1171126592]clsgpnpm_connect: [at clsgpnpm.c:1015] ENTRY

2010-05-13 12:48:11.541: [ GPnP][1171126592]clsgpnpm_connect: [at clsgpnpm.c:1066] GIPC gipcretFail (1) gipcConnect(tcp-tcp://node2:9393)

2010-05-13 12:48:11.541: [ GPnP][1171126592]clsgpnpm_connect: [at clsgpnpm.c:1067] Result: (48) CLSGPNP_COMM_ERR. Failed to connect to call url “tcp://node2:9393”

In the above example, make sure the current node is able to ping "node2" and that there is no firewall between them.
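When gpnpd.log contains many such failures, the peers that could not be reached can be extracted so that each one can be tested. This is a sketch based only on the log format shown above; gpnp_failed_peers is a hypothetical name.

```shell
# Hypothetical helper: list the "host port" pairs that gpnpd failed to reach, one per line.
gpnp_failed_peers() {
    # Capture host and port from tcp://host:port on CLSGPNP_COMM_ERR lines; drop duplicates
    sed -n 's/.*CLSGPNP_COMM_ERR.*tcp:\/\/\([A-Za-z0-9._-]*\):\([0-9][0-9]*\).*/\1 \2/p' "$1" | sort -u
}

# Example usage: ping each host once (ping options vary by platform)
#   gpnp_failed_peers gpnpd.log | while read -r host port; do ping -c 1 "$host"; done
```

A successful ping only shows basic reachability; a firewall can still block the port that gpnpd uses.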

Case 6: Various other daemons do not start

Two common causes:

1. Log file or directory for the daemon doesn’t have appropriate ownership or permission

If the log file or log directory for the daemon doesn’t have proper ownership or permissions, usually there is no new info in the log file and the timestamp remains the same while the daemon tries to come up.

Refer to below section “Log File Location, Ownership and Permission” for general reference.

2. Network socket file doesn’t have appropriate ownership or permission

In this case, the daemon log will show messages like:

2010-02-02 12:55:20.485: [ COMMCRS][1121433920]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_GIPCD))

2010-02-02 12:55:20.485: [ clsdmt][1110944064]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_GIPCD))

Case 7: CRSD Agents do not start

CRSD.BIN will spawn two agents to start up user resources. The two agents share the same names and binaries as the ohasd.bin agents:

orarootagent: responsible for ora.netn.network, ora.nodename.vip, ora.scann.vip and ora.gns

oraagent: responsible for ora.asm, ora.eons, ora.ons, listener, SCAN listener, diskgroup, database, service resource etc

To find out the user resource status:

$GRID_HOME/bin/crsctl stat res -t

If crsd.bin cannot start any of the above agents properly, user resources may not come up.

1. A common cause of agent failure is that the log file or log directory for the agents doesn't have proper ownership or permissions.

Refer to below section “Log File Location, Ownership and Permission” for general reference.

2. An agent may fail to start with the error "CRS-5802: Unable to start the agent process"; refer to item 10 in the section "OHASD does not start" for details.

Network and Naming Resolution Verification

CRS depends on a fully functional network and name resolution. If the network or name resolution is not fully functioning, CRS may not come up successfully.

To validate network and name resolution setup, please refer to

Log File Location, Ownership and Permission

Appropriate ownership and permission of sub-directories and files in $GRID_HOME/log is critical for CRS components to come up properly.

In Grid Infrastructure cluster environment:

Assuming a Grid Infrastructure environment with node name rac1, CRS owner grid, and two separate RDBMS owners rdbmsap and rdbmsar, here's what it looks like under $GRID_HOME/log in a cluster environment:

drwxrwxr-x 5 grid oinstall 4096 Dec 6 09:20 log

drwxr-xr-x 2 grid oinstall 4096 Dec 6 08:36 crs

drwxr-xr-t 17 root oinstall 4096 Dec 6 09:22 rac1

drwxr-x--- 2 grid oinstall 4096 Dec 6 09:20 admin

drwxrwxr-t 4 root oinstall 4096 Dec 6 09:20 agent

drwxrwxrwt 7 root oinstall 4096 Jan 26 18:15 crsd

drwxr-xr-t 2 grid oinstall 4096 Dec 6 09:40 application_grid

drwxr-xr-t 2 grid oinstall 4096 Jan 26 18:15 oraagent_grid

drwxr-xr-t 2 rdbmsap oinstall 4096 Jan 26 18:15 oraagent_rdbmsap

drwxr-xr-t 2 rdbmsar oinstall 4096 Jan 26 18:15 oraagent_rdbmsar

drwxr-xr-t 2 grid oinstall 4096 Jan 26 18:15 ora_oc4j_type_grid

drwxr-xr-t 2 root root 4096 Jan 26 20:09 orarootagent_root

drwxrwxr-t 6 root oinstall 4096 Dec 6 09:24 ohasd

drwxr-xr-t 2 grid oinstall 4096 Jan 26 18:14 oraagent_grid

drwxr-xr-t 2 root root 4096 Dec 6 09:24 oracssdagent_root

drwxr-xr-t 2 root root 4096 Dec 6 09:24 oracssdmonitor_root

drwxr-xr-t 2 root root 4096 Jan 26 18:14 orarootagent_root

-rw-rw-r-- 1 root root 12931 Jan 26 21:30 alertrac1.log

drwxr-x--- 2 grid oinstall 4096 Jan 26 20:44 client

drwxr-x--- 2 root oinstall 4096 Dec 6 09:24 crsd

drwxr-x--- 2 grid oinstall 4096 Dec 6 09:24 cssd

drwxr-x--- 2 root oinstall 4096 Dec 6 09:24 ctssd

drwxr-x--- 2 grid oinstall 4096 Jan 26 18:14 diskmon

drwxr-x--- 2 grid oinstall 4096 Dec 6 09:25 evmd

drwxr-x--- 2 grid oinstall 4096 Jan 26 21:20 gipcd

drwxr-x--- 2 root oinstall 4096 Dec 6 09:20 gnsd

drwxr-x--- 2 grid oinstall 4096 Jan 26 20:58 gpnpd

drwxr-x--- 2 grid oinstall 4096 Jan 26 21:19 mdnsd

drwxr-x--- 2 root oinstall 4096 Jan 26 21:20 ohasd

drwxrwxr-t 5 grid oinstall 4096 Dec 6 09:34 racg

drwxrwxrwt 2 grid oinstall 4096 Dec 6 09:20 racgeut

drwxrwxrwt 2 grid oinstall 4096 Dec 6 09:20 racgevtf

drwxrwxrwt 2 grid oinstall 4096 Dec 6 09:20 racgmain

drwxr-x--- 2 grid oinstall 4096 Jan 26 20:57 srvm

Please note that most log files in a sub-directory inherit the ownership of the parent directory. The listing above is just a general reference for telling whether there are unexpected recursive ownership and permission changes inside the CRS home. If you have a working node with the same version, the working node should be used as a reference.
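One way to spot unexpected owners quickly is to list the directories that belong to neither root nor the CRS owner. This is a minimal sketch, not an Oracle-provided tool: dirs_with_unexpected_owner is a hypothetical name, and directories owned by RDBMS owners (such as oraagent_rdbmsap above) are expected to appear in its output on a healthy cluster node.

```shell
# Hypothetical helper: list directories under a log tree owned by neither root nor the given user.
dirs_with_unexpected_owner() {
    # -exec ... {} + runs ls once for all matches and is not run when nothing matches
    find "$1" -type d ! -user root ! -user "$2" -exec ls -ld {} +
}

# Example usage, with "grid" as the CRS owner:
#   dirs_with_unexpected_owner "$GRID_HOME/log" grid
```

The output is a starting point for comparison with a working node, not a verdict on its own.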

In Oracle Restart environment:

And here's what it looks like under $GRID_HOME/log in an Oracle Restart environment:

drwxrwxr-x 5 grid oinstall 4096 Oct 31 2009 log

drwxr-xr-x 2 grid oinstall 4096 Oct 31 2009 crs

drwxr-xr-x 3 grid oinstall 4096 Oct 31 2009 diag

drwxr-xr-t 17 root oinstall 4096 Oct 31 2009 rac1

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 admin

drwxrwxr-t 4 root oinstall 4096 Oct 31 2009 agent

drwxrwxrwt 2 root oinstall 4096 Oct 31 2009 crsd

drwxrwxr-t 8 root oinstall 4096 Jul 14 08:15 ohasd

drwxr-xr-x 2 grid oinstall 4096 Aug 5 13:40 oraagent_grid

drwxr-xr-x 2 grid oinstall 4096 Aug 2 07:11 oracssdagent_grid

drwxr-xr-x 2 grid oinstall 4096 Aug 3 21:13 orarootagent_grid

-rwxr-xr-x 1 grid oinstall 13782 Aug 1 17:23 alertrac1.log

drwxr-x--- 2 grid oinstall 4096 Nov 2 2009 client

drwxr-x--- 2 root oinstall 4096 Oct 31 2009 crsd

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 cssd

drwxr-x--- 2 root oinstall 4096 Oct 31 2009 ctssd

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 diskmon

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 evmd

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 gipcd

drwxr-x--- 2 root oinstall 4096 Oct 31 2009 gnsd

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 gpnpd

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 mdnsd

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 ohasd

drwxrwxr-t 5 grid oinstall 4096 Oct 31 2009 racg

drwxrwxrwt 2 grid oinstall 4096 Oct 31 2009 racgeut

drwxrwxrwt 2 grid oinstall 4096 Oct 31 2009 racgevtf

drwxrwxrwt 2 grid oinstall 4096 Oct 31 2009 racgmain

drwxr-x--- 2 grid oinstall 4096 Oct 31 2009 srvm

Network Socket File Location, Ownership and Permission

Network socket files can be located in /tmp/.oracle, /var/tmp/.oracle or /usr/tmp/.oracle

When a socket file has unexpected ownership or permissions, the daemon log file (e.g. evmd.log) will usually have the following:

2011-06-18 14:07:28.545: [ COMMCRS][772]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=racnode1DBG_EVMD))

2011-06-18 14:07:28.545: [ clsdmt][515]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=lena042DBG_EVMD))

2011-06-18 14:07:28.545: [ clsdmt][515]Terminating process

2011-06-18 14:07:28.559: [ default][515] EVMD exiting on stop request from clsdms_thdmai

And the following error may be reported:

CRS-5017: The resource action “ora.evmd start” encountered the following error:

CRS-2674: Start of ‘ora.evmd’ on ‘racnode1’ failed

..

The solution is to stop GI as root (crsctl stop crs -f), clean up socket files and restart GI.
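To see which socket files the daemons were refused on, the KEY values can be extracted from the "Permission denied" messages. The sketch below is illustrative: denied_socket_files is a hypothetical name, and it assumes each socket file is named with the letter s followed by the KEY, as the example listings in this section suggest (for example KEY=procr_local_conn_0_PROL and file sprocr_local_conn_0_PROL).

```shell
# Hypothetical helper: print the socket file name for each "Permission denied" listen error.
# Assumes socket files are named "s" + KEY, as in the example listings in this note.
denied_socket_files() {
    sed -n 's/.*clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=\([^)]*\))).*/s\1/p' "$1" | sort -u
}

# Example usage: show the current ownership of each affected file
#   denied_socket_files evmd.log | while read -r f; do ls -l /tmp/.oracle/"$f" /var/tmp/.oracle/"$f"; done
```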

Assuming a Grid Infrastructure environment with node name rac1, CRS owner grid, and clustername eotcs

In Grid Infrastructure cluster environment:

Below is an example output from a cluster environment:

drwxrwxrwt 2 root oinstall 4096 Feb 2 21:25 .oracle

./.oracle:

drwxrwxrwt 2 root oinstall 4096 Feb 2 21:25 .

srwxrwx--- 1 grid oinstall 0 Feb 2 18:00 master_diskmon

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 mdnsd

-rw-r--r-- 1 grid oinstall 5 Feb 2 18:00 mdnsd.pid

prw-r--r-- 1 root root 0 Feb 2 13:33 npohasd

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 ora_gipc_GPNPD_rac1

-rw-r--r-- 1 grid oinstall 0 Feb 2 13:34 ora_gipc_GPNPD_rac1_lock

srwxrwxrwx 1 grid oinstall 0 Feb 2 13:39 s#11724.1

srwxrwxrwx 1 grid oinstall 0 Feb 2 13:39 s#11724.2

srwxrwxrwx 1 grid oinstall 0 Feb 2 13:39 s#11735.1

srwxrwxrwx 1 grid oinstall 0 Feb 2 13:39 s#11735.2

srwxrwxrwx 1 grid oinstall 0 Feb 2 13:45 s#12339.1

srwxrwxrwx 1 grid oinstall 0 Feb 2 13:45 s#12339.2

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 s#6275.1

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 s#6275.2

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 s#6276.1

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 s#6276.2

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 s#6278.1

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 s#6278.2

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 sAevm

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 sCevm

srwxrwxrwx 1 root root 0 Feb 2 18:01 sCRSD_IPC_SOCKET_11

srwxrwxrwx 1 root root 0 Feb 2 18:01 sCRSD_UI_SOCKET

srwxrwxrwx 1 root root 0 Feb 2 21:25 srac1DBG_CRSD

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 srac1DBG_CSSD

srwxrwxrwx 1 root root 0 Feb 2 18:00 srac1DBG_CTSSD

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 srac1DBG_EVMD

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 srac1DBG_GIPCD

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 srac1DBG_GPNPD

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 srac1DBG_MDNSD

srwxrwxrwx 1 root root 0 Feb 2 18:00 srac1DBG_OHASD

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 sLISTENER

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 sLISTENER_SCAN2

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:01 sLISTENER_SCAN3

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 sOCSSD_LL_rac1_

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 sOCSSD_LL_rac1_eotcs

-rw-r--r-- 1 grid oinstall 0 Feb 2 18:00 sOCSSD_LL_rac1_eotcs_lock

-rw-r--r-- 1 grid oinstall 0 Feb 2 18:00 sOCSSD_LL_rac1__lock

srwxrwxrwx 1 root root 0 Feb 2 18:00 sOHASD_IPC_SOCKET_11

srwxrwxrwx 1 root root 0 Feb 2 18:00 sOHASD_UI_SOCKET

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 sOracle_CSS_LclLstnr_eotcs_1

-rw-r--r-- 1 grid oinstall 0 Feb 2 18:00 sOracle_CSS_LclLstnr_eotcs_1_lock

srwxrwxrwx 1 root root 0 Feb 2 18:01 sora_crsqs

srwxrwxrwx 1 root root 0 Feb 2 18:00 sprocr_local_conn_0_PROC

srwxrwxrwx 1 root root 0 Feb 2 18:00 sprocr_local_conn_0_PROL

srwxrwxrwx 1 grid oinstall 0 Feb 2 18:00 sSYSTEM.evm.acceptor.auth

Below is an example output from an Oracle Restart environment:

drwxrwxrwt 2 root oinstall 4096 Feb 2 21:25 .oracle

./.oracle:

srwxrwx--- 1 grid oinstall 0 Aug 1 17:23 master_diskmon

prw-r--r-- 1 grid oinstall 0 Oct 31 2009 npohasd

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 s#14478.1

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 s#14478.2

srwxrwxrwx 1 grid oinstall 0 Jul 14 08:02 s#2266.1

srwxrwxrwx 1 grid oinstall 0 Jul 14 08:02 s#2266.2

srwxrwxrwx 1 grid oinstall 0 Jul 7 10:59 s#2269.1

srwxrwxrwx 1 grid oinstall 0 Jul 7 10:59 s#2269.2

srwxrwxrwx 1 grid oinstall 0 Jul 31 22:10 s#2313.1

srwxrwxrwx 1 grid oinstall 0 Jul 31 22:10 s#2313.2

srwxrwxrwx 1 grid oinstall 0 Jun 29 21:58 s#2851.1

srwxrwxrwx 1 grid oinstall 0 Jun 29 21:58 s#2851.2

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sCRSD_UI_SOCKET

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 srac1DBG_CSSD

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 srac1DBG_OHASD

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sEXTPROC1521

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sOCSSD_LL_rac1_

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sOCSSD_LL_rac1_localhost

-rw-r--r-- 1 grid oinstall 0 Aug 1 17:23 sOCSSD_LL_rac1_localhost_lock

-rw-r--r-- 1 grid oinstall 0 Aug 1 17:23 sOCSSD_LL_rac1__lock

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sOHASD_IPC_SOCKET_11

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sOHASD_UI_SOCKET

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sgrid_CSS_LclLstnr_localhost_1

-rw-r--r-- 1 grid oinstall 0 Aug 1 17:23 sgrid_CSS_LclLstnr_localhost_1_lock

srwxrwxrwx 1 grid oinstall 0 Aug 1 17:23 sprocr_local_conn_0_PROL
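The sample listings above can be compared against a live node with a short script. The following is a minimal sketch, not an Oracle tool: the function names are hypothetical, the directory is passed in by the caller, and treating anything other than a socket, named pipe or regular file as unexpected is an assumption drawn only from the entry types that appear in the listings.

```python
import grp
import os
import pwd
import stat


def describe_entries(directory):
    """Return (mode, owner, group, name) tuples for each entry, similar to `ls -l`."""
    rows = []
    for name in sorted(os.listdir(directory)):
        # lstat so that a symlink is reported as itself, not as its target
        info = os.lstat(os.path.join(directory, name))
        try:
            owner = pwd.getpwuid(info.st_uid).pw_name
        except KeyError:
            owner = str(info.st_uid)  # uid without a passwd entry
        try:
            group = grp.getgrgid(info.st_gid).gr_name
        except KeyError:
            group = str(info.st_gid)  # gid without a group entry
        rows.append((stat.filemode(info.st_mode), owner, group, name))
    return rows


def unexpected_types(rows):
    """Names whose type is not socket (s), named pipe (p) or regular file (-).

    Assumption: only these three types are expected, as in the sample listings.
    """
    return [name for mode, _owner, _group, name in rows if mode[0] not in "sp-"]
```

Calling describe_entries() on the .oracle socket directory of the node (its full path depends on the platform and is an assumption left to the caller) and printing each tuple gives output that can be read side by side with the examples, with particular attention to the owner and group columns.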

Diagnostic file collection

If the issue cannot be identified with this note, run $GRID_HOME/bin/diagcollection.sh as root on all nodes, and upload all the .gz files it generates in the current directory.
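After the collection script has run, it helps to confirm which archives were actually produced before uploading them. The following is a small sketch under the assumption that the archives are plain files ending in .gz in the directory the script was run from; find_archives is a hypothetical helper, not part of the Oracle tooling.

```python
import os


def find_archives(directory="."):
    """Return (name, size_in_bytes) for each .gz file in `directory`, oldest first."""
    found = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if name.endswith(".gz") and os.path.isfile(path):
            found.append((os.path.getmtime(path), name, os.path.getsize(path)))
    # Sort by modification time so the most recent collection appears last
    return [(name, size) for _mtime, name, size in sorted(found)]
```

Running it on each node gives a quick checklist of the files to upload.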

References

– 11gR2 Clusterware and Grid Home – What You Need to Know

– Troubleshooting 11.2 Grid Infrastructure Installation Root.sh Issues

– How to Validate Network and Name Resolution Setup for the Clusterware and RAC

– What to Do if 11gR2 Clusterware is Unhealthy

– How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation

– How to Proceed from Failed Upgrade to 11gR2 Grid Infrastructure on Linux/Unix


To start an offline daemon – if ora.crsd is OFFLINE:

As ohasd.bin is responsible for starting all other clusterware processes, directly or indirectly, it needs to start properly for the rest of the stack to come up. If ohasd.bin is not up, checking its status reports CRS-4639 (Could not contact Oracle High Availability Services); if ohasd.bin is already up, another start attempt reports CRS-4640.

Automatic ohasd.bin start up depends on the following:

OS is at appropriate run level

The OS needs to be at the specified run level before CRS will try to start. The run level at which the clusterware needs to come up is given by the init.ohasd entry in /etc/inittab:

h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1

The example above shows that CRS is supposed to run at run levels 3 and 5; note that, depending on the platform, CRS comes up at different run levels. Compare this with the current run level of the node.

“init.ohasd run” is up

On Linux/UNIX, as “init.ohasd run” is configured in /etc/inittab, the init process (pid 1; /sbin/init on Linux, Solaris and HP-UX, /usr/sbin/init on AIX) starts “init.ohasd run” and respawns it if it fails. Without “init.ohasd run” up and running, ohasd.bin will not start.

If any rc Snn script (located in rcn.d, for example S98gcstartup) is stuck, the init process may not start “/etc/init.d/init.ohasd run”; please engage the OS vendor to find out why the relevant Snn script is stuck.

Clusterware auto start is enabled (it is enabled by default)

By default CRS is enabled for auto start upon node reboot; to enable it, use “crsctl enable crs”. The current setting is kept in a file under SCRBASE, which is /etc/oracle/scls_scr on Linux and AIX and /var/opt/oracle/scls_scr on HP-UX and Solaris.

Note: NEVER EDIT THE FILE MANUALLY; use the “crsctl enable/disable crs” command instead.

syslogd is up and the OS is able to execute init script S96ohasd

The OS may get stuck on some other Snn script while the node is coming up and thus never get a chance to execute S96ohasd; in that case, the message “Oracle HA daemon is enabled for autostart” will not be in OS messages.

If you don’t see that message, the other possibility is that syslogd (/usr/sbin/syslogd) is not fully up. Grid may fail to come up in that case as well. This may not apply to AIX.

To find out whether the OS is able to execute S96ohasd while the node is coming up, modify ohasd so that it creates /tmp/ohasd.start.timestamp. After a node reboot, if /tmp/ohasd.start.timestamp has not been created, the OS is stuck on some other Snn script. If you do see /tmp/ohasd.start.timestamp but not “Oracle HA daemon is enabled for autostart” in messages, syslogd is likely not fully up. In both cases, you will need to engage the System Administrator to find the issue at the OS level. For the latter case, the workaround is to modify ohasd to “sleep” for about 2 minutes.

The file system that GRID_HOME resides on is online when init script S96ohasd is executed

Once S96ohasd is executed, a sequence of messages should appear in the OS messages file. If you see the first line but not the last line, the file system containing GRID_HOME was likely not online while S96ohasd was executed.

Oracle Local Registry (OLR, $GRID_HOME/cdata/${HOSTNAME}.olr) is accessible and valid

If the OLR is inaccessible or corrupted, ohasd.log will likely contain messages reporting it. The solution is to restore a good backup of the OLR with “ocrconfig -local -restore”.
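The run levels at which the clusterware starts come from the init.ohasd entry in /etc/inittab, whose fields are id:runlevels:action:process. The following is a minimal sketch of reading such an entry. The function names are hypothetical, and the sample line assumes a run-level field of "35" to match the run levels 3 and 5 described in this section. An empty run-level field is not given any special meaning here.

```python
def parse_inittab_entry(line):
    """Split an inittab line into its id, runlevels, action and process fields."""
    # maxsplit=3 keeps any ':' inside the process command intact
    entry_id, runlevels, action, process = line.strip().split(":", 3)
    return {
        "id": entry_id,
        "runlevels": runlevels,
        "action": action,
        "process": process,
    }


def starts_at(entry, run_level):
    """True if the entry lists the given run level (a single digit or character)."""
    return str(run_level) in entry["runlevels"]


# Assumed sample entry: run-level field "35" means run levels 3 and 5
sample = "h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1"
entry = parse_inittab_entry(sample)
```

With the sample entry, starts_at(entry, 3) and starts_at(entry, 5) are true while starts_at(entry, 1) is false, so a node sitting at run level 1 would not be expected to start init.ohasd.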
