Channel: Symantec Connect

NetBackup backups failing with error 82 for a media server

I need a solution

Hi All,

We have one master server and four media servers. Backups run fine on three of the media servers, but on the fourth all backups fail with error 82.

Activity Monitor logs:

May 31, 2013 12:01:15 PM - requesting resource cvgsolbkpp001-tld-2-cbtsedl2b
May 31, 2013 12:01:15 PM - requesting resource nasolbkp200.NBU_CLIENT.MAXJOBS.cvgrhesosp004-bka
May 31, 2013 12:01:15 PM - requesting resource nasolbkp200.NBU_POLICY.MAXJOBS.fs.all.prd.200
May 31, 2013 12:01:52 PM - granted resource  nasolbkp200.NBU_CLIENT.MAXJOBS.cvgrhesosp004-bka
May 31, 2013 12:01:52 PM - granted resource  nasolbkp200.NBU_POLICY.MAXJOBS.fs.all.prd.200
May 31, 2013 12:01:52 PM - granted resource  V25082
May 31, 2013 12:01:52 PM - granted resource  CBTSEDL2B_DR010
May 31, 2013 12:01:52 PM - granted resource  cvgsolbkpp001-tld-2-cbtsedl2b
May 31, 2013 12:01:52 PM - estimated 72281241 kbytes needed
May 31, 2013 12:01:55 PM - started process bpbrm (pid=20363)
May 31, 2013 12:01:56 PM - connecting
May 31, 2013 12:01:57 PM - connected; connect time: 0:00:00
May 31, 2013 12:01:59 PM - mounting V25082
May 31, 2013 12:02:09 PM - mounted V25082; mount time: 0:00:10
May 31, 2013 12:02:09 PM - positioning V25082 to file 1
May 31, 2013 12:02:09 PM - positioned V25082; position time: 0:00:00
May 31, 2013 12:02:09 PM - begin writing
May 31, 2013 12:02:39 PM - Info bpbrm (pid=20363) from client cvgrhesosp004-bka: TRV - /opt/app is in a different file system from /opt. Skipping.
May 31, 2013 12:10:44 PM - Error bptm (pid=20379) media manager terminated by parent process
May 31, 2013 12:10:46 PM - end writing; write time: 0:08:37
media manager killed by signal  (82)
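For anyone triaging a job log like the one above, the error entries and the final status can be pulled out mechanically. The following is only a rough Python sketch: it assumes the job details are pasted as plain text with timestamps in the format shown, and the names `split_entries`, `find_errors` and `final_status` are made up for this illustration, not part of NetBackup.

```python
import re

# Timestamp format as it appears in the pasted job details (assumption).
TS = r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M"
# One entry = timestamp, " - ", then text up to the next timestamp or end.
ENTRY = re.compile(rf"({TS}) - (.*?)(?=\s*{TS} - |\s*$)", re.DOTALL)


def split_entries(text):
    """Split pasted job details into (timestamp, message) pairs."""
    return [(m.group(1), m.group(2).strip()) for m in ENTRY.finditer(text)]


def find_errors(text):
    """Return only the entries whose message starts with 'Error '."""
    return [(ts, msg) for ts, msg in split_entries(text) if msg.startswith("Error ")]


def final_status(text):
    """Return the trailing '(NN)' status code as an int, or None if absent."""
    m = re.search(r"\((\d+)\)\s*$", text)
    return int(m.group(1)) if m else None


sample = (
    "May 31, 2013 12:10:44 PM - Error bptm (pid=20379) media manager "
    "terminated by parent process May 31, 2013 12:10:46 PM - end writing; "
    "write time: 0:08:37 media manager killed by signal  (82)"
)
print(find_errors(sample))
print(final_status(sample))
```

The splitter works on both the one-entry-per-line form and the run-together form that some consoles produce when the text is copied.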

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

After a deep analysis of the BPTM, BPBRM, BPCD, BPBKAR, and NBSU logs, Symantec advised us to upgrade the kernel on this server. We run the same kernel on all our servers, so why would only this one need the upgrade?

Below is the advice from Symantec. Do we really need to upgrade the kernel to fix this issue?

I was able to download the logs (bptm and bpbrm), and we can clearly see socket errors in the bptm log.

The bptm log shows the socket errors below:

*********************************************

 

main: Setting mud from bp.conf

11:30:07.817 [16777] <2> nbjm_media_request: Passing job control to NBJM, type WRITE/9

11:30:07.817 [16777] <2> nbjm_media_request: old_media_id = , media_id = NULL

11:30:07.818 [16777] <2> RequestInitialResources: starting

11:30:07.818 [16777] <2> RequestInitialResources: started

11:30:07.821 [16777] <2> Orb::init: Created anon service name: NB_16777_-216338603(Orb.cpp:630)

11:30:07.821 [16777] <2> Orb::init: endpointvalue is : pbxiop://1556:NB_16777_-216338603(Orb.cpp:648)

11:30:07.821 [16777] <2> Orb::init: initializing ORB Default_DAEMON_Orb with: Unknown -ORBSvcConfDirective "-ORBDottedDecimalAddresses 0" -ORBSvcConfDirective "static PBXIOP_Factory '-enable_keepalive'" -ORBSvcConfDirective "static EndpointSelectorFactory ''" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory PBXIOP_Factory'" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory IIOP_Factory'" -ORBSvcConfDirective "static PBXIOP_Evaluator_Factory '-orb Default_DAEMON_Orb'" -ORBSvcConfDirective "static Resource_Factory '-ORBConnectionCacheMax 1024 '" -ORBEndpoint pbxiop://1556:NB_16777_-216338603 -ORBSvcConf /dev/null -ORBSvcConfDirective "static Server_Strategy_Factory '-ORBMaxRecvGIOPPayloadSize 268435456'"(Orb.cpp:759)

11:30:07.846 [16777] <32> Orb::activate: Failed to initialize ORB: check to see if PBX is running or if service has permissions to connect to PBX. Check PBX logs for details

11:30:07.859 [16777] <8> Orb::init: CORBA exception: system exception, ID 'IDL:omg.org/CORBA/BAD_PARAM:1.0'

TAO exception, minor code = 5 (endpoint initialization failure in Acceptor Registry; ECONNREFUSED), completed = NO  during orb activation

11:30:07.860 [16777] <16> initializeJmComm: RequestInitialResources : failed to initialize ORB: [BAD_PARAM]. Verify PBX is running and caller has permissions to connect to PBX. See PBX logs for details

11:30:07.860 [16777] <2> RequestInitialResources: retVal = 25    emmStatus = 3000000

11:30:07.860 [16777] <2> RequestInitialResources: returning

11:30:07.860 [16777] <4> nbjm_media_request: Error from RequestMultipleResources, Master nasolbkp200, error 25, resourceAllocated 0

11:30:07.861 [16777] <2> set_job_details: Tfile (1468172): LOG 1370014207 16 bptm 16777 nbjm_media_request() failed: 25, cannot continue with copy 1
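To find the lines that matter in a long bptm log, it can help to filter on the number in angle brackets. This is a small Python sketch under stated assumptions: the line layout (`time [pid] <severity> function: message`) is taken from the excerpt above, treating 16 and higher as "severe" is a choice made for this example, and `parse_line` and `severe_entries` are hypothetical helper names.

```python
import re

# Layout inferred from the excerpt: time, [pid], <severity>, function, message.
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (\S+?): (.*)$")


def parse_line(line):
    """Parse one log line into a dict, or return None for continuation lines."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    time, pid, severity, function, message = m.groups()
    return {
        "time": time,
        "pid": int(pid),
        "severity": int(severity),
        "function": function,
        "message": message,
    }


def severe_entries(lines, threshold=16):
    """Keep parsed entries whose severity is at or above the threshold."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p and p["severity"] >= threshold]


sample = [
    "11:30:07.818 [16777] <2> RequestInitialResources: starting",
    "11:30:07.846 [16777] <32> Orb::activate: Failed to initialize ORB: check to see if PBX is running",
    "11:30:07.859 [16777] <8> Orb::init: CORBA exception: system exception",
    "TAO exception, minor code = 5 (endpoint initialization failure in Acceptor Registry; ECONNREFUSED)",
    "11:30:07.860 [16777] <16> initializeJmComm: RequestInitialResources : failed to initialize ORB: [BAD_PARAM].",
]
for entry in severe_entries(sample):
    print(entry["time"], entry["function"], entry["message"])
```

Lines that do not start with a timestamp, such as the TAO exception continuation, are skipped rather than guessed at.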

 

 

The media server runs Solaris 10 but at a very old OS patch level:

5.10 Generic_142900-03 sun4v sparc SUNW,SPARC-Enterprise-T5220
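If the kernel string has to be collected from several servers for comparison, the patch ID and revision can be extracted from it. A minimal Python sketch, assuming the `Generic_<patch>-<revision>` form shown above; `kernel_patch` is a name invented for this example.

```python
import re


def kernel_patch(uname_text):
    """Return (patch_id, revision) from a 'Generic_NNNNNN-RR' kernel string."""
    m = re.search(r"Generic_(\d+)-(\d+)", uname_text)
    if m is None:
        return None
    return m.group(1), int(m.group(2))


print(kernel_patch("5.10 Generic_142900-03 sun4v sparc SUNW,SPARC-Enterprise-T5220"))
```

Running this against every media server would show whether they really are on the same kernel patch revision.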

 

You will need to update the OS patches because of the socket errors showing up in bptm.

 

 

Because of the socket-related errors noted in bptm for two different master daemon connection failures, do the following:

 

 

1)

Solaris servers need the Sun kernel update from June 2011 or later. Note the following kernel version on the problem media server / client:

 

# uname -a

         142900-03   =  Release Date:    Dec/09/2009

 

 

Oracle / Sun and Symantec identified a minimum Solaris kernel patch level that avoids system socket management issues and problems with NetBackup daemons that use the server sockets.

Symantec recommends that you download the patch set dated June 2011 (or newer) from the Oracle Support website.

https://support.oracle.com

 

The patch set contains the following minimum recommended patches:

- 118777-17 (SunOS 5.10: Sun GigaSwift Ethernet 1.0 driver patch)
- 139555-08 (Kernel patch with C++ library updates)
- 142394-01 (Internet Control Message Protocol (ICMP) patch)
- 143513-02 (Data Link Admin command for Solaris (DLADM) patch)
- 141562-02 (Address Resolution Protocol (ARP) patch)

 

The following patches are recommended for Solaris 10 SPARC with NXGE cards:

- 142909-17 (SunOS 5.10: nxge patch)
- 143897-03 (Distributed Link Software patch)
- 143135-03 (Aggregation patch)
- 119963-21 (Change Request ID - 6815915)
- 139555-08 (Change Request ID - 6723423)
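The two patch lists above can be checked against a server's installed patches. The Python sketch below is only an illustration: it assumes the installed patch IDs have already been collected as `NNNNNN-RR` strings (how they are gathered is left out), and `parse_patch` and `missing_patches` are hypothetical names.

```python
# Minimum revisions copied from the two lists above.
MINIMUM = {"118777": 17, "139555": 8, "142394": 1, "143513": 2, "141562": 2}
NXGE_MINIMUM = {"142909": 17, "143897": 3, "143135": 3, "119963": 21, "139555": 8}


def parse_patch(patch_id):
    """Split 'NNNNNN-RR' into (base id, integer revision)."""
    base, rev = patch_id.split("-")
    return base, int(rev)


def missing_patches(installed_ids, minimum):
    """List minimum patches that are absent or installed at a lower revision."""
    installed = {}
    for patch_id in installed_ids:
        base, rev = parse_patch(patch_id)
        # Keep the highest revision seen for each base patch ID.
        installed[base] = max(rev, installed.get(base, -1))
    return sorted(
        f"{base}-{rev:02d}"
        for base, rev in minimum.items()
        if installed.get(base, -1) < rev
    )


# Example input is invented for illustration, not taken from the server.
example_installed = ["139555-08", "118777-12", "142394-01"]
print(missing_patches(example_installed, MINIMUM))
```

Only revisions of the same base patch ID are compared; IDs with different base numbers are never ranked against each other.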

 

Reference

Solaris nxge driver and NetBackup communication errors

http://www.symantec.com/business/support/index?page=content&id=TECH128953

 

Reference

Sun bug 119963-21 - SunOS 5.10: Shared library patch for C++. This bug describes lock contention in the dtrace area when frequently forking or exiting short-lived processes that use the C++ runtime library libCrun.so. Symptoms are a high kernel load and delays in the fork() and exit() system calls.

 

NBU support has now started to document this problem in the latest release information. Reference: NetBackup 7.5 Release Notes, page 65, Solaris Patches

http://www.symantec.com/docs/DOC5041

 

 

 

 

2)

Reduce  Media Server socket usage

 

Move the NBU internal VNETD socket connections on the servers to the server loopback interface instead of using the VNETD daemon. Add the following line to /usr/openv/netbackup/bp.conf:

CONNECT_OPTIONS = localhost 1 0 2

 

No restart needed
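The bp.conf change in step 2 can be applied without creating duplicates if it is run more than once. This is a hedged Python sketch that works on the file contents as a string only; reading and writing /usr/openv/netbackup/bp.conf (and taking a backup first) is left to the reader, and `ensure_connect_options` is a name made up for this example.

```python
def ensure_connect_options(conf_text, entry="CONNECT_OPTIONS = localhost 1 0 2"):
    """Return conf_text with the entry appended, unless it is already present."""
    wanted = entry.split()
    for line in conf_text.splitlines():
        # Compare token by token so spacing differences do not matter.
        if line.split() == wanted:
            return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + entry + "\n"


# Example contents are illustrative; a real bp.conf has more entries.
before = "SERVER = nasolbkp200\nCLIENT_NAME = cvgsolbkpp001"
after = ensure_connect_options(before)
print(after)
```

Calling the function a second time on its own output returns the text unchanged.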

