| Oracle® Database Backup and Recovery Advanced User's Guide 10g Release 1 (10.1) Part Number B10734-01 |
This chapter describes how to troubleshoot Recovery Manager.
Recovery Manager provides detailed error messages that can aid in troubleshooting problems. Also, the Oracle database server and third-party media vendors generate useful debugging output of their own. The discussion that follows explains how to identify and interpret the different errors you may encounter.
Output that is useful for troubleshooting failed or hung RMAN jobs is located in several different places, as explained in the following table.
| Type of Output | Produced By | Location | Description |
|---|---|---|---|
| RMAN messages | RMAN | Completed job information is in When running RMAN from the command line, you can direct output to the following places: | |
| | Oracle database server | The directory named in the | Contains a chronological log of errors, initialization parameter settings, and administration operations. Records values for overwritten control file records (refer to Oracle Data Guard Concepts and Administration). |
| Oracle trace file | Oracle database server | The directory specified in the | Contains detailed output generated by Oracle server processes. This file is created when an |
| | Third-party media management software | The directory specified in the | |
| Media manager log file | Third-party media management software | The filenames for any media manager logs other than | Contains information on the functioning of the media management device. |

| See Also: Oracle Database Error Messages for explanations of |
Table 15-1 indicates the error ranges for common RMAN error messages, all of which are described in Oracle Database Error Messages.
In the event of a media manager error, ORA-19511 is signalled, and the media manager is expected to provide RMAN a descriptive error. RMAN displays the error passed back to it by the media manager. For example, you might see this:

    ORA-19511: Error received from media manager layer, error text: sbtpvt_open_input: file .* does not exist or cannot be accessed, errno = 2

The message from the media manager should provide you with enough information to let you fix the root problem. If it does not, you should refer to the documentation for your media manager or contact your media management vendor support representative for further information. ORA-19511 errors originate with the media manager, not the Oracle database. The database merely passes the message on from the media manager. The cause can only be addressed by the media management vendor.
Note that if you are still using an SBT 1.1-compliant media management layer, you may see some additional error message text. Output from an SBT 1.1-compliant media management layer is similar to the following:

    ORA-19507: failed to retrieve sequential file, handle="c-140148591-20031014-06", parms=""
    ORA-27007: failed to open file
    Additional information: 7000
    Additional information: 2
    ORA-19511: Error received from media manager layer, error text: SBT error = 7000, errno = 0, sbtopen: backup file not found

The "Additional information" provided uses error codes specific to SBT 1.1. The values displayed correspond to the media manager message numbers and error text listed in Table 15-2. RMAN re-signals the error as an ORA-19511 Error received from media manager layer error, and then displays a general error message that relates to the error code returned from the media manager and includes the SBT 1.1 error number.
The SBT 1.1 error messages are listed here for your reference. Table 15-2 lists media manager message numbers and their corresponding error text. In the error codes, O/S stands for operating system. The errors marked with an asterisk are internal and should not typically be seen during normal operation.
| Cause | No. | Message |
|---|---|---|
| sbtopen | 7003, 7012* | Backup file not found (only returned for read). File exists (only returned for write). Device found, but busy; try again later. Can't connect with Media Manager. O/S error, for example malloc, fork error. Invalid argument(s) to sbtopen. |
| sbtclose | 7024*, 7025 | Invalid file handle or file not open. I/O error. Can't connect with Media Manager. |
| sbtwrite | 7044* | Invalid file handle or file not open. I/O error. Invalid argument(s) to sbtwrite. |
| sbtread | 7065* | Invalid file handle or file not open. Invalid argument(s) to sbtread. |
| sbtremove | 7086* | Can't connect with Media Manager. Invalid argument(s) to sbtremove. |
| sbtinfo | 7094, 7095* | Can't connect with Media Manager. Invalid argument(s) to sbtinfo. |
| sbtinit | 7111 | Invalid argument(s) to sbtinit. O/S error. |
Sometimes you may find it difficult to identify the useful messages in the RMAN error stack. Note the following tips and suggestions:

- When the stack contains "Additional information:" numeric error codes, look for the ORA-19511 message that follows for the text of error messages passed back to RMAN by the media manager. These should identify the real failure in the media management layer.
- Look for the RMAN-03002 or RMAN-03009 message (RMAN-03009 is the same as RMAN-03002 but includes the channel ID), immediately following the error banner. These messages indicate which command failed. Syntax errors generate RMAN-00558.

You attempt a backup of tablespace users and receive the following message:

    Starting backup at 29-AUG-02
    using channel ORA_DISK_1
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of backup command at 08/29/2002 15:14:03
    RMAN-20202: tablespace not found in the recovery catalog
    RMAN-06019: could not translate tablespace name "USESR"

The RMAN-03002 error indicates that the BACKUP command failed. You read the last two messages in the stack first and immediately see the problem: no tablespace usesr appears in the recovery catalog because you mistyped the name.
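The bottom-up reading order used in this example can be sketched as a small shell filter. This is an illustration only, not an RMAN feature: the function name `rman_stack_bottom_up` is hypothetical, and the sketch assumes that the log keeps one message per line and that each error stack is introduced by the `RMAN-00569` banner line.

```shell
# Hypothetical helper: print the RMAN-/ORA- messages of the last error
# stack in a log, most specific message first (bottom-up reading order).
# Reads the files given as arguments, or standard input if none are given.
# Lines without an RMAN- or ORA- prefix are ignored by this sketch.
rman_stack_bottom_up() {
    awk '
        /^RMAN-00569:/ { n = 0; in_stack = 1; next }   # banner starts a new stack
        /^RMAN-00571:/ { next }                         # skip the ===== ruler lines
        in_stack && /^(RMAN|ORA)-[0-9]+:/ { msg[++n] = $0 }
        END { for (i = n; i >= 1; i--) print msg[i] }
    ' "$@"
}
```

For the stack shown above, the first line printed would be the RMAN-06019 message and the last would be the RMAN-03002 message that names the failed command.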
Assume that you attempt to recover a tablespace and receive the following errors:

    RMAN> RECOVER TABLESPACE users;
    Starting recover at 29-AUG-01
    using channel ORA_DISK_1
    starting media recovery
    media recovery failed
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of recover command at 08/29/2001 15:18:43
    RMAN-11003: failure during parse/execution of SQL statement: alter database recover if needed tablespace USERS
    ORA-00283: recovery session canceled due to errors
    ORA-01124: cannot recover data file 8 - file is in use or recovery
    ORA-01110: data file 8: '/oracle/oradata/trgt/users01.dbf'

As suggested, you start reading from the bottom up. The ORA-01110 message explains there was a problem with the recovery of datafile users01.dbf. The second error indicates that the database cannot recover the datafile because it is in use or already being recovered. The remaining RMAN errors indicate that the recovery session was cancelled due to the server errors. Hence, you conclude that because you were not already recovering this datafile, the problem must be that the datafile is online and you need to take it offline and restore a backup.
Assume that you use a tape drive and receive the following output during a backup job:

    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    ORA-19624: operation failed, retry possible
    ORA-19507: failed to retrieve sequential file, handle="/tmp/foo", parms=""
    ORA-27029: skgfrtrv: sbtrestore returned error
    ORA-19511: Error received from media manager layer, error text: sbtpvt_open_input:file /tmp/foo does not exist or cannot be accessed, errno=2

The error text displayed following the ORA-19511 error is generated by the media manager and describes the real source of the failure. Refer to the media manager documentation to interpret this error.
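When a log contains many stacks, it can help to pull out only the text that the media manager supplied. The following is a minimal sketch under stated assumptions: `mm_error_text` is a hypothetical name, and the sketch assumes that the vendor text either follows `error text:` on the same line or continues on indented lines directly below the ORA-19511 line.

```shell
# Hypothetical helper: print only the media manager text attached to
# ORA-19511 messages. Reads files given as arguments, or standard input.
mm_error_text() {
    awk '
        /^ORA-19511:/ {
            grab = 1
            sub(/^.*error text:[ \t]*/, "")    # keep only what follows the label
            if ($0 != "") print
            next
        }
        grab && /^[ \t]+/ {                    # indented continuation line (assumed layout)
            sub(/^[ \t]+/, "")
            print
            next
        }
        { grab = 0 }                           # any other line ends the message
    ' "$@"
}
```

The output is still vendor-specific text, so it is a starting point for the media manager documentation rather than a diagnosis.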
Assume that you use a tape drive and receive the following output during a backup job:

    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03009: failure of backup command on c1 channel at 09/04/2001 13:18:19
    ORA-19506: failed to create sequential file, name="07d36ecp_1_1", parms=""
    ORA-27007: failed to open file
    SVR4 Error: 2: No such file or directory
    Additional information: 7005
    Additional information: 1
    ORA-19511: Error received from media manager layer, error text: SBT error = 7005, errno = 2, sbtopen: system error

The main information of interest returned by SBT 1.1 media managers is the error code in the "Additional information" line:

    Additional information: 7005

Referring to Table 15-2, "Media Manager Error Message Ranges", you discover that error 7005 means that the media management device is busy. So, the media management software is not able to write to the device because it is in use or there is a problem with it.
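The numeric codes can be collected from a log with a one-line filter. This is a sketch only: `sbt_additional_info` is a hypothetical name, and it assumes that each code appears on its own `Additional information:` line, as in the output above.

```shell
# Hypothetical helper: print the numbers from "Additional information:" lines.
# In the SBT 1.1 examples above, the first number of each pair is the code
# to look up in Table 15-2.
sbt_additional_info() {
    sed -n 's/^Additional information:[ ]*\([0-9][0-9]*\)[ ]*$/\1/p' "$@"
}
```

For example, `sbt_additional_info backup.log | head -n 1` would print the first code found, where `backup.log` is a placeholder file name.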
One way to determine whether RMAN encountered an error is to examine its return code or exit status. The RMAN client returns 0 to the shell from which it was invoked if no errors occurred, and a nonzero error value otherwise.
How you access this return code depends upon the environment from which you invoked the RMAN client. For example, if you are
running UNIX with the C shell, then, when RMAN completes, the return code is placed in a shell variable called $status.
The method of returning exit status is a detail specific to the host operating system rather than the RMAN client.
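In Bourne-type shells (sh, ksh, bash) the exit status of the last command is available in `$?` rather than `$status`. The following sketch wraps any command and reports on its exit status; the function name `run_and_report` is hypothetical and nothing in it is specific to RMAN.

```shell
# Hypothetical wrapper: run the command given as arguments, print a
# one-line verdict based on its exit status, and return that status.
run_and_report() {
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "job succeeded"
    else
        echo "job failed with exit status $rc"
    fi
    return "$rc"
}
```

A scheduled job might call it as `run_and_report rman TARGET / CMDFILE backup.rman LOG backup.log`, where the two file names are placeholders for your own command file and log file.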
On some platforms, Oracle provides a diagnostic tool called sbttest. This utility performs a simple test of the media management software by attempting to communicate with the media manager as the Oracle database server would.
On UNIX, the sbttest utility is typically located in $ORACLE_HOME/bin. If for some reason the utility is not included with your platform, then contact Oracle Support to obtain the C version of the program. You can compile this version of the program on all UNIX platforms.
Note that on platforms such as Solaris, you do not have to relink when using sbttest. On other platforms, relinking may be necessary.
For online documentation of sbttest, issue the following on the command line:
% sbttest
The program displays the list of possible arguments for the program:

    Error: backup file name must be specified
    Usage: sbttest backup_file_name # this is the only required parameter
           <-dbname database_name>
           <-trace trace_file_name>
           <-remove_before>
           <-no_remove_after>
           <-read_only>
           <-no_regular_backup_restore>
           <-no_proxy_backup>
           <-no_proxy_restore>
           <-file_type n>
           <-copy_number n>
           <-media_pool n>
           <-os_res_size n>
           <-pl_res_size n>
           <-block_size block_size>
           <-block_count block_count>
           <-proxy_file os_file_name bk_file_name [os_res_size pl_res_size block_size block_count]>
           <-libname sbt_library_name>

The display also indicates the meaning of each argument. For example, following is the description for two optional parameters:

    Optional parameters:
      -dbname  specifies the database name which will be used by SBT
               to identify the backup file. The default is "sbtdb"
      -trace   specifies the name of a file where the Media Management
               software will write diagnostic messages.

Use sbttest to perform a quick test of the media manager. The following table explains how to interpret the output.
To use sbttest:

1. Confirm that the program is installed by entering sbttest at the command line:

       % sbttest

   If the program is operational, then you should see a display of the online documentation.

2. Run the program with the arguments that you need. For example, the following command tests with the file some_file.f and writes the output to sbtio.log:

       % sbttest some_file.f -trace sbtio.log

   You can also test a backup of an existing datafile. For example, this command tests datafile tbs_33.f of database prod:

       % sbttest tbs_33.f -dbname prod

3. Check the program output. If the media management library cannot be loaded, then the output includes a message similar to the following:

       libobk.so could not be loaded. Check that it is installed properly, and that
       LD_LIBRARY_PATH environment variable (or its equivalent on your platform)
       includes the directory where this file can be found. Here is some additional
       information on the cause of this error:
       ld.so.1: sbttest: fatal: libobk.so: open failed: No such file or directory

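The check that the message asks for can be sketched as a shell function that walks a colon-separated search path. The function name `find_in_path` is hypothetical, and the sketch looks only at the path you pass in; the dynamic linker may also search default directories that are not listed in LD_LIBRARY_PATH.

```shell
# Hypothetical helper: print the first location of a library file within a
# colon-separated list of directories; return 1 if it is not found.
find_in_path() {
    lib=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -f "$dir/$lib" ]; then
            IFS=$old_ifs
            echo "$dir/$lib"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For example, `find_in_path libobk.so "$LD_LIBRARY_PATH" || echo "libobk.so is not on LD_LIBRARY_PATH"` reports whether any listed directory contains the library.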
Note that in some cases sbttest can work but an RMAN backup does not. The reasons can be the following:

- The user who runs sbttest is not the owner of the Oracle processes.
- sbttest may still work.
- The sbttest program passes all environment parameters from the shell but RMAN does not.

There are several ways to terminate an RMAN command in the middle of execution:

- Kill the server session that corresponds to the RMAN channel with the SQL ALTER SYSTEM KILL SESSION statement.

You can identify the Oracle session ID for an RMAN channel by looking in the RMAN log for messages with the format shown in the following example:
channel ch1: sid=15 devtype=SBT_TAPE
The sid and devtype are displayed for each allocated channel. Note that the Oracle sid is different from the operating system process ID. You can kill the session using a SQL ALTER SYSTEM KILL SESSION statement.
ALTER SYSTEM KILL SESSION takes two arguments, the sid printed in the RMAN message and a serial number, both of which can be obtained by querying V$SESSION. For example, run the following statement, where sid_in_rman_output is the number from the RMAN message:

    SELECT SERIAL# FROM V$SESSION WHERE SID=sid_in_rman_output;

Then, run the following statement, substituting the sid_in_rman_output and serial number obtained from the query:

    ALTER SYSTEM KILL SESSION 'sid_in_rman_output,serial#';

Note that this will not unhang the session if the session is hung in media manager code.
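Collecting the sid values from a long RMAN log can be scripted. The sketch below is an illustration with hypothetical function names (`channel_sids`, `serial_queries`); it assumes the channel lines have exactly the `channel name: sid=N devtype=...` layout shown earlier, and it only prints the lookup statements rather than running them.

```shell
# Hypothetical helper: print "channel sid" pairs found in an RMAN log.
channel_sids() {
    sed -n 's/^channel \([^:]*\): sid=\([0-9][0-9]*\) .*$/\1 \2/p' "$@"
}

# Hypothetical helper: print the V$SESSION lookup for each sid found,
# ready to paste into SQL*Plus.
serial_queries() {
    channel_sids "$@" | while read -r chan sid; do
        printf 'SELECT SERIAL# FROM V$SESSION WHERE SID=%s;  -- channel %s\n' "$sid" "$chan"
    done
}
```

Printing the statements instead of executing them keeps the decision to kill a session with the administrator.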
Finding and killing the processes that are associated with the server sessions is operating system specific. On some platforms the server sessions are not associated with any processes at all. Refer to your operating system specific documentation for more information.
You may sometimes need to kill an RMAN job that is hung in the media manager. The best way to terminate RMAN when the channel connections are hung in the media manager is to kill the session in the media manager. If this action does not solve the problem, then on some platforms, such as UNIX, you may be able to kill the Oracle processes of the connections. (Note that killing the Oracle processes may cause problems from the media manager. See your media manager documentation for details.)
The nature of an RMAN session depends on the operating system. In UNIX, an RMAN session has the following processes associated with it:

- The auxiliary connection, during DUPLICATE or TSPITR operations
- The polling connections associated with the ALLOCATE CHANNEL or CONFIGURE CHANNEL commands. One polling connection exists for each distinct connect string used in the ALLOCATE CHANNEL or CONFIGURE CHANNEL command.

RMAN usually hangs because one of the channel connections is waiting in the media manager code for a tape resource. The catalog connection and the default channel appear to hang, because they are waiting for RMAN to tell them what to do. Polling connections seem to be in an infinite loop while polling the RPC under the control of the RMAN process.
If you kill the RMAN process itself, then you also kill the catalog connection, the auxiliary connection, the default channel, and the polling connections. If target and auxiliary connections are not hung in the media manager code, they also terminate. If either the target connection or any of the auxiliary connections are executing in the media management layer, they will not terminate until the processes are manually killed at the operating system level.
Not all media managers can detect the termination of the Oracle process. Those which cannot may keep resources busy or continue processing. Consult your media manager documentation for details.
Terminating the catalog connection does not cause the RMAN process to terminate because RMAN is not performing catalog operations while the backup or restore is in progress. Removing default channel and polling connections causes the RMAN process to detect that one of the channels has died and then proceed to exit. In this case, the connections to the hung channels remain active as described previously.
Once the hung channels in the media manager code are killed, the RMAN process detects this termination and proceeds to exit, removing all connections except target connections that are still operative in the media management layer. The caveat about the media manager resources still applies in this case.
To terminate an Oracle process that is hung in the media manager:

1. Query V$SESSION and V$SESSION_WAIT as described in "Monitoring RMAN Through V$ Views". For example, execute the following query:

       COLUMN EVENT FORMAT a10
       COLUMN SECONDS_IN_WAIT FORMAT 999
       COLUMN STATE FORMAT a20
       COLUMN CLIENT_INFO FORMAT a30
       SELECT p.SPID, sw.EVENT, sw.SECONDS_IN_WAIT AS SEC_WAIT, sw.STATE, s.CLIENT_INFO
       FROM V$SESSION_WAIT sw, V$SESSION s, V$PROCESS p
       WHERE sw.EVENT LIKE 'sbt%'
       AND s.SID=sw.SID
       AND s.PADDR=p.ADDR
       ;

   Examine the SQL output to determine which sbt functions are waiting. For example, the output may be as follows:

       SPID EVENT      SEC_WAIT   STATE                CLIENT_INFO
       ---- ---------- ---------- -------------------- ------------------------------
       8642 sbtwrite2         600 WAITING              rman channel=ORA_SBT_TAPE_1
       8374 sbtwrite2         600 WAITING              rman channel=ORA_SBT_TAPE_2

2. Kill the hung processes at the operating system level. For example, on UNIX use the kill -9 command:

       % kill -9 8642 8374

   On Windows, there is a command-line utility called ORAKILL which lets you kill a specific thread in this situation. From a command prompt, run the following command:

       orakill sid thread_id

   where sid identifies the database instance to target, and the thread_id is the SPID value from the query in step 1.

| See Also: Your operating system specific documentation for the relevant commands |

| See Also: Oracle Database Recovery Manager Reference for descriptions of the legal |
In this scenario, an RMAN backup job starts as normal and then pauses inexplicably:

    Recovery Manager: Release 10.1.0.2.0 - Production
    Copyright (c) 1995, 2003, Oracle. All rights reserved.
    connected to target database: TRGT
    connected to recovery catalog database
    RMAN> BACKUP TABLESPACE SYSTEM, tools;
    allocated channel: t1
    channel t1: sid=16 devtype=SBT_TAPE
    channel t1: starting datafile backupset
    set_count=15 set_stamp=338309600
    channel t1: including datafile 2 in backupset
    channel t1: including datafile 1 in backupset
    channel t1: including current controlfile in backupset
    # Hanging here for 30 minutes now

If a backup job is hanging, that is, not proceeding, then several scenarios are possible:
Query sbt wait events to gain more information. For example, run the following query on the target instance:

    COLUMN EVENT FORMAT a10
    COLUMN SECONDS_IN_WAIT FORMAT 999
    COLUMN STATE FORMAT a20
    COLUMN CLIENT_INFO FORMAT a30
    SELECT p.SPID, sw.EVENT, sw.SECONDS_IN_WAIT AS SEC_WAIT, sw.STATE, s.CLIENT_INFO
    FROM V$SESSION_WAIT sw, V$SESSION s, V$PROCESS p
    WHERE sw.EVENT LIKE 'sbt%'
    AND s.SID=sw.SID
    AND s.PADDR=p.ADDR
    ;

Examine the SQL output to determine which sbt functions are waiting. For example, the output may be as follows:

    SPID EVENT      SEC_WAIT   STATE                CLIENT_INFO
    ---- ---------- ---------- -------------------- ------------------------------
    8642 sbtbackup        1500 WAITING              rman channel=ORA_SBT_TAPE_1

Because the causes of a hung backup job can be varied, so are the solutions. For example, backup jobs often hang simply because the tape device has completely filled the current cassette and is waiting for a new tape to be inserted. Ideally, the query of the sbt wait events should indicate the problem.
In this example, a single sbtbackup has taken 1500 seconds, so RMAN is waiting on the media manager to finish its write operation. Check that the media manager is functioning normally, and contact the media management vendor's technical support for assistance.
If the sbt wait event query is unhelpful, then examine media manager process, log, and trace files for signs of abnormal termination or other errors (refer to the description of message files in "Identifying Types of Message Output").

| See Also: "Terminating an RMAN Session: Basic Steps" to learn how to kill an RMAN session that is hanging |
In this scenario, you run a backup job and receive message output similar to the following:

    channel c8: including datafile number 47 in backupset
    RPC call appears to have failed to start on channel c9
    RPC call ok on channel c9
    channel c3: including datafile number 18 in backupset

The "RPC call appears to have failed" message does not usually indicate a problem. The message indicates one of the following:
Timing problems occur in this way. When RMAN begins an RPC, it checks the V$SESSION performance view. The RPC updates the information in the view to indicate when it starts and finishes. Sometimes RMAN checks V$SESSION before the RPC has indicated it has started, which in turn generates the following message:

    RPC call appears to have failed

If a message stating "RPC call ok" does not appear in the output immediately following the message stating "RPC call appears to have failed", then the backup job encountered an internal problem. Contact Oracle Support for further assistance.
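The rule in the last paragraph can be checked mechanically on a saved log. The following sketch uses a hypothetical function name, `unconfirmed_rpc_failures`, and assumes that the two messages appear exactly as in the sample output above, one per line, with the channel name as the last word and no trailing whitespace.

```shell
# Hypothetical helper: print the name of each channel whose
# "RPC call appears to have failed" notice is not immediately followed by
# "RPC call ok" for the same channel. No output means nothing to report.
unconfirmed_rpc_failures() {
    awk '
        pending != "" {
            # The line right after a failure notice must confirm the same channel.
            if ($0 != ("RPC call ok on channel " pending)) print pending
            pending = ""
        }
        /^RPC call appears to have failed to start on channel / { pending = $NF }
        END { if (pending != "") print pending }
    ' "$@"
}
```

A channel name in the output is only a prompt to look at the log around that message; it does not by itself establish an internal problem.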
In this scenario, you attempt a backup and receive the following error messages:

    RMAN-3014: Implicit resync of recovery catalog failed
    RMAN-6038: Recovery catalog package detected an error
    RMAN-20035: Invalid high RECID error

In one common scenario, you restore a backup control file created through a non-Oracle mechanism, and then open the database without the RESETLOGS option. If you had created the backup control file through the RMAN BACKUP command or the SQL ALTER DATABASE BACKUP CONTROLFILE statement, then the database would have required you to reset the online logs.
The control file and the recovery catalog are now not synchronized. The database control file is older than the recovery catalog, because at one time the recovery catalog resynchronized with the old current control file, and now the database is using a backup control file. RMAN detects that the control file currently in use is older than the control file previously used to resynchronize.
Another common scenario occurs when you attempt to copy the target database to a new machine as follows:

- You use the CATALOG command to add this control file copy to the repository.
- The recovery catalog indicates that the highest RECID is 100, but the control file indicates that the highest RECID is 90. The control file RECID should always be greater than or equal to the recovery catalog RECID, so RMAN issues RMAN-20035.

This solution is safest and is strongly recommended. It preserves the control file, so that the historical information about the database stored in the control file continues to be available after the procedure.
To reset the database with RMAN:
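
1. Connect to the target database with SQL*Plus. For example, enter:

       % sqlplus '/ AS SYSDBA'

2. Mount the database. For example, enter:

       ALTER DATABASE MOUNT;

3. Start and then cancel recovery. The USING BACKUP CONTROLFILE clause stamps the control file as a backup, which then permits OPEN RESETLOGS. For example, enter:

       ALTER DATABASE RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE;
       ALTER DATABASE RECOVER CANCEL;

4. Connect RMAN to the target database and the recovery catalog. For example, enter:

       % rman TARGET SYS/oracle@trgt CATALOG rman/cat@catdb

5. Open the database with the RESETLOGS option. For example, enter:

       RMAN> ALTER DATABASE OPEN RESETLOGS;

6. Take a new backup of the database. For example, enter:

       BACKUP DATABASE PLUS ARCHIVELOG;
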
This solution is similar to the previous one, but does require that you re-create your control file. It is better suited for the case in which you are copying your database to a second system, where you may not want to keep the history from the control file for the copy of the database on the second system, or where you might drop a few datafiles or change the online logs by editing your control file.
To create the control file with SQL*Plus:

1. Connect to the target database with SQL*Plus. For example, enter:

       % sqlplus 'SYS/oracle@trgt AS SYSDBA'

2. Mount the database. For example, enter:

       SQL> ALTER DATABASE MOUNT;

3. Back up the control file to a trace file. For example, enter:

       SQL> ALTER DATABASE BACKUP CONTROLFILE TO TRACE;

4. Edit the trace file as necessary. The relevant section of the trace file looks something like the following:

       # The following commands will create a new control file and use it
       # to open the database.
       # Data used by the recovery manager will be lost. Additional logs may
       # be required for media recovery of offline data files. Use this
       # only if the current version of all online logs are available.
       STARTUP NOMOUNT
       CREATE CONTROLFILE REUSE DATABASE "TRGT" NORESETLOGS ARCHIVELOG
       -- STANDBY DATABASE CLUSTER CONSISTENT AND UNPROTECTED
           MAXLOGFILES 32
           MAXLOGMEMBERS 2
           MAXDATAFILES 32
           MAXINSTANCES 1
           MAXLOGHISTORY 226
       LOGFILE
         GROUP 1 '/oracle/oradata/trgt/redo01.log' SIZE 25M,
         GROUP 2 '/oracle/oradata/trgt/redo02.log' SIZE 25M,
         GROUP 3 '/oracle/oradata/trgt/redo03.log' SIZE 500K
       -- STANDBY LOGFILE
       DATAFILE
         '/oracle/oradata/trgt/system01.dbf',
         '/oracle/oradata/trgt/undotbs01.dbf',
         '/oracle/oradata/trgt/cwmlite01.dbf',
         '/oracle/oradata/trgt/drsys01.dbf',
         '/oracle/oradata/trgt/example01.dbf',
         '/oracle/oradata/trgt/indx01.dbf',
         '/oracle/oradata/trgt/tools01.dbf',
         '/oracle/oradata/trgt/users01.dbf'
       CHARACTER SET WE8DEC
       ;
       # Take files offline to match current control file.
       ALTER DATABASE DATAFILE '/oracle/oradata/trgt/tools01.dbf' OFFLINE;
       ALTER DATABASE DATAFILE '/oracle/oradata/trgt/users01.dbf' OFFLINE;
       # Configure RMAN configuration record 1
       VARIABLE RECNO NUMBER;
       EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('CHANNEL','DEVICE TYPE DISK DEBUG 255');
       # Recovery is required if any of the datafiles are restored backups,
       # or if the last shutdown was not normal or immediate.
       RECOVER DATABASE
       # All logs need archiving and a log switch is needed.
       ALTER SYSTEM ARCHIVE LOG ALL;
       # Database can now be opened normally.
       ALTER DATABASE OPEN;
       # Commands to add tempfiles to temporary tablespaces.
       # Online tempfiles have complete space information.
       # Other tempfiles may require adjustment.
       ALTER TABLESPACE TEMP ADD TEMPFILE '/oracle/oradata/trgt/temp01.dbf' REUSE;
       # End of tempfile additions.

5. Shut down the database:

       SHUTDOWN IMMEDIATE

6. Run the edited script to create the control file, recover if necessary, archive the logs, and open the database:

       STARTUP NOMOUNT
       CREATE CONTROLFILE ...;
       EXECUTE ...;
       RECOVER DATABASE
       ALTER SYSTEM ARCHIVE LOG CURRENT;
       ALTER DATABASE OPEN ...;

In this scenario, a backup job fails because RMAN cannot make a snapshot control file. The message stack is as follows:

    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of backup command at 08/30/2001 22:48:44
    ORA-00230: operation disallowed: snapshot controlfile enqueue unavailable

When RMAN needs to back up or resynchronize from the control file, it first creates a snapshot or consistent image of the control file. If one RMAN job is already backing up the control file while another needs to create a new snapshot control file, then you may see the following message:

    waiting for snapshot controlfile enqueue

Under normal circumstances, a job that must wait for the control file enqueue waits for a brief interval and then successfully obtains the enqueue. RMAN makes up to five attempts to get the enqueue and then fails the job. The conflict is usually caused when two jobs are both backing up the control file, and the job that first starts backing up the control file waits for service from the media manager.
To determine which job is holding the conflicting enqueue:

1. After you see the message "waiting for snapshot controlfile enqueue", start a new SQL*Plus session on the target database:

       % sqlplus 'SYS/oracle@trgt AS SYSDBA'

2. Run the following query to find the session that holds the enqueue:

       SELECT s.SID, USERNAME AS "User", PROGRAM, MODULE, ACTION, LOGON_TIME "Logon", l.*
       FROM V$SESSION s, V$ENQUEUE_LOCK l
       WHERE l.SID = s.SID
       AND l.TYPE = 'CF'
       AND l.ID1 = 0
       AND l.ID2 = 2;

   You should see output similar to the following (the output in this example has been truncated):

       SID User Program              Module              Action           Logon
       --- ---- -------------------- ------------------- ---------------- ---------
       9   SYS  rman@h13 (TNS V1-V3) backup full datafile: c10000210 STARTED 21-JUN-01

Commonly, enqueue situations occur when a job is writing to a tape drive, but the tape drive is waiting for new tape to be inserted. If you start a new job in this situation, then you will probably receive the enqueue message because the first job cannot complete until the new tape is loaded.
After you have determined which job is creating the enqueue, you can do one of the following:
In this scenario, the database archives automatically to two directories: ORACLE_HOME/oradata/trgt/arch and ORACLE_HOME/oradata/trgt/arch2. You tell RMAN to perform a backup and delete the input archived redo logs afterward in the following script:

    BACKUP ARCHIVELOG ALL DELETE INPUT;

You then run a crosscheck to make sure the logs are gone and find the following:

    CROSSCHECK ARCHIVELOG ALL;
    validation succeeded for archived log
    archivelog filename=/oracle/oradata/trgt/arch2/archive1_964.arc recid=19 stamp=368726072

RMAN deleted one set of logs but not the other.
This problem is not an error. When you specify DELETE INPUT without the ALL keyword, RMAN deletes only one copy of each input log. Even if you archive to five destinations, RMAN deletes logs from only one directory.
To force RMAN to delete all existing archived redo logs, use the DELETE ALL INPUT clause of the BACKUP command. For example, enter:
BACKUP ARCHIVELOG ALL DELETE ALL INPUT;
In this scenario, you schedule regular backups of the archived redo logs. The next time you make a backup, you receive this error:

    RMAN-6089: archive log NAME not found or out of sync with catalog

This problem occurs when the archived log that RMAN is looking for cannot be accessed by RMAN, or the recovery catalog needs to be resynchronized. Often, this error occurs when you delete archived logs with an operating system command, which means that RMAN is unaware of the deletion. The RMAN-6089 error occurs because RMAN attempts to back up a log that the repository indicates still exists.
Make sure that the archived logs exist in the specified directory and that the RMAN catalog is synchronized. Check the following:

- Make sure that the archived log file specified by the RMAN-6089 error exists in the correct directory.
- Check that the operating system permissions are correct for the archived log (owner = oracle, group = DBA) to make sure that RMAN can access the file.
- Resynchronize the catalog by running the following command at the RMAN prompt:

      RESYNC CATALOG;

If you know that the logs are unavailable because you deleted them by using an operating system utility, then run the following command at the RMAN prompt to update RMAN metadata:

    CROSSCHECK ARCHIVELOG ALL;

It is always better to use RMAN to delete logs than to use an operating system utility. The easiest method to remove unwanted logs is to specify the DELETE INPUT option when backing up archived logs. For example, enter:
BACKUP DEVICE TYPE sbt ARCHIVELOG ALL DELETE ALL INPUT;
In this scenario, you are connected to the target database while it is not open and attempting to perform an RMAN operation. You receive the following error:

    PLS-00553: character set name is not recognized

#### RMAN Does Not Recognize Character Set Name: Diagnosis

Typically, this message means that the character set in the client environment, that is, the environment in which you are running the RMAN client, is different from the character set in the target database environment.

1. Query the target database to determine the value of the NLS_CHARACTERSET parameter. For example, run this query:

       SQL> SELECT VALUE FROM V$NLS_PARAMETERS WHERE PARAMETER='NLS_CHARACTERSET';

2. Set the client environment to match. For example, set the NLS_LANG environment variable on a UNIX system as follows:

       % setenv NLS_LANG american_america.we8dec
       % setenv NLS_DATE_FORMAT "MON DD YYYY HH24:MI:SS"

If the connection is made through a listener, then the listener must be started with the correct Globalization Support settings. Otherwise, the spawned connections inherit the incorrect Globalization Support settings from the listener.
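The setenv commands above are C shell syntax. In Bourne-type shells (sh, ksh, bash) the same settings are made with export. The sketch below reuses the example values from this section; substitute the character set that your own target database reports.

```shell
# Bourne-shell equivalent of the C shell setenv commands shown above.
# The values are the example values from the text, not recommendations.
NLS_LANG=american_america.we8dec
NLS_DATE_FORMAT="MON DD YYYY HH24:MI:SS"
export NLS_LANG NLS_DATE_FORMAT
```

The variables must be set in the shell that starts the RMAN client (or the listener), because child processes inherit the environment only at startup.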
RMAN fails with ORA-01031 (insufficient pr ivileges) or ORA-01017 (invalid username/password) errors when trying to connect to the target database:
% rman Recovery Manager: Release 10.1.0.2.0 - Production Copyright (c) 1995, 2003, Oracle. All rights reserved. RMAN> CONNECT TARGET sys /mypass@inst1 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: ======= ==================================================== ORA-01031: insufficient privileges
RMAN automatically requests a connection to the target database as SYSDBA. In order to connect to the target as SYSDBA, you must do one of the following:
Be part of the operating system DBA group with respect to the target database (that is, have the ability to connect with SYSDBA privileges to the target database without a password).
Use a password file, which requires the orapwd command and the initialization parameter REMOTE_LOGIN_PASSWORDFILE.
If the target database does not have a password file, then the user you are logged in as must be validated with operating system authentication.
Either create a password file for the target database or add yourself to the administrator list in the operating system.
| See Also:
Oracle Database Administrator's Guide to learn how to create a password file |
In this scenario, you attempt to duplicate a database with the DUPLICATE command, but receive the following error stack:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 09/04/2001 12:11:29
RMAN-03015: error occurred in stored script Memory Script
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of log thread 1 seq 16 scn 145858 found to restore
The problem is that RMAN is not able to apply all the archived logs needed for complete recovery. For example, if you only backed up logs through sequence 15, but the most recent archived log is sequence 16, then DUPLICATE fails.
When creating the duplication script, use the SET UNTIL command to specify a log sequence number for incomplete recovery. For example, to terminate recovery after applying log sequence 15, enter:
RUN
{
  SET UNTIL SEQUENCE 16 THREAD 1;  # recovers up to but not including log 16
  DUPLICATE TARGET DATABASE TO 'dupdb';
}
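The sequence number to use in SET UNTIL can be read from the RMAN-06025 message in the error stack. The following is a minimal Python sketch of that step, not an Oracle utility. The function name is hypothetical, and the sketch assumes that the stack reports missing logs for a single thread in the message format shown in this scenario.

```python
import re

# Matches the RMAN-06025 line as it appears in the error stack for this scenario.
_MISSING_LOG = re.compile(r"RMAN-06025: no backup of log thread (\d+) seq (\d+)")


def set_until_clause(error_stack: str) -> str:
    """Build a SET UNTIL clause that stops recovery before the first missing log."""
    found = [(int(thread), int(seq)) for thread, seq in _MISSING_LOG.findall(error_stack)]
    if not found:
        raise ValueError("no RMAN-06025 message found in the error stack")
    threads = {thread for thread, _ in found}
    if len(threads) != 1:
        # Assumption: with several threads, the stopping point needs manual review.
        raise ValueError("multiple threads reported; choose the stopping point manually")
    thread = threads.pop()
    # SET UNTIL SEQUENCE n recovers up to but not including log n,
    # so the lowest missing sequence is the stopping point.
    seq = min(seq for _, seq in found)
    return f"SET UNTIL SEQUENCE {seq} THREAD {thread};"
```

Applied to the stack in this scenario, which reports thread 1 and sequence 16 as missing, the function returns the same clause used in the RUN block: SET UNTIL SEQUENCE 16 THREAD 1;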
"Creating Duplicate of the Database at a Past Point in Time: Example" for more information about performing incomplete recovery during the duplication operation |
In this scenario, you back up the database, then run the DUPLICATE command. You receive the following error stack:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 09/04/2001 13:55:11
RMAN-03015: error occurred in stored script Memory Script
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 8 found to restore
RMAN-06023: no backup or copy of datafile 7 found to restore
RMAN-06023: no backup or copy of datafile 6 found to restore
RMAN-06023: no backup or copy of datafile 5 found to restore
RMAN-06023: no backup or copy of datafile 4 found to restore
RMAN-06023: no backup or copy of datafile 3 found to restore
RMAN-06023: no backup or copy of datafile 2 found to restore
RMAN-06023: no backup or copy of datafile 1 found to restore
The DUPLICATE command recovers to archived redo logs, but cannot recover into online redo logs. Thus, if the restored backup cannot be made consistent without applying the online redo logs, then duplication fails with RMAN-06023 errors because RMAN is looking for backups created before the most recent archived log.
After backing up the source database, archive and back up the current redo log:
RMAN> SQL 'ALTER SYSTEM ARCHIVE LOG CURRENT';
RMAN> BACKUP ARCHIVELOG ALL;
This archives all records in the online redo logs so that RMAN can now recover the backup by applying the most recent archived redo log.
In this scenario, you list the database incarnations registered in the recovery catalog and
see a database with the name UNKNOWN:
LIST INCARNATION OF DATABASE;
RMAN-03022: compiling command: list
List of Database Incarnations
DB Key  Inc Key DB Name  DB ID       STATUS   Reset SCN  Reset Time
------- ------- -------- ----------- -------- ---------- --------------------
56      57      TRGT     4052472287  CURRENT  1          Sep 03 2001 06:45:51
1       19      UNKNOWN  4141147584  PARENT   1          Jan 08 2001 14:47:28
.
.
.
One way you get the DB_NAME of UNKNOWN is when you register a database that was once opened with the RESETLOGS option. The DB_NAME can be changed during a RESETLOGS operation, so RMAN does not know what the DB_NAME was for those old incarnations of the database because it was not registered in the recovery catalog at the time. Consequently, RMAN sets the DB_NAME column to UNKNOWN in the DBINC record.
The UNKNOWN name entry is expected behavior after a RESETLOGS
operation. You should not attempt to remove UNKNOWN entries from the recovery catalog.