| Oracle®
Database Backup and Recovery Basics 10g Release 1 (10.1) Part Number B10735-01 |
|
|
View PDF |
This chapter offers guidelines and considerations for developing an effective backup and recovery strategy.
This section includes the following topics:
To decide on backup strategies, start with your data recovery requirements and y our data recovery strategy. Each type of data recovery will require that you take certain types of backup.
Failures can run the gamut from user error, datafile block corruption and media failure to situations like the comple te loss of a data center. How quickly you can resume normal operation of your database is a function of what kinds of restore and rec overy techniques you include in your planning. Each restore and recovery technique will impose requirements on your backup strategy, including which features of the Oracle database you use to take, store and manage your backups.
When thinking about recovery strategies, ask yourself questions like these:
BLOCKRECOVER co
mmand can help you in this situation. Also, troubleshoot recovery with the SQL*Plus RECOVER ... TEST<
/code> command.With these needs in mind, decide how you can take advantage of features related to backup and recovery, and look a t how each feature meets some requirement of your backup strategy. For example:
Once you decide which features to use in your recovery strategy, you can plan your backup strategy, answering the following questions, among others:
OPEN RESETLOGS or after changes to your database such as NOLOGGING operations that do not appear in the redo log. You may also have business requi
rements that require backups for auditing purposes or other reasons not related to database recovery.
These are of course only a few of the considerations you should take into account. Available resources (hardware, media, staff, budget, and so on) will also be factors in your decision.
Your data recovery strategy should include responses to any number of database failure scenarios. The key to an effective, efficient strategy is envisioning failure modes, matching Orac le database recovery techniques and tools to the failure modes in which they are useful, and then making sure you incorporate the nec essary backup types to support those recovery techniques.
To help match failure modes to re covery techniques that can help resolve them, refer to the following sections:
Your backup and recovery strategy should enable you to handle situations in which a user or application makes unwanted changes to database data, such as deleting the contents of a table or making incorrect updates during a batch run. The goal in such a case will be to restore the affected parts of your database to their state before the user error.
Depending on the situation, your appropriate response will be one of the follow ing:
|
Note: Oracle's Flashback Technology provides faster and less disruptive alternatives to media recovery in many circumstanc es.
Information about these features is collect ed in Oracle Database Backup and Recovery Advanced User's Guide. This document will allude to such features where they can be helpful and provide pointe rs for more information. Familiarize yourself with these features before planning your backup and recovery strategy, because you may find that they can be quite valuable and require limited advanced planning. |
A media failure occurs when a problem external to the database prevents Oracle from reading from or writing to a file during database operations. Typical media fai lures include physical failures, such as head crashes, and the overwriting, deletion or corruption of a database file. Media failures are less common than user or application errors, but your backup and recovery strategy should prepare for them.
The type of media failure determines the recovery technique to use. For example, the strategy you use to recove r from a corrupted datafile is different from the strategy for recovering from the loss of the control file.
a>The method of recovery from loss of all members of an online log gr oup depends on a number of factors, such as:
For example:
OPEN RESETLOGS. You will lose all transactio
ns that were in the lost log. You should take a new full database backup immediately after the OPEN RESETLOGS. Backups f
rom before the OPEN RESETLOGS will not be recoverable because of the lost log.OPEN RESETLOGS with no transaction loss. However, you should take a new full database backup. Backups from before the OPE
N RESETLOGS will not be recoverable because of the lost log.ALTER DATABASE CLEAR LOGFILE stateme
nt to re-create all members in the group. No transactions are lost. If the lost redo log group was archived before it was lost, then
nothing further is required. Otherwise, you should immediately take a new full backup of your database. Backups from before the log w
as lost will not be recoverable because of the lost log.If a small number of blocks within one or more datafiles are corrupt, you ca
n perform block media recovery instead of restoring the datafiles from backup and performing co
mplete media recovery of those files. The Recovery Manager BLOCKRECOVER command can be used to restore and recover speci
fied data blocks while the database is open and the corrupted datafile is online.
Oracle Database Backup and Recovery Advanced User's Guide to le arn how to perform block media recovery with RMAN.
Your plans for data recovery strategies are the basis of your plans for backup strategy . This discussion describes general guidelines that can help you decide when to perform database backups, which parts of a database y ou should back up, what tools Oracle provides for those backups, and how to configure your database to improve its robustness and mak e backup and recovery easier. Of course, the specifics of your strategy must balance the needs of your restore strategy with question s of cost, resources, personnel and other factors.
The set of files needed to recover an Oracle database from the failure of any of its files--a datafile, control file, or online redo log--is called the redundancy set. The redundancy set sh ould contain:
tnsnames.ora, and listener.oraThe first rule of protecting your redundancy set is:
The set of disks or other media that contain the redundancy set for your database should be separate from the disks that contain the datafiles, online redo logs, and control files.
This practice ensures that the failure of a disk that contains a datafile does not also cause the loss of the back ups or redo logs needed to recover the datafile. Consequently, a minimal production-level database requires at least two disk drives: one to hold the files in the redundancy set and one to hold the database files. Ideally, separate the redundancy set from the primar y files in every way possible: on separate volumes, separate file systems, and separate RAID devices. Keeping the redundancy set sepa rate from the primary files ensures that you will not lose committed transactions in a disk failure.
The simplest way to manage your redundancy set is to use a flash recovery area, on a separate device from the working set f iles. All recovery-related files will be stored in a single location on disk, disk space usage is managed automatically, backups requ ired to meet your data recovery requirements are never deleted from disk while they are still needed, and recovery time is minimized without compromising the completeness of the redundancy set.
Whether or not you use a flash recovery area, Oracle Corporation recommends following these guidelines:
Ideally, the multiplexed files should be on different disks mounted under different disk controllers. The flash recovery area is an excellent location for one copy of these file s.
You can also mirror the online redo logs and current control file at the operating syst em or hardware level, but this is not a substitute for multiplexing at the database level.
The redo logs of your database provide a complete record of changes to the datafiles of your database (wit h a few exceptions, such as direct path loads).
You can run your database in one of two mod es: ARCHIVELOG mode or NOARCHIVELOG mode. In ARCHIVELOG mode, a used online redo log group must be copied to one or more archive dest inations before it can be reused. Archiving the redo log preserves all transactions stored in that log, so that they can be used in r ecovery operations later. In NOARCHIVELOG mode, the online redo log groups are simply overwritten when the log is re-used. All inform ation about transactions recorded in that redo log group is lost.
Running your database in NOARCHIVELOG mode imposes severe limitations on your backup and recovery s trategy.
If you are running in NOARCHIVELOG mode and you must reco ver from damage to datafiles due to disk failure, you have two main options for recovery:
For most applications, running in ARCHIVELOG mode is preferable to runn ing in NOARCHIVELOG mode because you have more flexible recovery options after a data loss. There are, however, associated costs of r unning in ARCHIVELOG mode:
When performance requirements are extreme or disk
space limitations are severe, it may be preferable to run in NOARCHIVELOG mode in spite of the restrictions imposed.
It is recommended that you take ad vantage of the flash recovery area to store as many backup and recovery-related fileas as possible, including disk backups and archiv ed redo logs.
Some features of Oracle database backup and recovery, such as Oracle Flashbac k Database, require the use of a flash recovery area. In such cases, you must create a flash recovery area, though you do not have to use it to store all recovery-related files.
Even when its use is not required, however, th e flash recovery area offers a number of advantages over other on-disk backup storage methods. Backups moved to tape from the flash r ecovery area are retained on disk until space is needed for other required files, reducing the need to restore backups from tape. At the same time, obsolete files no longer needed to meet your recoverability goals and files backed up to tape become eligible for dele tion and are deleted when space is needed, eliminating the need for DBA intervention to clear out old files.
| See Also:
"Setting Up a Flash Recovery Area for RMAN" for more about the uses and benefits of the f lash recovery area. |
Backup retention policy can be base d on redundancy or a recovery window. In a redundancy-bas ed retention policy, you specify a number n such that you always keep at least n dist inct backups of each file in your database. In a recovery window-based retention policy, you specify a time interval in the past (for example, one week, or one month) and keep all backups required to let you perform point-in-time recovery to any point during that wi ndow.
A backup no longer needed to satisfy the backup retention policy is said to be
RMAN automates the implementation of a backup retention policy, using the following commands:
CONFIGURE RETENTION POLICY command lets you set the retention policy that wi
ll apply to all of your database files by default.REPORT O
BSOLETE command lets you list backups currently on disk that are obsolete under the retention policy. You can also specify par
ameters to see which files would be obsolete under different retention policies.DELETE OBSOLETE command deletes the files which REPORT OBSOLETE would list as obsolete.CHANGE... KEEP lets you set a separate retention policy
for specific backups, such as long-term backups kept for archival purposes. You can specify th
at a given backup must be kept until a future time, or even specify that a backup be kept forever. CHANGE... NOKEE
P is used to let the retention policy apply to a backup previously protected by CHANGE... KEEP.If you use a flash recovery area to store your backups, the database will delete obsolete
backups automatically as disk space is needed for newer backups, archived logs and other files. For backups stored on disk outside a
flash recovery area and for backups stored on tape, you should periodically run the DELETE OBSOLETE command to remove o
bsolete backups.
A recovery win dow-based retention policy lets you guarantee that you can perform point-in-time recovery to any point in the past, up to a number of days that you specify. The earliest point in time to which you can recover your database under your retention policy is known as the point of recoverability. All backups required for recovery or point-in-time recovery back to t hat time will be retained.
Note that this will generally require that you keep backups olde r than the beginning of the recovery window. A point-in-time recovery to the beginning of the recovery window would require a restore from this backup, and then applying all changes between the backup time and the point of recoverability. For example, you might conf igure a recovery window of three days:
RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WIND OW OF 3 DAYS;
If your last full database backup was six days ago, RMAN will keep the six-day-old backup, and all redo logs required to roll the database forward to the beginning of the recovery wind ow three days ago, in addition to any backups and redo logs needed to recover the database to all points in time within the three day window.
A recovery window-based backup retention policy provides the most certain recovera bility for your data. The disadvantage is that more careful disk space planning is required, since it may not be obvious how many bac kups of datafiles and archived logs must be retained to guarantee the recovery window.
A redundancy-based backup retention policy determines whether a backup is obsolete based on how many backups of a file are currently on disk. You might configure a redundancy level of 3:
RMAN> CONFIGURE RETENTION POLICY TO REDUNDANCY 3;
In this case, RMAN keeps three backups of each database file, and all redo logs required to recover all retained datafile backups t o the current time. Any older backups will be considered obsolete.
Assume, for instance, th at you make backups of a datafile every day, starting on a Monday. On Thursday, you make your fourth backup of the datafile, and the backup from Monday becomes obsolete because you have the backups from Tuesday, Wednesday and Thursday. On Friday, the backup from Tue sday becomes obsolete, because you have the backups from Wednesday, Thursday and Friday.
There are several reasons to keep older backups of datafiles and archived logs:
To perform point-in-time recovery to a given target time earlier than your current poi nt of recoverability, then you need a database backup that completed before the target time, as well as all of the archived logs crea ted between the time the backup was started and the target time. For example, if you take full database backups starting at 1:00 AM o n February 1 (at SCN 10000) and on February 14 (at SCN 20000), and if you decide on February 28 to use point-in-time recovery to brin g your database to its state at 9:00AM February 7 (SCN 13500), then you must use the February 1 backup, plus all redo logs containing changes from between the beginning of the creation of the backup (SCN 10000) and 9:00AM February 7 (SCN 13500).
Note that point-in-time recovery to a time between backups is not an option for a database operating in N
OARCHIVELOG mode. You can only restore your entire database from a consistent whole database backup, and re-open the database
as of the time of that backup. You will lose all changes since the backup was taken.
Frequent backups are essential for any recovery scheme. Base the frequency of backups on the rat e or frequency of database changes such as:
The more fre quently your database is updated, the more often you should perform database backups. The scenario in "Bac kup Scripts When Blocks Change Frequently" backs up the database every week.
If database updates are relatively infrequent, then you can make whole database backups infrequently and supplem ent them with incremental backups (which will be relatively small because few blocks have changed). The scenario in "Backup Scripts When Few Data Blocks Change" describes how to develop a backup strategy based on a single whole database backup.
There are times when you will need to take a backup of your database independent of your re gular backup schedule. If you make any of the following structural changes, then perform a backup of the appropriate portion of your database immediately before and after completing the following changes:
The part of the database that you should back up depends on your archiving mode.
If you run in ARCHIVELOG mode, then you can back up an individual tab
lespace or even a single datafile. You might want to do this for one or more tablespaces that are updated much more often than the re
st of your database, as is sometimes the case for the SYSTEM tablespace and automatic undo tablespaces.
More frequent backups of heavily-used datafiles can shorten recovery times in some situations. You may have a database where most updates are restricted to a small set of tablespaces. If you take a full database backup each Sunday, then rec overy from a media failure affecting the frequently updated tablespaces on Friday requires re-applying large amounts of redo. Daily b ackups of the frequently-updated tablespaces reduces the amount of redo to apply without requiring a daily full database backup.
| See Also:
Oracle Datab ase Administrator's Guide for information about managing undo tablespaces |
When a direct path load is performed to populate a database, no r
edo data is logged for those database changes. You cannot recover these changes after a restore from backup using conventional media
recovery. Likewise, when tables and indexes are created as NOLOGGING, the database does not log redo data for these obje
cts, which means that you cannot recover these objects from existing backups. Therefore, you should back up your datafiles after oper
ations for which no redo data is logged.
|
N ote: You can use either a full backup of your datafiles or an incremental backu p. Either one will capture all changed blocks, including blocks changed by unrecoverable operations. |
| See Also:
Or
acle Database SQL Reference for information about the |
| See also:
Oracle Database Utiliti es for more details about exporting and importing data for logical backup |
Online redo logs, unlike ar chived logs, should never be backed up. The chief danger associated by having backups of online redo logs is that you may accidentall y restore those backups without meaning to, and corrupt your database.
Online redo log back ups are also not particularly useful, for the following reasons:
ARCHIVELOG mode, then the archiver is already archiving the filled redo logs automatically.
NOARCHIVELOG mode, then the only type of phy
sical backups that you can perform are closed, consistent, whole database backups. The files in this type of backup are all consisten
t and do not need recovery, so the online logs are not useful after a restore from backup.The best method for protecting the online logs against media failure is to multiplex them, with multiple log members in each gr oup, on different disks attached to different disk controllers.
|
Note: RMAN does not permit you to back up online redo log s. You must archive a redo log before backing it up. |
During the stress of a recovery situation, it is import ant that you have all necessary information at your disposal. This is especially true if for some reason you need to contact Oracle S upport because you run into a problem that you do not understand. You should have the following documentation about the hardware conf iguration:
You should also keep the following documentation about the software configuration:
You should keep this information both in electronic and hardcopy form. For example, if you sa ve this information in a text file on the network or in an email message, then if the entire system goes down, you may not have acces s to this data.
It is especially important to keep a record of the DBID. If you have to res tore and recover your database including the loss of the SPFILE and control file, you will need the DBID during the recovery process. See "Basic Database Restore and Recovery Scenarios" for details on h ow the DBID is used during recovery.
Practice backup and recovery techniques in a test environment before and after you move to a production system. In this way, you can measure the thoroughness of your strategies and minimize problems before they occur in a real situation. Perfor ming test recoveries regularly ensures that your archiving, backup, and recovery procedures work. It also helps you stay familiar wit h recovery procedures, so that you are less likely to make a mistake in a crisis.
If you us
e RMAN, then run the DUPLICATE command to create a test database using backups of your production database. If you perfo
rm user-managed backup and recovery, then you can either create a new database, a standby database, or a copy of an existing database
by using a combination of operating system and SQL*Plus commands.
| See Also:
Oracle Database Backup and Recovery Advanced User's Guide to learn about RMAN testing methods, troubleshooting SQL*Plus recovery, block media recovery, and RMAN disaster recovery |
The RMAN BACKUP V
ALIDATE and RESTORE VALIDATE commands can be a useful part of your recovery plan testing. BACKUP VALIDATE
code> reads all of the specified files but does not produce any output files. All of the data blocks in the input files are validated
, exactly as they are when a real backup takes place. RESTORE VALIDATE reads all of the backup files that would be neede
d to restore the specified objects, but the objects are not actually restored to disk. All of the data blocks in the backup files are
validated, exactly as they are when a real restore takes place. Just as in a real restore, RESTORE VALIDATE automatical
ly chooses which backup files to restore from. For example, the command RESTORE VALIDATE DATABASE ensures that, for ever
y file in the database, a valid backup exists, can be read, and contains valid data.
| See Also:
"Validating RMAN Backups" for more details on using |