
Disaster Recovery:
Backing Up and
Restoring
Every MIS or network administrator has a horror story
to tell about backing up and restoring systems or data.
One organization, where we now manage more than a dozen
backup servers, has data processing centers spread all over
the United States, and all are inter-connected via a large pri-
vate wide area network. In mid-1999, a valuable remote
Microsoft SQL Server machine just dropped dead. The IT
doctor said it had died of exhaustion . . . five years of faithful
service and never a day’s vacation. After trying everything
to revive it, we instructed the data center’s staff to ship the
server back to HQ for repairs.
The first thing we asked the IT people at the remote office
was: “You’ve been doing your backups every day, right?” “Sure
thing,” they replied. “Every day for the past five years.” They
sounded so proud, we were overjoyed. “Good, we will have to
rebuild your server from those tapes, so send them all to us
with the server.” To cut a frustrating story short: The five
years’ worth of tapes had nada on them, not a bit nor a byte.
Zilch. We spent two weeks trying to make sense of what was
on that SQL Server computer and to rebuild it. We refuse to
even guess the cost of that loss.
We have another horror story we will later relate, but this
example should make it clear to you that backup administra-
tion, a function of disaster recovery, is one of the most impor-
tant IT functions you will have the fortune to be charged with.
Backup administrators need to be trained, responsible, and
cool people. They need to be constantly revising and refining

on your company’s needs, this may vary from a week to a couple of weeks, or from
a month to a couple of months, and even years. There is no point buying media for
annual backups for a site you know is due to close in six months.
What To Back Up
Often, administrators back up every file on a machine or network and dump the
whole pile into a single backup strategy. Instead, they should be splitting up their
files into two distinct groups: System and Data.
✦ System files comprise files that do not change between versions of the applica-
tions and operating systems.
✦ Data files comprise all the files that change every day, such as word-processing
files, database files, spreadsheet files, media files, graphics files, and configuration
files (like the registry, DHCP, WINS, DNS, and the Active Directory databases).
Depending on your business, data files can change from 2 percent a day
on the low side to 80 percent a day on the high side. The average in many of the
businesses for which we have consulted is around 20 percent of the files chang-
ing every day. And, you must also consider the new files that arrive.
Understanding the requirements will make your life in the admin seat easier,
because this is one of the most critical of all IT or network admin jobs. One per-
son’s slip-up can cause millions of dollars in data loss. How often have you backed
up an entire system that was lost for some reason, only to find that to restore it,
you had to reinstall from scratch? “So why was I backing up the system,” you might
4667-8 ch17.f.qc 5/15/00 2:07 PM Page 644
645
Chapter 17 ✦ Disaster Recovery: Backing Up and Restoring
have asked yourself. And how often have you restored a file for a user who then
complained he or she lost five days’ worth of work on the file because the restore
was so outdated? It has happened to us on many occasions and is very disheartening
when you are trying so hard to keep your people productive.
There is nothing worse than trying to recover lost data, knowing that all on Mahogany
Row are sitting idle, with the IT director standing behind you in the server room, and

the outside casing, and in spreadsheets, hard catalogs, or data ledgers in some
form or another. Without history data, restore media will be unable to locate your
files and the backup will be useless. This is why it is possible to prepare a tape for
overwriting by merely formatting the label so that the magnetic head thinks the
media is blank.
There are various types of backups, depending on what you back up and how often
you back it up:
✦ Archived backup: A backup that documents (in header files, labels, and
backup records) the state of the archive bit at the time of copy. The state (on or
off) of the bit tells the backup software whether the file has been changed
since the last backup. When Windows 2000 Backup does an archived backup,
it sets the archive bit accordingly.
✦ Copy backup: An ad-hoc “raw” copy that ignores the archive bit state. It also
does not set the archive bit after the copy. A copy backup is useful for quick
copies between DR processes and rotations, or to pull an “annual” during the
monthly rotation (we discuss this later).
✦ Daily backup: This does not form part of any rotation scheme (in our book
anyway). It is just a backup of files that have been changed on the day of the
backup. We question the usefulness of the daily backup in Backup, because
mission-critical DR practice dictates the deployment of a manual or auto-
mated rotation scheme (described later). Also, Backup does not offer a sum-
mary or history of the files that have changed during the day. If you were
responsible for backing up a couple of million files a day . . . well, this just
would not fly.
✦ Normal backup: A complete backup of all files (that can be backed up), period.
The term “normal” is more a Windows 2000 term because this backup is more
commonly called a “full” backup in DR circles. The full backup copies all files
and then clears the archive bit to indicate (to Backup) that the files have been
backed up. You would do a full backup at the start of any backup scheme. You
would also have to do a full backup after making changes to any scheme. A full

tion of information between the last backup and the disaster — a period we call void
recovery time.
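The backup types described earlier in this section differ mainly in which files they select and in what they do to the archive bit afterward. The following is a minimal Python sketch of that logic, not how Backup itself is implemented. It assumes the usual Windows convention that a set bit means "changed since the last backup"; the names `FileEntry` and `run_backup` and the kind labels are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileEntry:
    name: str
    modified: date          # date of last modification
    archive: bool = True    # assumption: True means "changed since last backup"


def run_backup(files, kind, today):
    """Return the names copied by a backup of the given kind.

    kind is one of "normal", "copy", or "daily" (hypothetical labels).
    """
    if kind == "normal":
        selected = list(files)          # every file, regardless of the bit
        for f in selected:
            f.archive = False           # mark each file as backed up
    elif kind == "copy":
        selected = list(files)          # every file; the bit is left untouched
    elif kind == "daily":
        # only files modified on the day of the backup; the bit is left untouched
        selected = [f for f in files if f.modified == today]
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return [f.name for f in selected]
```

Running a "copy" or "daily" backup between two "normal" backups leaves the flags alone, which is why neither disturbs a rotation scheme built on the archive bit.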
Understanding How Backup Works
A collection of media, such as tapes or disks, is known as a backup set (this is differ-
ent from a media pool, which we will discuss in a bit). The backup set is the backup
media containing all the files that were backed up during the backup operation.
Backup uses the name and date of the backup set as the default set name. Backup
allows you to either append to a backup set in future operations or replace or over-
write the files in the media set. It allows you to name your backup set according to
your scheme or regimen.
Backup also compiles a summary, or history catalog, of the backed-up files, which
is called a backup set catalog. If your backup set contains several media, then the
catalog is stored on the last medium in the set, at the end of the file backup. The
backup catalog is loaded when you begin a restore operation. You will be able to
select the files and folders you need to restore from the backup catalog.
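To make the relationship between a backup set and its catalog concrete, here is a small Python sketch. It models only what the text states: the set spans several media, and the catalog sits on the last medium and is consulted at restore time. The class and function names (`Medium`, `build_backup_set`, `locate`) are hypothetical, and the on-media format used by Backup is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Medium:
    label: str
    files: list = field(default_factory=list)      # file paths stored on this medium
    catalog: dict = field(default_factory=dict)    # filled only on the last medium


def build_backup_set(media_contents):
    """media_contents: list of (label, [paths]) in the order they were written."""
    media = [Medium(label, list(paths)) for label, paths in media_contents]
    if not media:
        raise ValueError("a backup set needs at least one medium")
    # The catalog maps every backed-up path to the medium that holds it
    # and is stored on the last medium of the set.
    media[-1].catalog = {p: m.label for m in media for p in m.files}
    return media


def locate(backup_set, path):
    """Look a file up in the catalog, as a restore operation would."""
    return backup_set[-1].catalog.get(path)
```

The sketch also shows why the last medium matters: without it, nothing records which tape holds which file.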
Removable Storage and Media Pools
Removable Storage (RS) is a new service in Windows 2000 that takes away a lot of
the complexity of managing backup systems. This service also brings network sup-
port to Windows for a wider range of backup and storage devices.
Part V ✦ Availability Management
Microsoft took the responsibility of setting up backup devices and management of
media away from the old Backup application and created a central authority for
such tasks. This central authority is known as Removable Storage and is one of
the largest and most sophisticated additions to the operating system, worth the
price of the OS license alone, and a welcome member on any network. If you are
not ready to convert to a Windows 2000 network, you might consider raising a
Windows 2000 “Backup” server just to obtain the services of Removable Storage.
But Removable Storage is like an iceberg. In this chapter and in other parts of the

compmgmt.msc). The Removable Storage node is also present in the Remote
Storage snap-in discussed in Chapter 21. Before we begin with any hard-core backup
practice, let’s look at Removable Storage and how it relates to backup and disaster
recovery. Removable Storage is also briefly discussed in Chapter 16.
Figure 17-1: The Removable Storage Snap-in
The service provides the following functionality to backup applications, also known
as backup or data moving and fetching clients:
✦ Management of hardware, such as drive operations, drive health and status,
and drive head cleaning
✦ Mounting and dismounting of cartridges and disks (media)
✦ Media inventory
✦ Library inventory
✦ Access to media and their properties
Access to the actual hardware is hidden from client applications. But the central
component exposed to all clients is the media pool. To better understand the media
pool concept in Removable Storage, let’s first discuss media.
Backup media ranges from traditional tape cartridges (discussed at the end of this
chapter) to magnetic disk, optical disk, CD-ROM, DVD, and so on. More types of
media are becoming available, such as “sticks” and “cards” that you can pop into
cameras and pocket-sized PCs, but these are not traditional backup media formats,
nor can they hold the amount of data you would wish to store. DVD, a digital video
standard, however, is a good choice for backing up data because so much can be
stored on a single DVD disk.
Like the dynamic disk management technology discussed in Chapter 16, Removable
Storage hides the physical media from the clients. Instead, media is presented as a
logical unit, which is assigned a logical identifier or ID. When a client needs to store

database.
Physical Locations
Removable Storage also completely handles the burden of managing media location,
a chore once shared between the client applications and the administrator. But the
physical location service deals with more than knowing in which cupboard, shoebox,
vault, or offsite dungeon you prefer to store your media; it is also responsible
for the physical attributes of the hardware devices used for backing up and restoring
data. It is worthwhile to understand this section, because you will need such knowl-
edge to perform high-end backup services that protect a company’s data.
Removable Storage splits the location services into two tiers: libraries and offline
locations. If media is online, then it is inside a tape device of some kind that can
at any time be fired up to allow data to be accessed or backed up. If media is offline,
then you have taken it out of its drive or slot and sent it somewhere.
Note
As soon as you remove media from a device, Removable Storage makes a note in
its database that the media is offline.
Libraries can be single tape drives or highly sophisticated and very expensive
robotic storage silos comprising hundreds of drive bays. A CD-R/W tower, with 12
drives, is also an example of a library. Media in these devices or so-called libraries
are always considered online, and are marked as such in the database. Removable
Storage also understands the physical components that make up these devices.
Library components comprise the following:
✦ Drives: All backup devices are equipped with drives. The drive machinery
consists of the recording heads, drums, motors, and other electronics. To
qualify as a library, a device requires at least one drive.
✦ Slots: Slots are pigeonholes, pits, or holding pens in which media is
kept in an online state. When media is needed for a backup, a restore, or a

✦ Insert/Eject Ports: The IE ports are not supported on all devices. IE ports pro-
vide a high degree of controlled access to the unit in a multi-slot library sys-
tem. In other words, you insert media into the port, and the transport goes
and finds a free slot for it. Another way to comprehend the IE port function is
to compare it to a valet service. You hand your car keys to the valet, and he or
she goes and finds a free parking space for you.
If the hardware you attach supports any or all of these sophisticated features,
Removable Storage will be able to “discover it” and use it appropriately.
There are dozens, if not hundreds, of devices from which to choose for backing up
and storing data. Removable Storage, as we discussed, can handle not only tradi-
tional tape backup systems, but also CD silos, changers, and huge multi-disk read-
ers. If you wish to check if Removable Storage supports a particular device, follow
the steps to create a media pool discussed in the section “Performing a Backup”
later in this chapter.
Media Pools
A new term in the Windows operating system is the media pool. If you are planning
to do a lot of backing up or have been delegated the job of backup operator or
administrator, you will have a lot to do with media pools in your future backup-
restore career.
A media pool in the general sense of the term is a collection of media organized as
a logical unit. Conceptually speaking, the media pool contains media that belong to
any defined storage or backup device, format, or technology assigned to your hard-
ware, be it a server in the office or one located out on the WAN somewhere, 15,000
miles away. However, each media pool can only represent media of one type. You
cannot have a media pool that combines DVD, DAT, and ZIP technology. But you can
back up your data to multiple media pools of different types if the client application
or function so requires it.
It may be easier to think of the media pool in terms of the hardware devices that are

✦ Unrecognized pools: Media in these pools are not known to Removable
Storage. If the service cannot read information on a cartridge, or if the car-
tridge is blank, the media pool supporting it is placed into this grouping.
✦ Import pools: This group is for media pools that were used in other Removable
Storage systems, on other servers, or by applications that are compatible with
Removable Storage or that can be read by Removable Storage. Media written
in the Microsoft Tape Format (MTF) can thus be imported into the local
Removable Storage system.
Application pools
When an application is given access to a free media pool, either it will create a spe-
cial pool into which the media can be placed or you can create pools manually for
the application using the Removable Storage snap-in, illustrated in Figure 17-1.
A very useful and highly sought after feature of Windows 2000 media pools is that
permissions can be assigned to pools to allow other applications to use the pools
or to protect the pools in their own sets.
Multi-level media pools
It might astonish you to find out that media pools can be organized into hierarchies
or nests. In other words, you can create media pools that hold several other media
pools. An application can then use the root media pool and gain access to the dif-
ferent data storage formats in the nested media pools. Expect to see sophisticated
document storage, backup, and management applications using such media pools.
An example of using such a hierarchy of media pools can be drawn from a near dis-
aster that was averted during the writing of this chapter. One of our 15-tape DLT
changers went nuts and began reporting that our tapes were not really DLT tapes
but alien devices it was unable to identify. The only way to continue backing up our
server farm was to enlist every SCSI tape and disk device on the network into one
large pool. Once the DLT library recovered, we could go back to business as usual.
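The nesting idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the Removable Storage API: it assumes a pool holds either media of a single type or other pools, never both, and the names `MediaPool`, `add_media`, `add_pool`, and `all_media` are hypothetical.

```python
class MediaPool:
    """A pool holds either media of one type or nested pools (assumption)."""

    def __init__(self, name, media_type=None):
        self.name = name
        self.media_type = media_type   # e.g. "DLT"; None for a container pool
        self.media = []                # labels of media in this pool
        self.children = []             # nested pools

    def add_media(self, label, media_type):
        if self.media_type is None:
            raise ValueError(f"{self.name} is a container pool and holds no media")
        if media_type != self.media_type:
            raise ValueError(f"{self.name} only accepts {self.media_type} media")
        self.media.append(label)

    def add_pool(self, pool):
        if self.media_type is not None:
            raise ValueError(f"{self.name} holds media and cannot nest pools")
        self.children.append(pool)

    def all_media(self):
        """Yield (media_type, label) for this pool and every nested pool."""
        for label in self.media:
            yield self.media_type, label
        for child in self.children:
            yield from child.all_media()
```

An application handed the root pool can walk `all_media()` and reach every format underneath it, which is the effect described in the changer story above.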

Operator requests
No matter how sophisticated Removable Storage is, there are some things it just
will not do. These items will be marked for the “human” work queue. For example,
Removable Storage will not go and fetch cartridges from the cabinet or the store-
room. This is something you have to do. The details pane in the Operator Requests
node is where Removable Storage posts its request states for you, the operator.
Removable Storage can also send you a message via the messenger service or the
system tray, just in case you have the habit of pretending the Operator Requests
node does not exist. Table 17-2 lists the possible Operator Request States.
Table 17-2
Operator Request States

State      Explanation
Submitted  The described request has been submitted, and the system is waiting for the operator’s input.
Refused    The operator has refused to perform the described request.
Completed  The operator has complied and has completed the described request.
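For readers who script around the operator queue, the three states in Table 17-2 map naturally onto an enumeration. The Python sketch below is illustrative only; `RequestState` and `pending` are hypothetical names, and nothing here calls Removable Storage itself.

```python
from enum import Enum


class RequestState(Enum):
    SUBMITTED = "Submitted"   # waiting for the operator's input
    REFUSED = "Refused"       # operator declined the request
    COMPLETED = "Completed"   # operator carried out the request


def pending(requests):
    """Return the descriptions of requests still waiting for the operator.

    requests is a list of (description, RequestState) pairs.
    """
    return [desc for desc, state in requests if state is RequestState.SUBMITTED]
```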
Labeling Media
Removable Storage can read data written to the labels on the actual tape or mag-
netic disk as well as external information supplied in bar code format. The identifi-
cation service is robust and highly sophisticated and will ensure that your media
does not get overwritten or modified by other applications.
You need to provide names for your media pools, and you should also, if you can
afford a bar code reader, organize them according to serial numbers (represented
as bar codes) for more accurate handling. If you are planning to install a library
system, make sure you get one that can read the bar codes from the physical labels
on the cartridge casing. This information will be critical when it comes to locating
a few files that need restoring from five million files stored on 120 30GB tapes
(the bigger the enterprise, the more complex the backup and restore regimen
and management).
Another reason we prefer a numbering or bar code scheme for identifying media, as

files of your data, and that it is safe to destroy the data on the scratch media.
It is important to fully understand the concept of save and scratch sets because it
is the only way you will be able to ensure your media can be safely recycled. The
alternative is to make every set a save set, which means you never recycle the
tapes . . . making your DR project a very costly and risky venture because tapes
that are being constantly used will stretch and wear out sooner.
Establishing Quality of Support Baselines for
Data Backup/Restore
Windows 2000 provides the administrator with backup and recovery tools seen
before only on midrange and mainframe technology (such as the ability to mark
files for archiving). For the first time, Windows network administrators are in a
much better position to commit to service level agreements and quality of service
or support levels than before. Unfortunately, the new tools and technologies result
in a higher and more critical administrative burden (the service level shifts to the
Windows administrator instead of remaining the domain of the midrange,
UNIX, or mainframe administrative team). Let’s consider some of the abstract
issues related to backups before we get into procedures.
No matter how regularly you back up the data on your network, you can only restore
up to the point of your last complete backup. Unless you are backing up every second
of the day, which is highly unlikely and impractical, you can never fully recover the
latest data up to the point of meltdown (unless you had a crash immediately after
you backed up). You need to decide whether your business can afford
to lose even one hour of data. For many companies, any loss could mean a serious setback
and a costly recovery, often lasting long after the disaster occurs.
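The point about restoring only up to the last backup can be put as simple arithmetic: the data you lose is whatever changed between the last backup and the failure. The Python sketch below computes that window and checks it against a tolerance. The function names and the `max_loss` parameter are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(backup_times, failure_time):
    """Return the time between the last backup before the failure and the failure."""
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        return None   # no usable backup at all
    return failure_time - max(earlier)


def meets_baseline(backup_times, failure_time, max_loss):
    """True if the data lost stays within the tolerated window (max_loss)."""
    window = data_loss_window(backup_times, failure_time)
    return window is not None and window <= max_loss
```

With nightly backups at 23:00 and a failure at 14:30 the next afternoon, the window is 15 hours 30 minutes, far outside a one-hour tolerance.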
It is important, therefore, that you consider the numerous alternatives for backup
procedures and various strategies if out-of-date data is considered inadequate
recovery. You need to decide on a baseline for backup/restores: What is the least

Note
common hard-disk array or a central storage facility. Loss of data is thus system-
wide and mirrored across the entire array. A mirror is a reflection: no more, no less.
This brings us to another factor to consider: the flawed backup. You bring this fac-
tor into consideration if your data is continuously changing. The question to ask
is, “How soon after the update of data should I make a backup?” You may decide,
based on the previous list, that data even five minutes out of date is damaging to
system integrity or the business objectives. A good example is online real-time
order or delivery tracking. But backing up data with such narrow intervals between
versions brings us to the subject of quality and integrity of backed-up data. (Later
in this chapter, we will discuss versioning and how new technology in Windows
2000 facilitates it.) What if the file that just got hit by a killer virus is quarantined
and you go to the backup only to find it is also infected or corrupt? What if all the
previous files are infected, and now just opening the file renders it useless? It’s
something to think about.
Earlier this year, we rushed to the aid of our main SQL Server group, which had lost
a valuable database on the customer ordering system (on our extranet). Every hour
offline was costing the company six figures as customers went elsewhere to place
their orders. Four-letter words were flying around the server room. We had to go
back three days to find a clean backup of the database that showed no evidence of
corrupt metadata.
Figure 17-2 illustrates data backed up on a daily basis, and in this case, bad data is
backed up for three days in a row. You may consider some of the gray area as safe,
where backup data is bound to have all the flaws of its source (corruption, viruses,
lack of integrity, and so forth), if you have other means of assuring quality or data
integrity. Such assurances may be provided by means of highly sophisticated anti-
virus software, quality of data routines and algorithms, versioning, and just making

Figure 17-3 shows this in a visual hierarchy.
[Figure: panel A, “Backing up once a day”; panel B, “Backing up at 10-minute intervals daily”; legend: data integrity]
Figure 17-3: The data restoration pyramid
The pyramid in Figure 17-3 illustrates that the faster the response to a request to restore or
recall data, the higher the chance of retrieving poor data. Each layer of
the pyramid covers the critical level of the restore request. This does not mean that
critical restores are always going to be a risk and that the restored data is flawed.
It means that the data backed up closest to the point of failure is more likely to be
at risk compared to data that was backed up hours or even days before the failure.
If a hard disk crashes, the data on the backup tapes is probably sound, but if the
crash is due to corrupt data or virus infection, the likelihood of recent data being
infected is high.
Another factor to consider is that often you’ll find that the “cleanest” backup data
is the furthest away from the point of restoration, or the most out-of-date.
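The search for the "cleanest" backup described above amounts to walking backward in time until a copy passes an integrity check. The Python sketch below shows one way to express that; `latest_clean_backup` is a hypothetical helper, and `is_clean` stands in for whatever check you trust (a virus scan, a database consistency check, and so on).

```python
def latest_clean_backup(backups, is_clean):
    """backups: list of (timestamp, backup_id); is_clean: callable on backup_id.

    Walk from the most recent backup toward the oldest and return the first
    one that passes the integrity check, or None if none does.
    """
    for timestamp, backup_id in sorted(backups, reverse=True):
        if is_clean(backup_id):
            return timestamp, backup_id
    return None
```

The further back the function has to go, the more work is lost, which is the trade-off the pyramid illustrates.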
If the level of restore you need is not as critical or the quality of the backup not
too important, you could consider a tape drive system attached either to a backup server
or local to the hosting machine. You could then set up a scheme of continuous or
hourly backup routines. In the event data is lost (usually because someone deletes
a file or folder), you would be able to restore the file. The worst-case scenario is

