      Last Updated: 2010-09-02 15:00
Author: M. Bora Teoman

02/02/2010

Dial-tone Recovery of Exchange 2007 Server

Filed under: backup, ms tip. Posted by M. Bora Teoman @ 14:46

DIAL-TONE RECOVERY OF EXCHANGE 2007 SERVER IN VMWARE ESX 3.5 ENVIRONMENT USING DATA PROTECTOR 6.1

In this article, I will describe a dial-tone recovery of an Exchange 2007 server in a VMware ESX 3.5 environment using HP Data Protector 6.1. I will briefly explain the environment and the problem, then walk through the recovery process.


PRODUCTS USED IN THE ENVIRONMENT

1-      VMware ESX 3.5 Update 3 (host machine)

2-      Windows 2008 Enterprise 64bit operating system (guest machine)

3-      Exchange Server 2007 Enterprise SP2 (installed on guest machine)

4-      Data Protector 6.1 (backup solution)

5-      Fibre Channel storage area network (SAN)

INFRASTRUCTURE

The ESX host machines are connected to the Fibre Channel SAN storage. The connection is redundant: each host has two independent HBAs, each with its own fiber cable. There is more than one ESX host in the environment, and they use VMware’s “High Availability” technology. With this technology, guest machines can be moved between ESX hosts in case of a problem, so the guest machines are expected to keep running without a service outage.

One of the guest machines is an Exchange 2007 server installed on Windows Server 2008. LUNs on the external SAN storage device are attached to the guest operating system as “Raw Device Mapping” (RDM) disks. With this method, the guest operating system can write data directly to the external storage. Data Protector 6.1 is used to back up the Exchange 2007 mailbox databases.

DESCRIPTION OF PROBLEM:

One of the ESX servers is put into Maintenance Mode in order to upgrade it from ESX 3.5 Update 3 to Update 5. While it is entering Maintenance Mode, all of its guest machines (including the Exchange 2007 server) are transferred to the other ESX hosts automatically. During the transfer, the other ESX hosts cannot handle the load (CPU and RAM) and restart themselves. After the restart, two of the ESX hosts do not come up properly and are restarted again with the power button. After the successful restarts, all the guest machines are transferred back to their original hosts using the vCenter interface.

All the guest operating systems are working well except the Exchange 2007 server. One of its RDM disks is shown as “Unallocated Space” under Computer Management in Windows (there are three RDM disks for mailbox databases and one for transaction logs). The disk holds one of the mailbox databases, which can no longer be reached, so users with a mailbox in that database cannot use their mailboxes. Another side effect is that the queue on the Hub Transport server keeps growing as mail arrives for these mailboxes.

WHAT IS DONE FOR PROBLEM SOLVING

We don’t know the exact problem, so we open the old documentation to check whether anything has changed. There is not much difference except in the paths of the RDM disks. Because each ESX host has two independent connections to the storage area, it has two independent paths to the same LUN. In the documentation the path value is vmhba0:2:14:0, but after the problem occurred it is vmhba0:1:14:0 (Figure 1).

Figure 1: Changed RDM path
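
To make the comparison of the two path values concrete, here is a minimal Python sketch that splits them into their parts. It assumes the ESX 3.x naming scheme vmhba<adapter>:<target>:<LUN>:<partition>; the class and function names are made up for illustration and are not part of any VMware tool.

```python
from typing import List, NamedTuple


class HbaPath(NamedTuple):
    """One storage path, assumed to be vmhba<adapter>:<target>:<LUN>:<partition>."""
    adapter: int
    target: int
    lun: int
    partition: int


def parse_path(name: str) -> HbaPath:
    """Split a name such as 'vmhba0:2:14:0' into its four numeric fields."""
    prefix = "vmhba"
    if not name.startswith(prefix):
        raise ValueError(f"not a vmhba path: {name!r}")
    fields = name[len(prefix):].split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields in {name!r}")
    return HbaPath(*(int(f) for f in fields))


def changed_fields(old: HbaPath, new: HbaPath) -> List[str]:
    """Return the names of the fields that differ between two paths."""
    return [f for f in HbaPath._fields if getattr(old, f) != getattr(new, f)]


documented = parse_path("vmhba0:2:14:0")  # value from the old documentation
current = parse_path("vmhba0:1:14:0")     # value seen after the incident

print("changed fields:", changed_fields(documented, current))
print("same LUN:", documented.lun == current.lun)
```

Under that assumption, only the target field differs between the two values and the LUN number is the same in both.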

In reality this cannot be the problem, because both paths end at the same LUN. In addition, the paths of all the RDM disks have changed, but only one of the disks is not working. Just in case, we change the path back to its original value, but it does not help (Figure 2).

Figure 2: Preferred path for LUNs

After that, we check the .vmx file of the guest machine on ESX. This file holds the configuration of the virtual machine. We also open another virtual machine’s .vmx file as a reference. Comparing the two files, we see a few differences. The most important one is on a disk configuration line: it was MB01_2.vmdk, but now it is MB01_7.vmdk (Figure 3). We change it back to its original value and restart the virtual machine. It doesn’t work either. The disk is still shown as “Unallocated Space” in Windows Computer Management.

Figure 3: .vmx file of the virtual machine (unrelated fields were erased)
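
A comparison like this can also be scripted. The following Python sketch parses two .vmx-style texts (lines of the form key = "value") and prints the settings that differ. The excerpts and the key name scsi1:0.fileName are hypothetical; only the two .vmdk names come from this article.

```python
def parse_vmx(text: str) -> dict:
    """Parse 'key = "value"' lines into a dict, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def diff_settings(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for keys that differ or are missing."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical excerpts standing in for the documented and the current file.
expected_text = '''
scsi1:0.present = "TRUE"
scsi1:0.fileName = "MB01_2.vmdk"
'''
found_text = '''
scsi1:0.present = "TRUE"
scsi1:0.fileName = "MB01_7.vmdk"
'''

differences = diff_settings(parse_vmx(expected_text), parse_vmx(found_text))
for key, (old, new) in differences.items():
    print(f"{key}: {old} -> {new}")
```

In practice the two texts would be read from the .vmx files themselves, and a line-based diff tool would do the same job.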

We try to attach the disk to another virtual machine as an RDM. When we boot that machine, the disk is still shown as “Unallocated Space”.

We have no solutions left on the ESX side, so we try something on the Windows side. We open Disk Management, right-click the Unallocated Space and select “New Simple Volume…”. We finish the wizard without formatting and give the disk a drive letter (Figure 4). No luck.

We suspect that the disk’s partition table may be corrupted, so we use an open source offline tool to recover the partition table, but still no luck.

Figure 4: RAW disk space

We have spent 1 to 1.5 hours on all this work and users still cannot reach their mailboxes, so we decide to restore the mailboxes from backup. We open the document “How to use Data Protector 6.0 or 6.10 with Exchange Recovery Storage Groups to restore a single mailbox” from www.hp.com and, as described there, create a “Recovery Storage Group” (RSG) on our Exchange 2007 server. To do this, we open the Exchange Management Console (EMC) on the server and double-click Toolbox on the left side. We click the “Database Recovery Management” link and, when the wizard starts, enter a descriptive name for the job (Figure 5) and click Next. On the second screen we click the “Create a recovery storage group” link (Figure 6). The storage groups defined on this server are listed, and we select the affected one and click Next (Figure 7). On this screen Exchange Server links the storage group to the Recovery Storage Group. This lets the backup software restore the data to the database file of the Recovery Storage Group instead of the production database file. After checking the paths on the last screen, we click the “Create the recovery storage group” link and finish the wizard (Figure 8).

Figure 5: Screen for label / server name entry

Figure 6: Screen for job selection

Figure 7: Screen for storage group selection

Figure 8: Screen for Recovery Storage Group creation

Now we have an RSG, so we can restore the backup. We open the “Restore” menu in the Data Protector 6.1 management interface and expand “MS Exchange Server”. We find the server (mb01.mstip.com) and expand it. Then we double-click “MS Exchange Server [Microsoft Exchange Server (Microsoft Information Store)]”. On the right side, the databases and transaction logs that were backed up are listed. At this point I should describe the backup schedule. We take a FULL backup of the mailbox databases on Sunday at 11:00 PM and INCREMENTAL backups at 11:00 PM on the other days of the week. The disaster strikes on Wednesday morning, so we need Sunday’s FULL backup and the INCREMENTAL backups of Monday and Tuesday.
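
The rule behind this selection (the latest FULL backup before the failure, plus every INCREMENTAL taken after it) can be sketched in a few lines of Python. The function name, the extra Saturday entry and the 9:00 AM failure time are assumptions for illustration; the schedule itself follows the description above.

```python
from datetime import datetime


def restore_chain(backups, failure_time):
    """Return the latest full backup before failure_time plus every later incremental.

    backups is a list of (datetime, kind) tuples, with kind "full" or "incr".
    """
    usable = sorted(b for b in backups if b[0] < failure_time)
    fulls = [b for b in usable if b[1] == "full"]
    if not fulls:
        raise ValueError("no full backup before the failure")
    last_full = fulls[-1]
    incrementals = [b for b in usable if b[1] == "incr" and b[0] > last_full[0]]
    return [last_full] + incrementals


# Scheduled start times: FULL on Sunday, INCREMENTAL on the other days.
backups = [
    (datetime(2010, 1, 2, 23, 0), "incr"),  # Saturday (previous cycle, assumed)
    (datetime(2010, 1, 3, 23, 0), "full"),  # Sunday
    (datetime(2010, 1, 4, 23, 0), "incr"),  # Monday
    (datetime(2010, 1, 5, 23, 0), "incr"),  # Tuesday
]
failure = datetime(2010, 1, 6, 9, 0)        # Wednesday morning (hour assumed)

chain = restore_chain(backups, failure)
for taken_at, kind in chain:
    print(taken_at.strftime("%Y-%m-%d %H:%M"), kind)
```

The Saturday incremental is left out because it belongs to the previous cycle, which Sunday’s FULL backup already covers.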

We select the FULL backup of our lost database on the right side, and for the transaction logs we select the INCREMENTAL backup. But there is a catch here. When we select the INCREMENTAL backup of the transaction logs, the date of the backup is shown as Tuesday, 5 January 2010. This is the last INCREMENTAL backup taken for the transaction logs. If we restored only this backup, we would miss the transaction log files created between the last FULL and the last INCREMENTAL backup. Therefore we right-click the /StorageCG/LOGS/Logs line and select Properties (Figure 9).

Figure 9: Properties of StorageCG/LOGS/Logs backup

In the properties window, we have to choose, from the “Backup version” list, the first log backup taken after the last FULL database backup. There is another detail here: the list contains a FULL-labeled backup that was taken just after the FULL database backup (Figure 9). This is the backup of the transaction logs that were created while the FULL backup of the mailbox database was running. Data Protector labels this transaction log backup as “Full” (1/4/2010 12:11:19 AM full).

We select the Full-labeled backup in this window and press the OK button. In the Options tab, we choose where to restore the backups. We check the “Restore to another client” box and select the server name (mb01.mstip.com) from the list. We enter c:\tmp (a path on the target system) in the “Directory to temporary log files” field. We leave the two boxes at the bottom unchecked and press the Restore button (Figure 10).

Figure 10: Options Tab

After some time, Data Protector restores the backup successfully and reports that additional files can be added to the restored files. This message appears because we left the two boxes at the bottom unchecked before the restore. By doing so we tell Data Protector that this restore is one part of a larger restore operation and that more files will follow, so we can add the INCREMENTAL backups to it.

The operation above restored the last FULL database backup and the transaction log files that were produced while that backup was running. The next restore operation is for Monday’s INCREMENTAL backup (1/4/2010 11:05:11 PM incr). We right-click the /StorageCG/LOGS/Logs line, select Properties and choose the 1/4/2010 11:05:11 PM incr backup. After pressing the OK button, we uncheck the database FULL backup box (/StorageCG/STORE/MailboxesCG) so that the database is not restored again. In the Options tab we change nothing and press the Restore button. After the restore, Data Protector again reports that additional files can be added to the restored files.

There is not much left to do in Data Protector. Only Tuesday’s INCREMENTAL backup has to be restored. We right-click the /StorageCG/LOGS/Logs line again, select Properties and choose the INCREMENTAL backup taken on Tuesday night (1/5/2010 11:05:07 PM incr). We close this window with the OK button and, in the Options tab, check the “Last restore set” and “Last Consistent state” boxes. By checking these boxes we tell Data Protector that no more backups will be restored after this operation and that this is the last backup set of the restore. We press the Restore button and, after a while, Data Protector restores the backup set successfully. We are finished with Data Protector, so from now on we can concentrate on the Exchange RSG database.
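
The three restore sessions differ only in a few choices, which the following Python sketch summarizes. It is only a model of the selections made in the GUI as described above, not a call into Data Protector; the class and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RestoreSession:
    log_version: str        # backup version chosen for /StorageCG/LOGS/Logs
    include_database: bool  # whether /StorageCG/STORE/MailboxesCG stays checked
    last_restore_set: bool  # whether the two boxes at the bottom of Options are checked


def plan_sessions(log_versions):
    """Build one session per log backup version, oldest first.

    The database is restored only in the first session, and only the
    final session is marked as the last restore set.
    """
    if not log_versions:
        raise ValueError("at least one log backup version is required")
    last = len(log_versions) - 1
    return [
        RestoreSession(version, include_database=(i == 0), last_restore_set=(i == last))
        for i, version in enumerate(log_versions)
    ]


sessions = plan_sessions([
    "1/4/2010 12:11:19 AM full",  # logs written during the FULL database backup
    "1/4/2010 11:05:11 PM incr",  # Monday
    "1/5/2010 11:05:07 PM incr",  # Tuesday
])
for session in sessions:
    print(session)
```

Marking an earlier session as the last restore set would close the restore chain before the remaining log backups could be added.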

We open the EMC on the Exchange 2007 server and select Toolbox. We double-click “Database Recovery Management” and select “Mount/Dismount Database” on the next screen. The restored database is listed on the following screen and we select it. After we click the Next link, a “Database is successfully mounted” message is shown. We click the Previous link to go back.

Now we can recover the mailboxes. First of all, we have to create a new, empty database to replace the failed one. We connect a new disk to our Exchange server and give it the same drive letter the old disk had before the failure. We also create the same folder structure as before. After these preparations, we go to the EMC, right-click the failed database (on the right side when we select Server Configuration -> mb01.mstip.com) and select “Mount database”. An error is shown after this action. Normally we would not see this error, but we are an exception :). In our situation, the main database file is lost but the related transaction log files (stored on a different, healthy disk partition) are still there. Therefore Exchange cannot match the transaction log files to the new database file. We delete all of the transaction log files and try to mount the database again. This time a warning appears on the screen which says “the database file is not found under the related folder. A new and empty database file will be created instead”. We press the YES button and Exchange creates a new, empty database file in place of the failed one. The upside is that we have a database and users can use their mailboxes. The downside is that they cannot see their old messages.

We have two choices for the recovery. With the “dial-tone recovery” method, we can either merge the restored database into the newly created empty one, or swap the new database file with the restored one. We choose the second option because it is faster.

On the “Database Recovery Management” screen, we click the “Swap databases for ‘dial-tone’ scenario” link (Figure 11). This leads to another screen where we can select the database in the RSG (Figure 12). We have only one database, so we check the related information (server name, recovery storage group name, linked storage group etc.) and click the “Gather swap information” link. The next screen is the last one (Figure 13) and all the information is listed there. There is only one thing left to do: click “Perform swap action”. That is all. It swaps the databases in a second. Strictly speaking, it does not move the databases; it updates the paths of the database files in Active Directory.

Figure 11: Task selection screen

Figure 12: Database selection screen from RSG

Figure 13: Select database swap options screen

After this action, the only thing left to do is to recover the mail that users received during the recovery process. We go back to the task center and click the “Merge or Copy Mailbox contents” link. On the next screen we click the “Gather merge information” link, and on the screen after that the “Perform pre-merge tasks” link. The wizard then searches for mailboxes in the database that is now in the RSG (after the swap, this is the dial-tone database that holds the newly received mail) and lists the users it finds. We select all of them and click the “Perform merge actions” link. It takes a while (depending on how much mail has been received), and when it finishes the wizard has added the newly received mail to the users’ actual mailboxes, merging the new mail with the old.
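
The swap and merge steps can be pictured with a toy model in Python. It treats a database as a dictionary of mailboxes and has nothing to do with how Exchange stores data; the user names, message names and both functions are hypothetical.

```python
def swap(production, rsg):
    """Dial-tone swap: the two databases trade places."""
    return rsg, production


def merge(target, source):
    """Copy every message from source into target, skipping ones already present."""
    merged = {user: list(messages) for user, messages in target.items()}
    for user, messages in source.items():
        box = merged.setdefault(user, [])
        for message in messages:
            if message not in box:
                box.append(message)
    return merged


# Restored from backup into the RSG: the users' old mail.
restored = {"alice": ["old-1", "old-2"], "bob": ["old-3"]}
# Empty dial-tone database that users worked in during the recovery.
dial_tone = {"alice": ["new-1"], "bob": []}

# Before the swap: production is the dial-tone database, the RSG holds the restore.
production, rsg = swap(production=dial_tone, rsg=restored)
# After the swap the RSG holds the dial-tone database; merge its new mail back.
production = merge(production, rsg)
print(production)
```

In this model the swap gives users their old mail back at once, and the merge only has to copy the smaller amount of mail received during the outage.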

After all these actions, users may have to restart their Outlook clients once. That is all :)
