By using Active Directory Rights Management Services (AD RMS) and the AD RMS client, you can augment an organization's security strategy by protecting information through persistent usage policies, which remain with the information, no matter where it is moved. You can use AD RMS to help prevent sensitive information—such as financial reports, product specifications, customer data, and confidential e-mail messages—from intentionally or accidentally getting into the wrong hands.
RMS home page: www.microsoft.com/rms
This white paper discusses the following disaster recovery scenarios, which will help ensure a quick return to a fully functional AD RMS deployment in case of failure.
Access to your sensitive data depends on the continuous availability of various components in the AD RMS system. Each of the AD RMS components has a varying degree of impact on data access. This white paper covers all such potential points of failure, their degree of impact, and mitigation plans.
If an AD RMS cluster node fails while there are other nodes still available in the same AD RMS cluster, the following process will enable full recovery.
On the next screen, provide the password for the cluster key.
In the event that the last node in an existing cluster fails, or all of the nodes in an existing cluster become non-functional, the procedure remains the same as described in section 2.1, except for step 2 as noted below.
Note In both the above scenarios (Recovering from a Cluster Node Failure and Recovering from a Full Cluster Failure), in environments where a Service Connection Point (SCP) for AD RMS service discovery is not registered in Active Directory, the “Join an existing AD RMS cluster” option in the AD RMS installation wizard will be grayed out. To enable this option, you must create the following registry entry.
Key Details
The full registry subkey path for server-side service discovery is:
HKEY_LOCAL_MACHINE\Software\Microsoft\DRMS\
The following table lists the entry that you can add to enable the “Join an existing AD RMS cluster” option.
Name: GicURL
Type: String
Value: http(or https)://server_name/_wmcs/certification/certification.asmx
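The entry above can be applied as a registry script; a minimal sketch, where server_name is a placeholder for your AD RMS cluster's FQDN and https should be used if SSL is configured on the cluster:

```reg
Windows Registry Editor Version 5.00

; Enables the "Join an existing AD RMS cluster" option when no SCP is registered.
; server_name is a placeholder; use http or https to match your cluster configuration.
[HKEY_LOCAL_MACHINE\Software\Microsoft\DRMS]
"GicURL"="https://server_name/_wmcs/certification/certification.asmx"
```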
Note In both the above scenarios (Recovering from a Cluster Node Failure and Recovering from a Full Cluster Failure), if the new AD RMS server names have changed and you want to clean up the old AD RMS server names from the AD RMS management console, you will need to edit the DRMS_ClusterServer table in the AD RMS configuration database using the following steps:
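As a sketch of what the edit involves, assuming the default configuration database name used elsewhere in this paper and a hypothetical ServerName column (verify the actual schema with a SELECT before deleting anything, and back up the configuration database first):

```sql
USE DRMS_Config_rms_domain_com_443;   -- default configuration database name; adjust to yours

-- Inspect the table first to confirm the schema and identify the stale row
SELECT * FROM dbo.DRMS_ClusterServer;

-- Remove the entry for the decommissioned server
-- (the ServerName column and old server name are assumptions)
DELETE FROM dbo.DRMS_ClusterServer
WHERE ServerName = 'OLD-RMS-NODE';
```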
AD RMS cluster nodes cannot be safely restarted. If a node is rebooted, it will not rejoin the cluster until the database is available.
New AD RMS users, or existing users connecting from new computers or devices, will not be able to use AD RMS until connection to the database is restored, as the AD RMS certification pipelines will not be able to perform certification without access to the database. The same applies to existing users whose existing credentials expire, typically after one year from initial certification.
Exchange pre-licensing will not work until database connectivity is restored. Users will have to acquire licenses when consuming content, since the pre-licensing functionality requires obtaining copies of the users' RACs from the AD RMS configuration database. It is possible to configure AD RMS to pre-cache users' RACs to speed up pre-licensing, which will also enable Exchange pre-licensing to continue working when the AD RMS configuration database is not available.
It will not be possible to revoke entities whose GUIDs need to be obtained from the AD RMS databases, such as users' RACs or workstation GUIDs.
Reporting will not be available until the AD RMS logging database becomes reachable.
If the Directory Services Cache database is unavailable, all AD RMS group membership queries will be redirected to the global catalog servers. There is no noticeable reduction in RMS service when this database is unavailable for short periods of time.
During this period, the AD RMS nodes will continue to operate and log operations, but the information generated by logging of AD RMS operations will continue to be stored in each node’s local message queue, and it will be flushed to the database when connectivity to the database server is restored.
In case of an AD RMS database failure, there are two possible disaster recovery scenarios.
Note Do not reboot any AD RMS server until the database operation is restored, unless it is desired to stop the AD RMS service altogether.
Prepare the new database server, which involves the following (refer to Appendix B for more information):
Restore the database to the new SQL server.
NOTE: If a CNAME record was not previously used for the SQL server, create a CNAME record that points to the new server rather than referencing the physical server directly, and perform the following additional steps:
This scenario is most appropriate when the local data center site has failed, or the SQL storage has failed, and you need to bring AD RMS services back online at a remote data center site.
Stop the existing database server. Fail over to the secondary database server (by changing the appropriate DNS server record or using some other redirection mechanisms).
Note For more information on SQL log shipping and exporting the AD RMS databases, see Appendix D and Appendix A.
If for any reason the AD RMS database servers are destroyed and there’s no valid, functional backup or secondary database containing valid data to restore the AD RMS cluster to a valid working state, the following process should be followed:
Figure 4: Exporting TPD file (includes Server Licensor Certificate and AD RMS cluster key)
NOTE: By default, an AD RMS licensing server can issue use licenses only for content for which it originally issued the publishing license. In some situations, this may not be acceptable. Adding a TPD trust policy allows one AD RMS cluster to issue use licenses against publishing licenses that were issued by a different AD RMS cluster. You add a trusted publishing domain by importing the server licensor certificate and private key of the server to trust.
The following are examples of when a TPD trust policy is added to an AD RMS cluster:
Install a new AD RMS cluster:
Delete the existing Service Connection Point from AD as shown in figure 5. This is critical as the existence of a registered Service Connection Point will prevent the installation of a new certification cluster in the same forest.
Figure 5: Deleting AD RMS Service Connection Point (SCP) from AD
Install a new node on a new AD RMS certification cluster with the same AD RMS URLs, pointing it to the new AD RMS database.
If you have decided to use a different AD RMS URL instead of the original one, the following additional actions are necessary:
Import the Trusted Publishing Domain from the existing cluster. This will import the cluster’s private key definition and Server Licensor Certificate, which will enable the new cluster to issue licenses against documents protected with the old cluster as shown in the following figure.
Figure 6: Importing Trusted Publishing Domain file
Re-create any existing Rights Policy Templates using definitions similar to the ones in the old cluster. While importing the TPD will also import definitions of all the existing templates, the existing templates will be imported as Archived templates, not as Distributed Rights Policy templates. So the old templates will be available to the server in order to issue licenses to previously protected content, but new templates will be required for the users to be able to protect new documents.
It is recommended that the DRM folder in each user's profile be deleted via a script, as this will cause clients to begin using the new cluster keys.
In any organization there’s often a need to identify content (typically in the form of documents or email) related to certain proceedings and grant access to those materials to specialized personnel. Another common situation involves the need for recovering information protected by employees without their cooperation, for example, because they no longer work for the company.
AD RMS provides tools and capabilities to regain access to protected documents in different situations, in either an automated or systematic manner or as individual recovery or search operations.
Documents protected with AD RMS can be stored in different locations, among them:
There are three common situations where access to protected information is needed:
The documents containing the information are already in the hands of the persons requiring access.
The documents are known to be located in a certain location but the particular documents containing the information in question are not identified.
There’s a need to proactively identify all documents pertaining to a certain matter and archive them in unprotected or accessible form.
In the first case, which is common when auditors have access to a user's workstation and want to read or unprotect a particular piece of information found on the user's machine, access to the documents can be enabled by making that person, either temporarily or permanently, a member of the SuperUsers group and enabling SuperUsers functionality in AD RMS. Refer to Figure 7.
When a user is a member of the AD RMS SuperUsers group, that user is granted any license he or she requests, so the user can view, copy, or unprotect the content at will. Obviously, this functionality has to be managed in a very controlled way.
Enabling the SuperUsers group:
Figure 7: Enabling SuperUsers group
For additional step-by-step guidance on enabling the SuperUsers group, see Configure the AD RMS Super Users Group.
Another alternative for dealing with this case is to allow a member of the SuperUsers group to perform bulk decryption of all documents in a certain location, and then hand the unprotected documents to the person requiring access. The information can then be indexed and searched using the normal tools for the task.
Considering that the information is likely sensitive, a formal and secure process for dealing with these proceedings needs to be defined.
For this task, Microsoft has published a tool called the AD RMS Bulk Protection Tool which can be used to encrypt files via the command line or, more importantly in this case, unprotect them. The bulk protection tool can be combined with a script to search all protected files in a system and unprotect them, allowing someone performing discovery full access to all the files in the system.
The Bulk Protection Tool can work not only on file shares, but also on emails and attachments stored in a PST. This way emails archived into a PST can also be unprotected in bulk, indexed and searched as needed. Typically, the bulk protection tool will be combined with SuperUser privileges in order to access files or emails in a user’s workstation.
The Bulk Protection tool can be downloaded from http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=f9fbe58f-c175-41d0-afdc-6f160ab809cd.
Figure 8 shows a very simple usage scenario of the Bulk Protection tool.
Figure 8: AD RMS Bulk Protection tool usage
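As an illustrative sketch only: the tool's executable name and switches vary by version, so the RMSBulkTool.exe name and the /U (unprotect) switch below are assumptions that must be checked against the tool's own help output before use. The batch file walks a mapped drive and unprotects each Office document in place, and must run under an account with SuperUsers privileges:

```bat
@echo off
REM Sketch: bulk-unprotect Office documents under a mapped drive or local path.
REM RMSBulkTool.exe and the /U (unprotect) switch are assumptions; verify them
REM with the tool's help output. Run under a SuperUsers-enabled account.
for /r "S:\Protected" %%F in (*.docx *.xlsx *.pptx) do (
    RMSBulkTool.exe /U "%%F" "%%F"
)
```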
When files are stored in a protected SharePoint library, they are stored in the database in unprotected format, and they are only protected when downloaded via the SharePoint interfaces. So a person performing e-discovery only needs to be granted access rights over the SharePoint library in order to be able to perform searches or downloads of protected documents. Alternatively, by granting that person direct rights over the SQL Server database acting as the back-end of the SharePoint library the user will be able to extract the unprotected files directly from the database.
When information needs to be automatically and proactively decrypted for automated e-discovery or archival, the same tools can typically be used to automate the task of unprotecting documents.
Since the Bulk Protection tool can also work with files stored in file shares, it can also be combined with scripts and scheduled tasks, or with the File Server Resource Manager that is part of Windows Server 2008 R2, to automatically create unprotected backups of protected files deposited in the file share. Once unprotected, the files can be accessed and indexed as desired.
Decommissioning allows an RMS cluster to be put into a state that will allow all existing documents to be unprotected. It is normally done only when the use of AD RMS will be fully removed from an organization. To eliminate an AD RMS cluster in situations where other AD RMS clusters will continue to operate, implementing a Trusted User Domain (TUD) is normally a better solution.
The following section provides step-by-step guidance for putting an AD RMS cluster into the decommissioning state.
To complete the decommissioning process, you must modify permissions on the decommissioning pipeline. First, grant the Active Directory Rights Management Services Service Group Read & Execute permissions on the decommission folder. Next, grant Everyone Read & Execute permissions on the decommission.asmx file. The decommissioning pipeline is located in the %systemroot%\inetpub\wwwroot\_wmcs folder, where %systemroot% is the volume on which Windows Server 2008 is installed. For more information, refer to Figure 10 below.
Figure 10: Read & Execute rights for Everyone on Decommissioning pipeline
Configure the Active Directory Rights Management Services-enabled applications on the clients to obtain a content key from the decommissioning service and permanently decrypt the rights-protected content.
After you believe that all of the content has been unprotected and saved, export the server licensor certificate. The AD RMS nodes can then be uninstalled. After uninstalling the last node, confirm that the AD RMS Service Connection Point has been removed from AD. If it has not, it can be removed manually from the Active Directory Sites and Services MMC snap-in or by using the PowerShell interface.
In a worst case DR scenario, the following backups are required:
A backup of the SQL databases (backup frequency for each is noted below):
Configuration DB – A valid backup after each configuration change on the AD RMS cluster is a must.
Directory Services Cache DB – Can be restored to any state, including the empty initial state, as it will be regenerated as needed; hence there is no recommendation on frequency.
Logging DB – Can be restored to an empty state, or to a recent state if it contains information for a period that is of interest for reporting or troubleshooting. If report generation is crucial, then a daily (or more frequent) backup of this database is required. Whichever state it is restored to, it does not affect AD RMS functionality.
A backup of the Trusted Publishing Domain (TPD) – A one-time backup of the TPD taken right after AD RMS is installed in the AD forest. Refer to Appendix A.
The following steps cover how to fully export all existing AD RMS databases in preparation for disaster recovery:
The first step in exporting the AD RMS databases is to export, or ensure you have a backup of, the trusted publishing domain (TPD) as an XML file. The following procedure explains how to accomplish this.
To export the trusted publishing domain from your current AD RMS cluster deployment
This will close the Export Trusted Publishing Domain As box.
Figure 1: Exporting Trusted Publishing Domain (TPD)
The next sequence of tasks in preparing to export the AD RMS databases is to stop dependent services and ensure that any pending activity that would make the databases inconsistent once exported (or when later restored) is resolved. This involves the following procedures:
To stop the IIS (Web Server) service
Ensure the Messaging Queue is Empty
This step explains how to verify that the Microsoft Message Queuing (MSMQ) queue is empty and how to stop the AD RMS Logging Service. AD RMS uses MSMQ on each server in the AD RMS cluster to send information to the logging database. This needs to be done prior to backing up the AD RMS logging database.
Log on to an AD RMS cluster node.
Click Start, point to Administrative Tools, and then click Server Manager.
In the console tree on the left, expand Features, expand Message Queuing, expand Private Queues, expand drms_logging_rms_domain_com_443, and select Queue messages. This will populate the middle pane with the queue messages.
Verify that there are no messages remaining in Queue messages, as shown in the following figure.
Figure 3: MSMQ is empty
Stop the AD RMS Logging Service
Figure 4: Stop AD RMS logging service
To back up the AD RMS databases
Log on to the SQL server computer that hosts and stores your AD RMS databases.
From the Start menu, select All Programs, then click Microsoft SQL Server 2008 and then click SQL Server Management Studio.
This will bring up the Connect to Server dialog box. Ensure that the Server name is correct and that Authentication is set to Windows Authentication and then click Connect.
In the console tree in SQL Server Management Studio, expand Databases, right-click DRMS_Config_rms_domain_com_443, select Tasks, and then select Back Up.
Figure 5: Backup configuration database
This will bring up the Back Up Database – DRMS_Config_rms_domain_com_443 window as shown in the following step.
In the Destination section as shown in the figure below, click Add and select the location.
Figure 6: Backup Configuration Database
Click OK to finish the backup.
Repeat the above steps to back up the logging and directory services cache databases.
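The same three backups can also be scripted in T-SQL rather than performed through the Management Studio UI; a sketch, assuming the default AD RMS database names and a local backup folder that already exists:

```sql
-- Back up the AD RMS configuration, logging, and directory services cache databases.
-- The database names and the D:\Backups path are environment-specific assumptions.
BACKUP DATABASE DRMS_Config_rms_domain_com_443
    TO DISK = 'D:\Backups\DRMS_Config.bak' WITH INIT;

BACKUP DATABASE DRMS_Logging_rms_domain_com_443
    TO DISK = 'D:\Backups\DRMS_Logging.bak' WITH INIT;

BACKUP DATABASE DRMS_DirectoryServices_rms_domain_com_443
    TO DISK = 'D:\Backups\DRMS_DirectoryServices.bak' WITH INIT;
```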
Before pointing the AD RMS cluster to a new SQL database server, the following needs to be done:
For disaster recovery purposes, it is a best practice to refer to the SQL server by a CNAME record and not by the physical server name. This allows the SQL server to be reached by a name other than its proper name when a connection attempt is made. In order to use a CNAME record with SQL Server, the DisableStrictNameChecking registry value must be added and set to 1. This value allows connections to be made to the SQL server by names other than its proper name; by default, SQL Server 2008 will not allow this. Follow the procedure below to implement the registry change:
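As a sketch, the value can be added on the SQL server from an elevated command prompt. The subkey shown is the commonly documented location for DisableStrictNameChecking (an assumption worth verifying for your OS version), and the Server service must be restarted for it to take effect:

```bat
REM Allow incoming connections to this server under an alias (CNAME) name.
reg add "HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters" ^
    /v DisableStrictNameChecking /t REG_DWORD /d 1 /f
```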
This step explains how to enable the firewall rules on the new SQL server. These rules are required to allow the AD RMS cluster to communicate with the SQL Server.
Click Start, select Administrative Tools and click Windows Firewall with Advanced Security.
This will open the Windows Firewall with Advanced Security management console.
Figure 2 : Windows Firewall Advanced Security
On the left, select Inbound Rules and on the right click New Rule.
This will bring up the New Inbound Rule Wizard.
Figure 3: Inbound Rule Wizard
On the Protocol and ports screen, select TCP and enter 445 in the box next to Specific local ports: and then click Next.
Figure 4: Firewall Protocols and Ports
On the Action screen, select Allow the connection and click Next.
Figure 5: Action: Allow the connection
On the Profile screen, select Domain, Private, and Public then click Next.
Figure 6: Rule profile
On the Name screen, enter SQL Server Named Pipes in the box and click Finish.
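The wizard steps above can equally be performed from an elevated command prompt; a one-line sketch that creates the same inbound TCP 445 rule for all three profiles:

```bat
netsh advfirewall firewall add rule name="SQL Server Named Pipes" dir=in action=allow protocol=TCP localport=445 profile=domain,private,public
```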
Table 1 – SQL Server Firewall Port Exceptions
This section explains how to enable the allowed network protocols for the SQL server that supports your AD RMS deployment. This is done so that the AD RMS server can communicate with the SQL database server.
To enable network protocols for the SQL server computer that supports your AD RMS deployment
In SQL Server Configuration Manager, on the left, expand SQL Server Network Configuration and click Protocols for MSSQLSERVER. This will populate the right pane with four protocols and their status.
Figure 8: Protocols for MSSQLSERVER
On the right, right-click Disabled next to Named Pipes and select Enable.
Figure 9: Enable named pipes
This will bring up a pop-up box that says any changes made will be saved but will not take effect until the service is stopped and restarted.
In SQL Server Configuration Manager, on the left, click SQL Server Services. This will populate the right pane with three services and their state.
Figure 10: Stop and Start SQL server service
This step explains how to add the AD RMS Service Account to SQL Logins on SQL server. This allows the service account to connect to SQL server.
In the Select User or Group box, enter domain\service account in the box below Enter the object name to select (examples) and click Check Names. The name should resolve with an underline. Click OK.
Figure 12: Select AD RMS service account
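The equivalent can also be done in T-SQL; a minimal sketch, where DOMAIN\ADRMSSRVC is a placeholder for your AD RMS service account:

```sql
-- Create a Windows-authenticated login for the AD RMS service account.
CREATE LOGIN [DOMAIN\ADRMSSRVC] FROM WINDOWS;
```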
This step explains how to change the CNAME record in DNS. This will allow the AD RMS cluster to point to the new SQL server by canonical name and not by the physical server name.
On the properties page, enter the new SQL server name under Fully qualified domain name (FQDN) for target host: and click OK.
Figure 13: DNS CNAME record for SQL server
For more information, see Change CNAME Record in DNS.
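If you prefer the command line, the dnscmd utility can update the alias; a sketch, where dc01, contoso.com, adrms-sql, and newsql.contoso.com are placeholders for your DNS server, zone, alias name, and new SQL server:

```bat
REM Delete the old alias and re-create it pointing at the new SQL server.
dnscmd dc01 /RecordDelete contoso.com adrms-sql CNAME /f
dnscmd dc01 /RecordAdd contoso.com adrms-sql CNAME newsql.contoso.com
```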
This step explains how to restore the AD RMS databases on a new SQL server.
Figure 4: Locate backup file
This appendix only provides an overview of SQL Server log shipping and how it can be leveraged for quick restoration of AD RMS services in a disaster recovery scenario. Step-by-step guidance on configuring log shipping is outside the scope of this white paper. For more information, see Log Shipping (Database Engine).
You can use log shipping to send transaction logs from one database (the primary database) to another (the secondary database in a remote site) on a constant basis. Continually backing up the transaction logs from a primary database and then copying and restoring them to a secondary database keeps the secondary database nearly synchronized with the primary database. In a scenario where the local site database server fails due to storage failure or natural calamity, AD RMS services can be restored by using the remote database server.
Log shipping consists of three operations:
The following diagram describes log shipping.
The log can be shipped to multiple secondary server instances. In such cases, operations 2 and 3 are duplicated for each secondary server instance.
A log shipping configuration does not automatically fail over from the primary server to the secondary server. If the primary database becomes unavailable, any of the secondary databases can be brought online manually.
In log shipping, three distinct types of server roles are used.
The primary server in a log shipping configuration is the instance of the SQL Server Database Engine that is your production server. The primary database is the database on the primary server that you want to back up to another server. All administration of the log shipping configuration through SQL Server Management Studio is performed from the primary database.
The primary database must use the full or bulk-logged recovery model; switching the database to simple recovery will cause log shipping to stop functioning.
The secondary server in a log shipping configuration is the server where you want to keep a warm standby copy of your primary database. A secondary server can contain backup copies of databases from several different primary servers. For example, a department could have five servers, each running a mission-critical database system. Rather than having five separate secondary servers, a single secondary server could be used. The backups from the five primary systems could be loaded onto the single backup system, reducing the number of resources required and saving money. It is unlikely that more than one primary system would fail at the same time. Additionally, to cover the remote chance that more than one primary system becomes unavailable at the same time, the secondary server could be of higher specification than the primary servers.
The secondary database must be initialized by restoring a full backup of the primary database. The restore can be completed using either the NORECOVERY or STANDBY option. This can be done manually or through SQL Server Management Studio.
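In T-SQL, the initialization looks like the following sketch, assuming the AD RMS configuration database and a full backup file already copied to the secondary server; WITH STANDBY would leave the database readable between restores, while NORECOVERY keeps it in a restoring state:

```sql
-- Initialize the secondary copy of the AD RMS configuration database.
-- The database name and backup path are environment-specific assumptions.
RESTORE DATABASE DRMS_Config_rms_domain_com_443
    FROM DISK = 'D:\LogShipping\DRMS_Config_full.bak'
    WITH NORECOVERY;
```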
The optional monitor server tracks all of the details of log shipping, including:
The monitor server should be on a server separate from the primary or secondary servers to avoid losing critical information and disrupting monitoring if the primary or secondary server is lost. A single monitor server can monitor multiple log shipping configurations. In such a case, all of the log shipping configurations that use that monitor server would share a single alert job.
For more information, see Monitoring Log Shipping.
Log shipping involves four jobs, which are handled by dedicated SQL Server Agent jobs. These jobs include the backup job, copy job, restore job, and alert job.
The user controls how frequently log backups are taken, how frequently they are copied to each secondary server, and how frequently they are applied to the secondary database. To reduce the work required to bring a secondary server online, for example after the production system fails, you can copy and restore each transaction log backup soon after it is created. Alternatively, perhaps on a second secondary server, you can delay applying transaction log backups to the secondary database. This delay provides an interval during which you can notice and respond to a failure on the primary, such as accidental deletion of critical data.
Backup Job
A backup job is created on the primary server instance for each primary database. It performs the backup operation, logs history to the local server and the monitor server, and deletes old backup files and history information. By default, this job will run every 15 minutes, but the interval is customizable.
When log shipping is enabled, the SQL Server Agent job category "Log Shipping Backup" is created on the primary server instance.
SQL Server 2008 Enterprise and later versions support backup compression. When creating a log shipping configuration, you can control the backup compression behavior of log backups. For more information, see Backup Compression (SQL Server).
Copy Job
A copy job is created on each secondary server instance in a log shipping configuration. This job copies the backup files from the primary server to a configurable destination on the secondary server and logs history on the secondary server and the monitor server. The copy job schedule, which is customizable, should approximate the backup schedule.
When log shipping is enabled, the SQL Server Agent job category "Log Shipping Copy" is created on the secondary server instance.
Restore Job
A restore job is created on the secondary server instance for each log shipping configuration. This job restores the copied backup files to the secondary databases. It logs history on the local server and the monitor server, and deletes old files and old history information. The SQL Server job category "Log Shipping Restore" is created on the secondary server instance when log shipping is enabled.
On a given secondary server instance, the restore job can be scheduled as frequently as the copy job, or the restore job can be delayed. Scheduling these jobs with the same frequency keeps the secondary database as closely aligned with the primary database as possible to create a warm standby database.
In contrast, delaying restore jobs, perhaps by several hours, can be useful in the event of a serious user error, such as a dropped table or inappropriately deleted table row. If the time of the error is known, you can move that secondary database forward to a time soon before the error. Then you can export the lost data and import it back into the primary database.
Alert Job
If a monitor server is used, an alert job is created on the monitor server instance. This alert job is shared by the primary and secondary databases of all log shipping configurations using this monitor server instance. Any change to the alert job (such as rescheduling, disabling, or enabling the job) affects all databases using that monitor server. This job raises alerts (for which you must specify alert numbers) for primary and secondary databases when backup and restore operations have not completed successfully within specified thresholds. You must configure these alerts to have an operator receive notification of the log shipping failure. The SQL Server Agent job category "Log Shipping Alert" is created on the monitor server instance when log shipping is enabled.
If a monitor server is not used, alert jobs are created locally on the primary server instance and each secondary server instance. The alert job on the primary server instance raises errors when backup operations have not completed successfully within a specified threshold. The alert job on the secondary server instance raises errors when local copy and restore operations have not completed successfully within a specified threshold.
A Typical Log Shipping Configuration
The following figure shows a log shipping configuration with the primary server instance, three secondary server instances, and a monitor server instance. The figure illustrates the steps performed by backup, copy, and restore jobs, as follows:
The primary and secondary server instances send their own history and status to the monitor server instance.
A fault-tolerant and highly available AD RMS infrastructure gives users the continuous ability to protect and consume rights-protected content. Fortunately, AD RMS is by design fault tolerant to some extent for the protection and consumption of rights-protected content, for the following reasons:
Though tolerant to some extent, in situations where the client needs activation or renewal of machine or user certificates, it still needs connectivity to the AD RMS server. A client will also need to contact the AD RMS server for the initial consumption of a non-prelicensed piece of content, or for consuming a document after any previously acquired license has expired. These situations are not uncommon, and hence the AD RMS infrastructure needs to be designed to be highly available. The following are the three server-side components of AD RMS that need high availability:
AD RMS servers always communicate with AD global catalog (GC) servers for group expansion, so the AD RMS servers must always have access to a GC in order to work effectively. Although AD RMS leverages the local Active Directory cache (on each RMS server in the root cluster and licensing-only cluster) or the shared Active Directory cache database to reduce the response time for licensing requests, availability of the GC is of utmost importance for any new group expansion request or for expired group expansion results in the cache. Because none of these caches is a proper substitute for a GC, and any DC might be out of service at certain times, at least two GCs should be implemented in the same AD site as the AD RMS servers.
AD RMS leverages three databases for its functionality. The various tasks performed by AD RMS using these three databases are listed below:
Considering the above, the partial loss of functionality is in most cases tolerable while the database is unavailable for a short period of time, such as a few hours. But this does not mean that the database itself is not important. AD RMS is not tolerant of database corruption or database disk failures, in which case access to all rights-protected data is lost. This in turn means that high availability of the database is less significant than protecting the database itself from failure (database corruption, disk or hardware failure, and so on). Hence it is very important to choose the right high availability solution. The following are the available solutions for high availability with AD RMS:
NOTE: The most efficient and quickest way of restoring the database and pointing the AD RMS servers to the new database server is to use a CNAME record for the SQL server. For more information on CNAME records in AD RMS, see The Importance of CNAME Record (http://technet.microsoft.com/en-us/library/ff660011(WS.10).aspx).
Considering the above, the following functionality is lost when the AD RMS servers are unreachable.
AD RMS is implemented as a web service on Internet Information Services (IIS) on Windows. Hence, like any other web application, adding additional AD RMS nodes is the preferred method of providing high availability. An AD RMS cluster can be a single-server AD RMS installation or several AD RMS servers installed in a load-balancing environment to handle requests from AD RMS-enabled clients. All the AD RMS nodes of a single cluster point to the same database for creation and retrieval of configuration settings and for logging. Since clients talk to the URL stamped in protected documents to acquire a license, you need to map that URL to the load-balanced IP address. This is why it is important never to configure the AD RMS URL to refer to the physical name of the first AD RMS server in the cluster. Always use a DNS alias (a manually configured host A record) for the AD RMS URL, which clients will point to.
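As a sketch, the host (A) record for the cluster URL can be created with dnscmd, where dc01, contoso.com, adrms, and the address are placeholders for your DNS server, zone, cluster alias, and load-balanced virtual IP:

```bat
REM Point the AD RMS cluster alias at the load-balanced virtual IP.
dnscmd dc01 /RecordAdd contoso.com adrms A 10.0.0.50
```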