Administering VMware Site Recovery Manager 5.0 - Book Review

I received a copy of Mike Laverick's "Administering VMware Site Recovery Manager 5.0". This is a terrific book, and the first book from VMware Press. Mike has been providing terrific guides, white papers, and videos for years on his website RTFM Education.

To some, the organization and presentation of this book may seem unconventional. Chapter 1 describes Site Recovery Manager and DR technologies, and addresses misconceptions about VMware technologies often thought of as DR technologies. Chapters 2 - 6 individually explain how to configure Dell, EMC Celerra, EMC CLARiiON, HP StorageWorks, and NetApp storage for VMware. Chapters 7 - 16 then cover the configuration and operation of VMware SRM. Chapter 1 plus one of Chapters 2 - 6 make this book worthwhile to anyone installing a VMware solution with a SAN.

With my background being long in the tooth with networking and a little green in virtualization, Chapter 1 was the most significant to me. I have been trying to understand the architectural differences and benefits of the different VMware technologies such as vMotion, High Availability Clusters, Fault Tolerance, and SRM.


Chapter 1, What is SRM, How Was DR Done Before SRM, and What VMware Technologies are Not DR

Chapter 1 provides an "Introduction to Site Recovery Manager". The chapter offers what is new in Site Recovery Manager 5.0 and, as Mike puts it, "what life was like before virtualization and before VMware SRM".

The original DR strategy was to have physical servers at the production and DR locations and rely on conventional backup and restore. The next approach brought in virtualized servers; some suggested P2V technologies to synchronize physical servers with virtual servers. Either the production servers or the DR servers were virtualized. This approach requires the use of the storage vendor's replication or snapshot technologies to replicate the files that make up the virtual machines (VMX, VMDK, NVRAM, log, snapshot, and/or swap files). Mike goes on to detail the other technology issues (changing IP addresses) and political issues (the storage group may not be the same people as the virtualization group) which need to be addressed. Again as Mike says, "It was again within this context that VMware engineers began working on the first release of SRM".

Next, this chapter discusses "What Is Not a DR Technology". The VMware technologies discussed provide some terrific benefits, and each has its place, but Mike argues these should not be construed as DR technologies.

As an example, some consider vMotion a DR technology. vMotion allows virtual machines to be moved from one host to another. For vMotion to be even remotely considered a DR technology, virtual servers need to be moved from one physical location to another. It is not acceptable DR design to have a DR data center in close proximity to a production data center (close enough to have your own fiber run, or where I live, to have two buildings along the same potential tornado path). To be considered a DR technology, vMotion needs to support moving virtual machines across some distance (I consider a minimum of 30 miles necessary). Another necessary concept to understand is that a vMotion is a planned event. That is, an administrator must initiate a vMotion, and in a disaster scenario this is often not possible.

Finally (there is a lot in the 1st chapter) there is a discussion of the principles of storage management and replication. Mike does a great job of breaking through the "marketing speak" to generalize on the storage technologies most vendors support. In other words, Ford, GM, and Chrysler each offer Park, Reverse, Forward, and a radio. They may have entirely different methods of delivering these, but they all do it.


The Storage Vendors Chapters

Chapters 2 through 6 are dedicated to configuring specific vendors' storage to work with VMware. Being somewhat new to VMware and also working in a place where I am exposed to multiple storage vendors, I really appreciated these chapters. They are great for those with limited experience configuring VMware to work with different vendors' SANs. Mike provides terrific information for Dell, EMC, HP, and NetApp SAN platforms. While this doesn't cover every storage vendor, the basic principles apply to those not covered.



Installing, Configuring, and Customizing SRM
Chapter 7 explains installing SRM and thoroughly discusses planning and design, storage replication, and networking requirements. New VMware 5.0 features like automated failback, vSphere Replication, and bidirectional protection definitely add to the value and functionality of SRM. This chapter is very insightful for understanding the configuration of protected and recovery sites, storage replication planning and design, and configuring SRM workflow and recovery plans.

Mike walks the reader through the entire installation and configuration process with plenty of screenshots and real-world examples. It is easy to follow along as he builds out an SRM solution. As the solution is built out, the book covers advanced topics like customizations, scripting, and complex configurations.

The final chapter documents upgrading from SRM 4.1 to 5.0, which would be very helpful for readers still running SRM 4.1.



Summary
This is a terrific book from VMware Press. Mike Laverick has provided a well written and organized book. The chapters covering Dell, EMC, HP, and NetApp Storage Arrays are terrific. Administering VMware Site Recovery Manager 5.0 should be on the bookshelf of VMware and Storage admins.


Disclaimer: I received a complimentary copy of this book from VMware Press. I am not being compensated for this review. All views expressed are my own.

Cisco Unity vs. Unity Connection - Installation and Recovery Times

For several years Cisco has offered two Unified Communications voice messaging products: Unity, built on Windows Server, Exchange (or Lotus Domino), and MS SQL, and Unity Connection, built on Linux and Informix.

I just spent 12 hours restoring a Cisco Unity system and thought this would be a good time to discuss the installation and disaster recovery process. I will skip the configuration steps to integrate with the phone system, create voice mail users, etc.


Overview of the Cisco Unity Installation Process


I have been building Cisco Unified Communications Systems (or VoIP systems for the ol'timers) since 2000. Regular Unity has always been a complicated and comprehensive installation. There are many steps including things like "click options 2,3 and 5", "before proceeding to the next step, install this patch on the Exchange server", "if the Partner Exchange server is version 20XX, install Engineering Special ES9".

There are really four installations: the Windows Server install, the MS SQL install, the messaging platform install (either full Exchange for Voice mail only, minimal Exchange components for Unified Messaging, or Domino), and finally the Unity application install.

Thankfully Cisco provides separate Cisco Unity Installation Guides for each option. I have to say, after 12 years of Unity installations, I still always have the guide right in front of me:
  1. Unity Unified Messaging Configuration with Exchange (with Failover Configured)
  2. Unity Unified Messaging Configuration with Exchange (without Failover Configured)
  3. Unity Voice Messaging Configuration with Exchange  (with Failover Configured)
  4. Unity Voice Messaging Configuration with Exchange  (without Failover Configured)
  5. Unity Unified Messaging Configuration with IBM Lotus Domino (with Failover Configured)
  6. Unity Unified Messaging Configuration with IBM Lotus Domino (without Failover Configured)
All of the guides have you install the operating system, SQL, and messaging backend, all of which then have to be patched. The Cisco Unity Server Updates wizard automatically installs the recommended updates. Depending on the server model, this step takes 1.5 to 2 hours. (Prior to the wizard, this was a manual process that often included 93 reboots and took 4 to 8 hours.)

Now that the core components are installed, but before installing the Unity software, the Active Directory schema is extended and special AD accounts are created (unityinstall, unityadmin, unitydir, and unitymsg).

The Permissions Wizard is run to give these new accounts special permissions in Active Directory. Then you have to manually delegate Exchange Administrative control to some of the special Unity AD accounts. In my experience, 90% of the time when Unity fails to work properly after installation, it is due to the Unity accounts and permissions not being properly assigned.


Note - You can see there are extensive changes made to Active Directory, and the Unity accounts have some significant and powerful rights in Active Directory. This DOES make the AD administrators nervous (as it should). After installation, if any of these permissions are removed from the Unity AD accounts, it will break Unity.


Finally the Unity software is installed and connected to the messaging environment.

Total Unity Installation Time: 8 to 16 Hours


Overview of the Cisco Unity Connection Installation Process


There is a single Cisco Unity Connection Installation Guide per Unity Connection version. The installation is very simple:
  1. Put the installation DVD in the server and boot it up
  2. Follow the installation wizard to set IP Addressing, Primary DNS, NTP, Time Zone, DHCP settings, SMTP hostname, and X.509 Certificate information. 
  3. Set the application and operating system usernames and passwords
  4. Identify if the server is the 1st server in the cluster.
  5. To install a Unity Connection High Availability server, follow steps 1-4, but mark the system as the 2nd server in a cluster and enter the information about the 1st system.
  6. Patch the system by uploading the single update file and clicking Install.
  7. If using Unified Messaging, follow the Unified Messaging configuration steps.

Total Unity Connection Installation Time: 1 to 1.5 Hours


    Overview of Cisco Unity and Unity Connection Restore Process


    Unity uses the Disaster Recovery Tool (DiRT). DiRT allows you to back up and restore a Unity system. It is very important that the exact same version of DiRT is used to back up and restore.

    Cisco Unity Restore Process:
    1. You should already have run a DiRT backup and stored the files off-box
    2. Follow the complete installation process from above (8-16 hours). The version of Unity installed must be the EXACT SAME version that was backed up (major version and Engineering Specials)
    3. Install and run DiRT restore (30 minutes to 1 hour)

    Cisco Unity Connection Restore Process:
    1. You should already have run the Disaster Recovery System backup
    2. Follow the complete installation process from above (1 to 1.5 hours). The information entered in installation steps 2 and 3 must be the same.
    3. The version of Unity Connection installed must be the EXACT SAME version that was backed up (major version and updates)
    4. Sign into the Disaster Recovery System and run the Restore Wizard (~30 minutes)
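
To put the two restore processes side by side, here is a rough Python sketch that simply adds up the hour ranges quoted in this post. The dictionary keys and the function name are made up for illustration, and the "~30 minutes" for the Unity Connection restore is treated as a fixed half hour, which is an assumption. It ignores everything outside the reinstall and restore steps (finding the backup, sourcing hardware, testing), so treat the result as a floor, not a real RTO.

```python
# Recovery-window arithmetic using only the hour ranges quoted in this post.
# Each phase is (min_hours, max_hours). Phase names are illustrative labels.
UNITY = {
    "reinstall": (8.0, 16.0),       # full Unity installation
    "dirt_restore": (0.5, 1.0),     # install and run DiRT restore
}
UNITY_CONNECTION = {
    "reinstall": (1.0, 1.5),        # full Unity Connection installation
    "drs_restore": (0.5, 0.5),      # "~30 minutes", assumed fixed here
}


def recovery_window(phases):
    """Return (best_case_hours, worst_case_hours) summed over all phases."""
    low = sum(lo for lo, _ in phases.values())
    high = sum(hi for _, hi in phases.values())
    return low, high


if __name__ == "__main__":
    for name, phases in (("Unity", UNITY), ("Unity Connection", UNITY_CONNECTION)):
        low, high = recovery_window(phases)
        print(f"{name}: {low:g} to {high:g} hours")
```

With these numbers, Unity comes out at 8.5 to 17 hours and Unity Connection at 1.5 to 2 hours.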

    Summary


    Unity Connection has several advantages over Unity:

    1. Much faster installation process, thus much shorter RTO
    2. Automated installation eliminates many steps which could break the system
    3. Less dependent on the Active Directory and Messaging environment
    4. Almost complete feature parity with Unity (Cisco says they almost have feature parity, but I can't find any missing features that I or my customers want). In fact Unity Connection has many features not available on Unity
    5. Unity Connection is under active development with new features every release while Cisco has announced Unity End Of Life
    6. It's just an easier system

    What do you think about Unity vs. Unity Connection?

    Radia Perlman Talk on TRILL and Spanning Tree

    I found this YouTube Google Tech Talks presentation by Radia Perlman. She is often referred to as the "Mother of the Internet". She invented the spanning tree algorithm. She also invented concepts that made "link state routing" stable, scalable, and easy to manage; her link state protocol was later adopted and renamed IS-IS. She is credited with creating the original concept of TRILL.

    Her presentation is titled "Routing Without Tears; Bridging Without Danger". She discusses the creation of spanning tree, link state routing protocols, and finally TRILL, or Transparent Interconnection of Lots of Links. Those of us working with network infrastructure and Cloud Computing can really appreciate everything she has done.




    The Best Solution is the Simplest Solution

    As a consultant I am sometimes brought into, shall we say, challenging situations. Some situations are primarily politically challenging, others are technologically challenging.

    Today I met a technically challenging situation. I am working on a network that is not, on the surface, much different than many others. In this case, the problem is that someone has, from the technology or geek standpoint, created a very complex network. We have OSPF, EIGRP, and static routes. OSPF and EIGRP are redistributing into each other, each is redistributing static routes, and there are back door links on top of that.

    Now, this environment had some challenging networking issues to deal with. However, I keep thinking of my favorite philosophical law, Ockham's Razor: "It is a principle urging one to select among competing hypotheses that which makes the fewest assumptions and thereby offers the simplest explanation of the effect."

    Sometimes the best solution is the simplest.

    Cisco Configuration Tip - 3rd Party SFP Modules

    It is possible to use non-Cisco SFP modules in a Cisco Catalyst switch. By default this is not allowed, but a pair of top secret hidden commands can make this happen.

    switch(config)#service unsupported-transceiver
    switch(config)#no errdisable detect cause gbic-invalid


    The SFP module's EEPROM holds a serial number, vendor name and ID, security code, and a CRC. The switch reads these values and, if they are not "Cisco" values, reports an error such as:


    %PHY-4-UNSUPPORTED_TRANSCEIVER: Unsupported transceiver found in Gi1/0/1
    %GBIC_SECURITY_CRYPT-4-VN_DATA_CRC_ERROR: GBIC in port 65538 has bad crc
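
If you collect switch logs somewhere central, a small script can flag ports where a third-party SFP was rejected. The Python sketch below is only an illustration: it assumes the usual `%FACILITY-SEVERITY-MNEMONIC: text` layout seen in the two messages above, and the `WATCHED_FACILITIES` set and function name are my own choices based on those two messages, not a complete list of transceiver-related facilities.

```python
import re

# Generic layout of the messages above: %FACILITY-SEVERITY-MNEMONIC: text
SYSLOG_PATTERN = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): (?P<text>.*)"
)

# Assumption: only the two facilities shown in this post are watched.
WATCHED_FACILITIES = {"PHY", "GBIC_SECURITY_CRYPT"}


def find_transceiver_errors(lines):
    """Return (facility, severity, mnemonic, text) for each watched message."""
    hits = []
    for line in lines:
        match = SYSLOG_PATTERN.search(line)
        if match and match.group("facility") in WATCHED_FACILITIES:
            hits.append((
                match.group("facility"),
                int(match.group("severity")),
                match.group("mnemonic"),
                match.group("text"),
            ))
    return hits


if __name__ == "__main__":
    sample = [
        "%PHY-4-UNSUPPORTED_TRANSCEIVER: Unsupported transceiver found in Gi1/0/1",
        "%GBIC_SECURITY_CRYPT-4-VN_DATA_CRC_ERROR: GBIC in port 65538 has bad crc",
    ]
    for facility, severity, mnemonic, text in find_transceiver_errors(sample):
        print(f"{facility} (severity {severity}) {mnemonic}: {text}")
```

Matching on the facility name rather than the full message text keeps the filter working if the interface name or port number in the message changes.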


    The official position from Cisco is:
    Q. Do the Cisco Catalyst 3750 Series Switches interoperate with SFPs from other vendors?
    A. Yes, starting from 12.2(25)SE release, the user has the option via CLI to turn on the support for 3rd party SFPs. However, the Cisco TAC will not support such 3rd party SFPs. In the event of any link error involving such 3rd party SFPs the customer will have to replace 3rd party SFPs with Cisco SFPs before any troubleshooting can be done by TAC.

