Continuous data protection system trumps tape backup

Joanne Cummings

17 years ago

Howard Rice Nemerovski Canady Falk & Rabkin, a San Francisco law firm, knows what competing against the big guys is like. Although it has just 135 lawyers and 400 employees overall, its client list includes such heavyweights as Citigroup, Google, HP, the Oakland Raiders football team and Sony Online Entertainment.

Still, the firm didn’t want to have to compete against the many far-larger companies in the earthquake-prone Bay Area when it came to disaster recovery.

“We looked at outsourcing DR (disaster recovery),” says Matthew Reynolds, CIO at Howard Rice. “But . . . we can’t compete against those with deeper pockets. I’d rather be the owner of my destiny.”

At the time the firm began looking at disaster recovery, it also was struggling with a less-than-reliable tape backup system. Overall, Howard Rice needed to back up as much as 12TB of data every night, including its mission-critical Microsoft Exchange e-mail system. In addition, as the amount of data ratcheted up, so did backup time.

“Even though Exchange was well tuned, backups were extending well into the business day” and overall performance was degrading, Reynolds says. “We needed something better,” he says.
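A rough back-of-the-envelope calculation shows why a nightly backup of this size can spill into the business day. The sketch below is illustrative only: the 12TB figure comes from the article, while the eight-hour window and the 200 MB/s sustained rate are assumed values, not numbers reported by Howard Rice.

```python
# Back-of-the-envelope backup-window arithmetic (decimal units: 1 TB = 10**12 bytes).
# Only the 12 TB nightly volume comes from the article; the window length and
# throughput used in the example calls below are hypothetical.

BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6


def required_throughput_mb_per_s(data_tb: float, window_hours: float) -> float:
    """Sustained rate needed to move data_tb terabytes within window_hours."""
    return (data_tb * BYTES_PER_TB) / (window_hours * 3600) / BYTES_PER_MB


def backup_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move data_tb terabytes at a given sustained rate."""
    seconds = (data_tb * BYTES_PER_TB) / (throughput_mb_per_s * BYTES_PER_MB)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed eight-hour overnight window.
    print(round(required_throughput_mb_per_s(12, 8), 2), "MB/s needed")
    # Assumed 200 MB/s sustained rate.
    print(round(backup_hours(12, 200), 2), "hours at 200 MB/s")
```

Under those assumptions, any sustained rate below roughly 417 MB/s pushes the job past the eight-hour window, and the gap widens as data volume grows.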

Real-time replication

About a year and a half ago, Reynolds began scoping out ways to do more real-time disk-based backups. Among his choices were emerging continuous data protection (CDP) appliances that also would allow the firm to have almost real-time disaster-recovery capabilities. After a year of searching, he found InMage Systems’ DR-Scout.

“We looked at everything, and there were shortcomings with every product,” Reynolds says. “InMage was not a proven product and wasn’t widely deployed. But it had the features we needed,” he says.

For example, InMage can replicate any kind of data, be it unstructured files or Exchange, SQL Server and even SharePoint data. “That was a business requirement for us,” Reynolds says. “When we implemented, the SharePoint part wasn’t quite ready, but when I looked at how quickly it was able to get the Exchange-replication component to market, I was impressed. And now it has SharePoint and it’s bulletproof.”

Another deciding factor was InMage’s ability to assign variable recovery-point objectives (RPO) depending on the application.

“We wanted to throttle up or down our overall RPO, based on the system,” Reynolds says. “For Exchange, we wanted essentially near-real-time recovery, but some of our other business systems that are SQL Server-based don’t require that. The RPO could be three hours. We wanted that flexibility.”
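The idea of per-application recovery-point objectives can be sketched as a small policy table plus a check of replication lag against it. This is a generic illustration, not DR-Scout's configuration format or API: the application names, the one-minute target standing in for "near-real-time" and the function name are all assumptions. Only the three-hour figure comes from Reynolds' description.

```python
from datetime import datetime, timedelta

# Hypothetical per-application RPO targets. The three-hour value reflects the
# article; the one-minute Exchange target is an assumed stand-in for
# "near-real-time". The keys are illustrative names, not real system names.
RPO_TARGETS = {
    "exchange": timedelta(minutes=1),
    "sql-business-systems": timedelta(hours=3),
}


def rpo_violations(last_replicated: dict, now: datetime) -> list:
    """Return the applications whose replication lag exceeds their RPO target."""
    violations = []
    for app, target in RPO_TARGETS.items():
        lag = now - last_replicated[app]  # potential data loss if the site failed now
        if lag > target:
            violations.append(app)
    return sorted(violations)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    last = {
        "exchange": now - timedelta(minutes=5),           # five minutes behind
        "sql-business-systems": now - timedelta(hours=2),  # two hours behind
    }
    print(rpo_violations(last, now))
```

In this example the same two-hour lag that would be unacceptable for e-mail is well within tolerance for the SQL Server-based systems, which is the flexibility Reynolds describes.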

Bidirectional replication was key, too. “Replication going in one direction — a lot of vendors do that,” Reynolds says. “But you really need bidirectional, so that once the host site is up and operational, you can replicate that information back from the DR site to the main site. That was the riddle we wanted to solve, and we wanted to do it as transparently as possible,” he says.

Virtualization comes into play

The final criterion was DR-Scout’s ability to support VMware server virtualization. Howard Rice had just moved to a new data center and standardized on VMware, and was looking to build its disaster-recovery site using only virtualized servers.

“Virtualization let us reduce our overall capital investment in server hardware,” Reynolds says. In addition, the firm replaced direct-attached storage with network-attached-storage appliances from Network Appliance for high-density storage and internal redundancy.

In fact, the new data center is about the same size as the old one, but it offers 60% more capacity because it requires fewer servers to support the same number of applications. “Plus, it’s greener. We’re saving on power and cooling because there are fewer servers, so there is less heat output. We wanted to make sure we leveraged those gains at the DR site as well,” Reynolds says.

Reynolds decided to do a proof-of-concept test with InMage using the firm’s Menlo Park, Calif., office as a temporary disaster-recovery site. Because of virtualization, he was able to use just six hardware servers to support 16 critical business systems. He also streamlined costs by using six disaster-recovery servers and one Network Appliance server he retired a year early from his main data center. All that, together with one InMage appliance for the host site and one for the disaster-recovery site, cost just US$200,000 to implement. “The cost of the DR-Scout devices was less than our CommVault [tape backup] vendor, so the math wasn’t hard,” he says.

The benefits quickly became apparent. Rather than relying on tape backups that may or may not restore cleanly, Howard Rice now has nearly instantaneous, bulletproof backup and recovery capabilities. Reynolds’ staffers spend at least 30% less time managing backups and chasing performance problems. It now takes just one engineer a few hours per week to monitor backups and test recoveries. With the success of the proof-of-concept, Howard Rice is now negotiating with a professional colocation provider in northern California, away from the Bay Area quake zone.

“In the past, recovery times were incredibly variable because we would have to go to the offsite vendor, get the tapes, and that would take a day,” Reynolds says. “And then we’d need to get the hardware to restore it to, and once we restored, we had to handle the configuration. Not only would we have [operating system] configuration issues, but also application issues, because inevitably data would be restored to a server with a different name, and apps like SQL Server need to know this. It was a very complex equation.”

Today, however, recovery is nearly instantaneous. “The data is already at our DR site and all we need to do is issue a few commands to DR-Scout, which we’ve scripted, to bring that data online,” Reynolds says. “And that script can be run from anywhere, as long as you have a valid Internet connection.”

The result is that the firm stays up and productive, even when unforeseen glitches occur. “While we have internal redundancy in our servers, and we have dual power supplies and internal redundancy within the drives, we sometimes have issues,” Reynolds says, citing a recent server-memory problem. “Now, instead of fighting the server and maybe having a half day of downtime, we just point people to our DR site. For me as an executive overseeing technology, that’s a nice card to pull out when you need it. We’re a company that’s based on billing time and billing knowledge, and when we can’t do it, there’s a cost factor. With this, we don’t have to pay it.”

Cummings, a freelance writer, can be reached at jocummings@comcast.net.

Comment: edit@itworldcanada.com