There’s nothing like a space program to give the term “mission-critical” that little extra sense of urgency. It isn’t much of a stretch to say that, for the St-Hubert, Quebec-based Canadian Space Agency, all systems are mission-critical, even something as mundane as data storage. Suffice it to say that, with personnel and millions of dollars of assets in orbit a few hundred kilometres overhead, nothing is mundane.
However, the CSA realized it had a problem in 1998, when its systems went down and took an agonizingly long time to come back up. “At the time, we had various types of back-up software and systems,” recalls Robert Dominique, the CSA’s manager for storage and Unix systems. “It was very decentralized, and we started to have problems. The main one was a major crash in early 1998 that took three days to restore.”
With construction of the International Space Station about to begin, and with Canadian astronaut David Williams due to fly the Space Shuttle’s STS-90 mission within the year to oversee a brace of CSA experiments, the agency knew it could ill afford a repeat. Dominique notes that, costly and complex as they knew the project would be, the CSA’s administrators had no choice but to centralize and streamline their storage and back-up systems for the 21st century.
Hierarchical storage
In many ways, the resulting system is as much a marvel of technology as the Space Shuttle’s remote manipulator arm. Employing state-of-the-art storage and networking technology built and integrated by Dell Canada, the CSA’s storage backbone provides performance, capacity, availability and, above all, peace of mind.
“We have had a lot of success since the project was completed,” Dominique says. “We now have almost 100 per cent recovery in a matter of hours.”
The agency’s storage is firmly grounded in the tiered storage concept, although it is only just preparing to deploy hierarchical storage management (HSM) software. First deployed at the main campus in St-Hubert, outside Montreal, the system places mission-critical data, such as current experiments and space operations, on high-performance networked storage: a trio of EMC CLARiiON NS600 and CX600 10-terabyte storage towers supplied by Dell, upgradeable to 35TB.
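In broad strokes, tiering logic of this kind comes down to a placement rule. The sketch below is a minimal illustration only; the tier names, data classes and age threshold are assumptions made for the example, not the CSA’s actual policy.

```python
# Minimal illustration of a tiered-storage placement rule.
# Tier names, data classes and the 90-day threshold are assumptions
# made for this sketch, not the CSA's actual configuration.
from datetime import datetime, timedelta

def place(data_class: str, last_accessed: datetime) -> str:
    """Pick a storage tier from a file's class and age."""
    age = datetime.now() - last_accessed
    if data_class == "mission_critical":      # current experiments, space ops
        return "fc_disk_towers"               # high-performance networked storage
    if age < timedelta(days=90):              # recent project data
        return "sata_backup_servers"          # high-capacity Serial ATA tier
    return "tape_library"                     # older and reference data

print(place("mission_critical", datetime.now()))               # fc_disk_towers
print(place("project", datetime.now() - timedelta(days=400)))  # tape_library
```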
Files and applications are served by 12 Windows, nine Sun and four Linux servers. Back-ups and file archives reside on servers with high-capacity Serial ATA drives in St-Hubert and at the Ottawa office, while older and reference data are backed up to Dell-branded ADIC Scalar 10K 800 network-controlled tape libraries at each of the two locations.
According to Dell Canada server brand manager Bryan Rusche, the automated libraries have the advantage of providing a reliable archive medium while liberating IT staff from the mundane chores of shuffling tape. “Tape is an ideal medium for this,” he says. “It’s cheap and has low data corruption. And generational back-ups can be stored off-site.”
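Generational schemes are commonly implemented as a grandfather-father-son (GFS) rotation. The sketch below shows one conventional variant; the rotation calendar is an assumption, since the article does not specify the CSA’s scheme.

```python
# One conventional grandfather-father-son (GFS) rotation: daily
# incrementals (sons), weekly fulls (fathers) and monthly fulls
# (grandfathers) shipped off-site. The calendar is an assumption.
from datetime import date, timedelta

def generation(d: date) -> str:
    """Classify a back-up date into a GFS generation."""
    if d.day == 1:
        return "grandfather"   # monthly full, stored off-site long-term
    if d.weekday() == 6:       # Sunday
        return "father"        # weekly full
    return "son"               # daily incremental, tapes reused weekly

start = date(2004, 3, 1)
for i in range(10):
    d = start + timedelta(days=i)
    print(d, "->", generation(d))
```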
The whole system is based on a mixed Ethernet local area network (LAN) and Fibre Channel storage area network (SAN) infrastructure. “When we finally decided to link all of these systems, it was quite an experience,” Dominique says. “We started to do all kinds of fancy things, like direct back-up, so we had to build a more solid SAN.”
Most of the critical storage, including the high-performance disk towers, most of the servers in the St-Hubert server room, the back-up arrays and the tape libraries, is connected to a 2Gb Fibre Channel backbone through Brocade switches. The separate Space Operations system is a 1Gb fabric, linked to the rest of the main SAN through a 2Gb connection. Replication between Ottawa and the main base is carried over a 100Mb wide area network connection.
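To make the topology concrete, the toy model below captures the link speeds just described and shows why the WAN hop, not the SAN, caps replication throughput. The node names paraphrase the article’s description; they are not real device identifiers from the CSA’s network.

```python
# Toy model of the fabric described above, with link speeds in Mb/s.
# Node names paraphrase the article; they are not real device names.
links = {
    ("sthubert_servers", "brocade_backbone"): 2000,  # 2Gb Fibre Channel
    ("disk_towers",      "brocade_backbone"): 2000,
    ("tape_libraries",   "brocade_backbone"): 2000,
    ("space_ops_fabric", "brocade_backbone"): 2000,  # 1Gb fabric, 2Gb uplink
    ("brocade_backbone", "ottawa_office"):    100,   # replication WAN link
}

def bottleneck(path):
    """The slowest hop sets the effective ceiling for the whole path."""
    speeds = []
    for a, b in zip(path, path[1:]):
        speeds.append(links.get((a, b)) or links[(b, a)])
    return min(speeds)

# Replication to Ottawa is capped by the WAN, not by the 2Gb SAN:
print(bottleneck(["disk_towers", "brocade_backbone", "ottawa_office"]), "Mb/s")
```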
The pipe is important
The big data pipes aren’t just a concession to the “more and faster” school of IT thought, however. There’s little point in running resource-intensive applications like simulations and modelling, and in investing in high-capacity, high-performance storage, only to pump the data through the networking equivalent of a drinking straw.
“If you’re going to run an application on a database server, all the application intelligence is on the server, but the data — and there could be a lot of it — is on the disk towers,” Rusche says. “That’s where the pipe becomes important. As you add servers and more data to the network, you don’t want the network to be the bottleneck.”
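A quick back-of-the-envelope calculation illustrates the point. Assuming ideal throughput with no protocol overhead, and taking an arbitrary one-terabyte data set as the example, the same transfer that takes about an hour over the 2Gb backbone would take nearly a day over the 100Mb WAN link:

```python
# Idealized transfer times, ignoring protocol overhead. The one-
# terabyte figure is an arbitrary example, not a CSA data set size.
def hours_to_move(gigabytes: float, link_mbps: float) -> float:
    bits = gigabytes * 8 * 1000**3            # decimal GB to bits
    return bits / (link_mbps * 1000**2) / 3600

for name, mbps in [("2Gb Fibre Channel", 2000),
                   ("1Gb Fibre Channel", 1000),
                   ("100Mb WAN", 100)]:
    print(f"1 TB over {name}: about {hours_to_move(1000, mbps):.1f} hours")
# -> roughly 1.1, 2.2 and 22.2 hours respectively
```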
Although the CSA’s original upgrade requirement was for faster, safer back-up, having a high-performance SAN running on the principle of tiered storage has produced real operational benefits and efficiencies. “It’s easier to back up, of course, and it decreases our back-up and restore times, but there’s more to it than that,” Dominique says. “Centralized storage is faster to administer, and we’re able to satisfy users with performance, security and availability. Networked storage is the way.”
The Space Agency also “put a lot of room for expansion in the requirement,” Dominique says, noting that the project continues to be a kind of work in progress.
Dominique expects rapid growth in the agency’s storage requirements and has already purchased storage resource management software to provide dynamic allocation and virtualization. The next step will be to move to a full-blown HSM system in the near future.
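For a sense of what such software automates, the sketch below mimics a basic HSM migration pass: files untouched for a set period move down a tier, with a link left behind in their place. The paths and the 180-day threshold are illustrative assumptions, and real HSM products recall stubbed files transparently, which a symlink only approximates.

```python
# Crude imitation of an HSM migration pass: files untouched for a
# set period move to a lower tier, leaving a link behind. Paths and
# the 180-day threshold are assumptions; real HSM software recalls
# stubbed files transparently, which a symlink only approximates.
import os, shutil, time

ARCHIVE = "/archive"        # hypothetical lower-tier mount point
DAY = 86400

def migrate_cold_files(directory: str, max_idle_days: int = 180) -> None:
    for name in os.listdir(directory):
        src = os.path.join(directory, name)
        if not os.path.isfile(src):
            continue
        idle_days = (time.time() - os.path.getatime(src)) / DAY
        if idle_days > max_idle_days:
            dst = os.path.join(ARCHIVE, name)
            shutil.move(src, dst)
            os.symlink(dst, src)    # stand-in for a true HSM stub

# migrate_cold_files("/data/projects")   # hypothetical invocation
```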