“The darkest hour,” so the saying goes, “is often before the dawn.”
Ontario’s York Region can testify to the truth of this observation.
Five summers ago this municipality was among the regions severely hit by the infamous blackout that gripped southern Ontario and the upper regions of the Great Lakes-area states, leaving 50 million people without power for hours and, in some cases, days. The blackout was caused by a domino-effect power grid surge.
Instead of just lamenting its predicament, York Region found a way to turn it to its advantage.
The blackout provided the impetus for the Region’s foolproof business continuity plan. This program ensures – should a similar calamity occur – that critical systems can be up and running within 24 hours.
“If you [cannot] keep your infrastructure that supports your primary critical business going, you won’t be there tomorrow,” says Loretta Chandler, manager of emergency management for York Region.
That, she says, is not a risk a government can accept – not with the public expecting water to run when they turn on the taps, and 911 to respond when they pick up the phone.
Chandler shared her plan for business continuity with attendees at the World Conference on Disaster Management in Toronto on Tuesday.
“No one ever expected this could happen and be so widespread in Canada and the U.S.,” the York Region official said.
As a result of the blackout, regional councils and the provincial government threw new funding at emergency management programs.
York is upgrading its disaster recovery program into a business continuity initiative and is now halfway through that process, which is being carried out in partnership with the region’s IT department.
Since Chandler assumed leadership of the department, the region has been in compliance with provincial standards. The Emergency Management and Civil Protection Act requires all municipalities in Ontario to create a plan that includes specific staff positions and annual training of all staff.
To eliminate risk, the region called upon its IT department to create a system that would enable all critical systems to be recovered within 24 hours of a disaster.
This recovery deadline was to apply even in what the region described as the “worst-case scenario.”
“Our worst case scenario is if the head office is destroyed at 3:00 pm on a Tuesday and 30 per cent of the staff is lost. What do you do then?” asked Kurt Wintermeyer, the IT lead for disaster recovery planning in the region.
The plan is fundamentally simple: have a recovery location that key staff will respond to and start resuming critical government services.
That means not just having one backup location, but three alternates in total. Actions of a skeleton staff of 17 would be coordinated by a resumption team lead and three coordinators.
Outlined in the plan are all the resources needed to set up a recovery office.
There are lists made up for both external parties – vendors and partners – and internal contacts such as key staff in various departments. There’s another list of all the applications that need to be recovered, and instructions on how to access them.
“Passwords to critical applications are kept off site in locked boxes,” Wintermeyer says. “You’ll need detailed maps to the location of those.”
To serve a population of nearly one million across six towns, the region runs 250 applications. Largely a Microsoft shop with more than 25 terabytes of regularly backed-up data, the IT department has identified that half of those applications are critical.
“We have to keep these systems up or else there’s big trouble,” Wintermeyer says.
All of this is designed to be carried out not by IT support staff but by the various department heads responsible for running services. The business continuity plan has been built on the assumption that no one will be able to receive IT support for at least 24 hours, Chandler says.
“It’s amazing the people that write plans and never confer with the people who are supplying the key dependencies,” the manager says. “People just expect to snap their fingers and IT will be there, just like electricity.”
It’s important to have a discussion with all people involved in a recovery plan, she adds. When the available resources are made known across an organization, staff take more ownership of their plans in the event of a real disaster.
Transforming the business continuity plan from a remote theory into a part of daily practice is a challenge, Wintermeyer says. To prepare staff to handle the technical systems when they absolutely need to, the region has trained and re-trained them.
“It’s a continual battle to make sure that everyone knows what they’re doing,” the technician says.
Despite being only halfway through the plan, Wintermeyer is confident his team is on track. The team expects to return to the annual international conference in a year or two and collect an award for its work.
In the meantime, residents of York Region might feel a bit more confident that their lights will stay on.