By Alizabeth Calder
According to research by Rightscale.com, 91 per cent of us are using cloud-based service providers, and we are spending 24 per cent more on cloud solutions than in 2018 (1).
Cloud and SaaS providers are not immune to issues of skills and procedural gaps that cause problems, but we seldom consider the risk of a provider outage. The cost per hour of a cloud outage can exceed $1 million (2).
For those who missed it, Salesforce had a three-day outage recently (3), making them the new poster child for cloud-provider failure. Interesting that this incident should happen to the company that led the cloud-based-reliance movement in the sales, marketing and contact management space.
The situation seems to have resulted from a failure of either testing or oversight (i.e., it was a preventable problem). The result was that all individual Salesforce users in a company were able to see and edit all of their company’s data, regardless of permissions. Although the company has not yet put out official statements, the sequence of events seems to be:
- As a result of a faulty script, users inside a company’s “instance” of Salesforce were re-set to have full Administrator access to all the company’s Salesforce records (4). (To be clear, the problem was contained within each separate Salesforce client company – cross-company access does not appear to have been an exposure.) In today’s world of AI-based cybersecurity tools, one might have expected that any change over-writing the client-defined access privileges should have set off a few bells and whistles.
- When the problem was detected, Salesforce (correctly), in an abundance of caution, restricted all user access (5).
- The problem was traced to Pardot (Salesforce’s marketing enablement platform), so restrictions were (again correctly) applied to every company operating in an environment where any client company had Pardot. Non-Pardot customers were eventually cleared of any impact and were re-enabled. (3)
- Salesforce did an admirable job of trying to re-set all customer privileges to their prior state by running scripts to restore access (so client-companies did not have to do the recovery manually).
- Salesforce declared some level of victory late on Sunday, but on Monday thousands of customers had still not been restored and had to manually re-establish the correct permission levels for all users. (4)
- Salesforce did a solid job of communicating status and progress through the weekend, but by Monday update calls were being delayed and cancelled, leaving those final customers in the dark. (4)
To be clear, we can give Salesforce a solid B+ on their response to this incident. There are, however, some interesting object lessons to be taken:
- The new age of communication – Salesforce CTO Parker Harris took to Twitter to apologize – “To all of our Salesforce customers, please be aware that we are experiencing a major issue with our service and apologize for the impact it is having on you”. (5) In the old days when a technology provider really messed up, the CEO apologized to their clients. Is this the new standard for how our service providers address us?
- The new world of dependence – Companies using Salesforce have probably moved to a significant if not complete dependence on this platform to support sales. (I suspect some salespeople are going to go back to keeping a local source of contacts and opportunities, in self-defence.) If this incident isn’t blamed for missing clients’ sales commitments in the next round of results calls, the only reason will be that it did not happen at the end of a month or a quarter. Even then, it may have impacted some business results…
- A new level of folly in critical oversight – Third party oversight, including Cloud and SaaS providers, is a critical part of cyber security governance, but this outage won’t be covered by a standard cyber insurance policy. Yet, according to supply chain research, nearly half of IT leaders lack confidence in business partner security postures and 25 per cent do not evaluate partner cyber security. (2) Responsibility is clear – liabilities have yet to be calculated.
While the liability-dance between Salesforce and its customers has yet to play out, business leaders from the CIO all the way up to the Board may need to be a little less blasé about the third-party providers they believe are looking after things.
As an IT executive, the fact that Salesforce owned the problem last weekend certainly made my job easier, but at the end of the day I still own the business’ access to critical systems and capabilities. Thanks for the Twitter, Parker, but I’m waiting for the one that says you’ve added some privileged access monitoring and improved your testing protocols.
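For readers wondering what "privileged access monitoring" might look like in practice, here is a minimal sketch in Python. It is an illustration only and not anything Salesforce has described: the role names, the snapshot format (a simple user-to-role mapping), the 20 per cent threshold and the function name `detect_mass_escalation` are all assumptions made for the example. The idea is simply to compare permissions before and after a change and raise an alert when an unusually large share of users has suddenly become administrators.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical role label; a real platform would have its own permission model.
ADMIN = "admin"


@dataclass(frozen=True)
class EscalationAlert:
    escalated_users: list  # users who gained admin rights between snapshots
    fraction: float        # share of all users in the "after" snapshot


def detect_mass_escalation(before: dict, after: dict,
                           threshold: float = 0.2) -> Optional[EscalationAlert]:
    """Compare two user -> role snapshots taken around a change.

    Returns an alert if the share of users who newly hold the admin role
    meets or exceeds `threshold` (an assumed, tunable value), else None.
    Users absent from `before` who appear as admins also count as escalated.
    """
    escalated = sorted(
        user for user, role in after.items()
        if role == ADMIN and before.get(user) != ADMIN
    )
    fraction = len(escalated) / len(after) if after else 0.0
    if fraction >= threshold:
        return EscalationAlert(escalated, fraction)
    return None


# Illustrative data only: three of four users are silently promoted to admin.
before = {"ana": "admin", "raj": "read_only", "li": "standard", "sam": "standard"}
after = {"ana": "admin", "raj": "admin", "li": "admin", "sam": "admin"}

alert = detect_mass_escalation(before, after)
if alert is not None:
    print(f"ALERT: {len(alert.escalated_users)} users gained admin access "
          f"({alert.fraction:.0%} of users): {alert.escalated_users}")
```

A check like this could run as a gate after any script that touches permissions, on the provider's side or, where the platform exposes permission data, on the customer's side. The threshold is a judgment call: too low and routine promotions trigger noise, too high and a partial overwrite slips through.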
________
1. https://www.rightscale.com/blog/cloud-industry-insights/cloud-computing-trends-2019-state-cloud-survey
2. http://www.tripwire.com/company/research/tripwire-2016-supply-chain-survey/
3. https://www.crn.com/news/security/massive-salesforce-outage-resolved-with-gradual-access-restoration
4. https://www.theregister.co.uk/2019/05/20/salesforce_outage_continues/
5. https://marketingland.com/salesforces-pardot-went-down-for-15-hours-exposing-data-in-the-cloud-261257
6. https://www.crn.com/news/security/-major-salesforce-outage-whacks-firm-s-marketing-automation-customers