Proper Procedures, Smart Planning Ensure Assets Remain Up and Running
Piper Alpha started off as an oil platform in 1976 and was later converted to gas production before a July 6, 1988 disaster left it at the bottom of the North Sea. The disaster is well documented and remembered by the industry’s old guard: a series of mistakes cascaded into a catastrophe that left 167 men dead, 61 survivors and a loss of $3.4 billion.
It didn’t have to be that way.
The disaster had several causes, starting with the two condensate pumps, designated A and B. On the morning of 6 July, Pump A’s pressure safety valve was removed for routine maintenance. Crew members planned to overhaul the pump, but had not started the work. They temporarily sealed the open condensate pipe with a disc cover. Because the crew did not complete the work by 6 pm, the disc cover remained in place. The on-duty engineer filled out a permit that stated Pump A was not ready and no one should switch it on under any circumstances.
The engineer failed to inform the on-duty custodian of Pump A’s condition. Instead he placed the permit in the control center and left. This permit disappeared. Meanwhile, there was another permit issued for the general overhaul of Pump A; that work had not yet begun.
Just over three hours later, because of problems with the methanol system earlier in the day, hydrates started to accumulate in the gas compression system pipework, causing a blockage. Due to this blockage, condensate Pump B stopped and they could not restart it. As the entire power supply of the offshore construction work depended on this pump, the manager had only a few minutes to bring the pump back online, otherwise the power supply would fail. A search ensued through the documents to determine whether they could start condensate Pump A, which would alleviate the problem.
They found the permit for the overhaul, but not the other, more important permit that said the pump must not start under any circumstances due to the missing safety valve. The valve was in a different location from the pump, and because the permits were sorted by location, the two permits were in different boxes. None of those present was aware that a vital part of the machine was missing. Documentation was not at workers’ fingertips. The manager assumed from the existing documents that it would be safe to start Pump A.
That was the birth of a catastrophe.
Pump A switched on. Gas flowed into the pump, and because of the missing safety valve, produced an overpressure which the metal lid could not handle. Gas leaked out at high pressure and triggered six gas alarms including the high level gas alarm. Before anyone could act, the gas ignited and exploded.
Understanding Assets
While the Piper Alpha disaster happened 25 years ago and current technology was not around to save it, this is a perfect case for a proven standard operating procedure (SOP) that everyone is aware of, adheres to, and follows. Having all information on all assets in front of managers can help avert a disaster.
“It is about asset information management,” said Gonzalo Merchan, director of global energy at EMC. “It is all about improving critical processes to minimize those unscheduled shutdowns, to optimize on how you deal with regulatory agencies, and minimize the risk of a catastrophic event. It is critical for companies to be able to manage that process adequately because, if it is not managed properly, it is increasing the risk of a catastrophic event; but also because regulators are demanding you show how you are managing the process. Both from the minimizing risk and ensuring regulatory compliance standpoint, it is critical to show you have a process in place to manage these SOPs adequately.”
Easier said than done, though. Often, as with Piper Alpha, paperwork just doesn’t cut it. That is why a program that shows all changes and all orders, and is available to users with a simple mouse click or two, becomes a solid safety factor.
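To make the point concrete, here is a minimal sketch of a permit register indexed by the equipment a permit affects rather than by the location where the work takes place, which is how the two Pump A permits ended up in different boxes. The class names, permit IDs, and location labels are hypothetical and for illustration only; this is not how any particular vendor’s product works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permit:
    permit_id: str          # hypothetical identifier
    location: str           # where the work physically takes place
    affected_assets: tuple  # every asset whose state the work changes
    blocks_start: bool      # True if the asset must not be started
    note: str


class PermitRegister:
    """Indexes permits by affected asset, not by work location."""

    def __init__(self):
        self._by_asset = {}

    def issue(self, permit):
        # File the permit under every asset it affects.
        for asset in permit.affected_assets:
            self._by_asset.setdefault(asset, []).append(permit)

    def permits_for(self, asset):
        return list(self._by_asset.get(asset, []))

    def safe_to_start(self, asset):
        # An asset is startable only if no open permit blocks it.
        return not any(p.blocks_start for p in self.permits_for(asset))


register = PermitRegister()
# Illustrative entries loosely modeled on the two Pump A permits.
register.issue(Permit("PTW-1", "pump area", ("Pump A",), False,
                      "General overhaul, work not yet begun"))
register.issue(Permit("PTW-2", "valve area", ("Pump A",), True,
                      "Pressure safety valve removed; do not start"))

for p in register.permits_for("Pump A"):
    print(p.permit_id, p.location, p.note)
print("Safe to start Pump A?", register.safe_to_start("Pump A"))
```

Because both permits are filed under Pump A, a single lookup returns the overhaul permit and the do-not-start permit together, regardless of where the valve work was carried out.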
“SOPs are critical. Every operation, whether it is maintenance or operations, requires an approved standard operating procedure,” Merchan said. “Companies have difficulty managing those SOPs.” It also isn’t just about a potential disaster looming. It can also deal with everyday nuances of keeping the system up and running. “On an upstream asset, which is remote by default, shutting it down has massive financial implications, as well as significant operational effects when the subsea wells, platforms, hubs are all tied together,” said Steve Elliott, director of Triconex product management at Invensys Operations Management.
“Being able to access the information quickly and receive the latest updated information (is important),” Merchan said. “One of the things users still struggle with is knowledge workers spending 40 to 60% of time searching for information and that increased probability of unscheduled shut down because people cannot find the information; and once they find it they are not sure they are looking at the latest version of the file they need to see. Just providing secure access to the latest information of the electronic form is a very positive thing in terms of decreasing the risk of an unscheduled shut down, which is what users worry about.”
Understanding and being able to view what the system is doing from a control standpoint is vital and remains a top priority on the platform, but other issues arise in today’s Internet age when a safety system is also visible and, in theory, available to the outside world.
Securing Safety
The level of attacks against critical infrastructure installations continues to rise, and the pressure to ensure safety systems stay up, running, and ready to act at a moment’s notice is greater now than ever before.
“There are more cyber attack attempts today than there were in the past,” said Luis Duran, safety system product manager at ABB. “But then at the same time, there have been a lot of safety systems that have been put on the networks. So, systems put in place, which were not connected to the network in the past, were not equipped to handle the attacks we are experiencing today. So, they are more vulnerable. If you look at a new greenfield situation, they are equipped with protection in place. Because we are more aware, we are doing more than before.”
The numbers back Duran up. The number of attacks reported to the US Department of Homeland Security (DHS) cyber security response team grew by 52% in fiscal year 2012, which ended in September, according to a report. There were 198 attacks brought to the agency’s attention last year, several of which resulted in successful break-ins.
The energy sector was the most-targeted field, with 82 attacks, and the water industry reported 29 attacks last year. Chemical plants faced seven cyber attacks, and nuclear companies reported six.
Those are only the attacks that we know about, though. There are companies that make the decision not to report incidents, and the majority of cyber attacks go undiscovered, according to industry researchers.
DHS said the nation’s infrastructure is vulnerable. Using a special search engine that finds Internet-connected devices, researchers from security advocacy group InfraCritical located more than 500,000 devices across the country that appeared to tap into key control systems. They brought their list to DHS, which began investigating — and confirmed 7,200 devices on it do appear to link to critical control systems.
“There are more threats than there ever were before,” said Lee Neitzel, Senior Technologist Systems/Project Engineering at Emerson Process Management. “I would not say they are more vulnerable today than they were before simply because there are newer threats; we have gone to much greater efforts to make sure our systems are more secure than they were 10 years ago.”
“The world of process safety has been shaken by the twin development of Internet connectivity and terrorism,” Elliott said. “Cyber attack and terrorism have now moved into the automation and process safety stage, making the landscape of integration far more complicated. Because digital communication technology is at the heart of these advances, the threats will only get greater, especially considering that digital communications extend into the field. Digital technologies, for example, are being deployed from the ground up in the form of smart field devices that send plant health diagnostics data onto the plant-wide network. So the entry points for a ‘threat’ are becoming more, not less, and the opportunity for an attack to spread through an asset, more not less.”
Integrated or Independent
Securing a safety system plays into the continuing debate of whether these systems should be independent of the control system, interfaced, or integrated into the control system. In short, an integrated control and safety system brings together the process control and safety control systems. The independent system separates the safety instrumented system (SIS) from the process control system. The two independent systems allow for a better opportunity to optimize safety and process controls. An integrated system allows for better communication between the two systems. The interconnection or interface of the two systems allows for direct connections between the two via a communications protocol, router, or gateway.
“The interfaced systems are more vulnerable because you have to rely on more pieces to block any potential attack,” Duran said. “In the integrated environment you have things that have been pretested and you have factors to secure the system all together. At the end of the day, it is all about how you implement and secure the system at the application level. I think the non-integrated will require more work to be secure. More companies are leaning toward an integrated safety system. The more risk averse oil and gas companies are leaning toward an integrated system.”
“The challenge is to be sure that in the event of an upset condition, one system doesn’t corrupt the other,” Elliott said. “Where safety systems were previously kept separate and interfaced to the system, today’s move toward integrated control and safety systems, in which both reside on the same control network, renders them more exposed and more vulnerable to attack. The move toward common engineering tools for both the control and safety systems means that if a threat enters the system, it can get to areas of the system that weren’t previously available when the control and safety systems were deliberately isolated from each other.”
Security Standard and Safety
When it comes to securing the safety system, there are plenty of standards that do come into play. One of those standards is the WIB security standard, which outlines a set of specific requirements focusing on cyber security best practices for suppliers of industrial automation and control systems.
Launched by Shell and a number of other end users, the standard focuses on vendor capabilities, not just system capabilities.
“They said to the vendors: not only do you need security mechanisms in place, you also need to harden your system when you put them in, and when you maintain it you need to keep that hardening,” Neitzel said.
“There are a number of architectural requirements in the standard about segmentation from a security standpoint between the SIS and the control system, regardless of who made it. If you abide by those requirements, I don’t think it makes any difference who actually built the safety systems and the control system. From an implementation standpoint, if you bought a certified system made by Emerson or Honeywell, Yokogawa, or Siemens, then you will know you will be getting the full security that is applied to the control system; and it will also be applied to the safety system if they have one. The security aspects keep safety traffic separate from control traffic. That is the fundamental objective, as well as the standard protection mechanisms. The WIB standard says the safety system has to remain separate from a security standpoint, so there would have to be firewalls in between and gateways. So it just reinforces the safety requirement of being separate,” Neitzel said.
Technology and standards set the stage for the next element factoring into the security equation: the human factor. Users need to stay alert and ready for any kind of security intrusion. They also must be able to communicate to ensure that everyone is working from the same script.
As Duran said, no single failure ever caused a major accident, but rather it was a coincidental failure of several barriers. Technology is out there to secure a safety system to a certain degree, but the big issue then becomes the human factor.
Human Touch
“Security is as much a people issue as it is a technology issue,” Neitzel said. “There is no way around that. We can put in place all the security mechanisms you want, but it is up to the people who run the system to make it secure and keep it secure. Hardening the systems is really a people issue.”
“Effective security consists of three main factors: People, process, and technology,” Elliott said. “People includes the staff, procedures, skills, competencies, deployment, policies, and training. Process includes security planning, real-time monitoring, incident response, audit requirement, and near miss investigation. Technology includes performance, reliability, and integration. It is all too easy to focus on the technology and forget about the other two factors, but adequate protection is the sum of the parts.”
The IT industry has been vigilant in carrying out risk and threat assessments from inception to design to deployment, and finally to implementation of IT networks. Security professionals use a series of standards to guide them.
The same can be said for manufacturers like oil, gas and petrochemical producers, who have carried out similar procedures to identify and evaluate potential risks to personnel, equipment and the environment, including structured and systematic safety, risk and threat assessments; Hazard and Operability (HAZOP) analysis; Layer Of Protection Analysis (LOPA); Safety Integrity Level (SIL) determination and validation. Those activities revolve around functional safety standards.
As threats today become more and more malicious, targeting automation systems, their continued evolution shows that the industry needs to fight them with the combined force of IT cyber security and engineering functional safety.
“It is no longer adequate to consider that each possible hazard could come just from equipment failures, fires, floods or other events within the plant/facility. It is now possible that hazards can be initiated from outside of the plant, some of which would never be considered if viewed from just an engineering perspective,” Elliott said. “Even if they have been considered, the analysis might already be out of date because the threat to any system is not a constant. It is a continuous evolution.” OE Review
Gregory Hale is the Editor and Founder of Industrial Safety and Security Source (ISSSource.com). He is also the co-author of the book, “Automation Made Easy: Everything you wanted to know about automation – and need to ask.”