It’s 4:00 p.m., and Ankith, supervisor of rotating equipment at one of the US natural gas processing facilities, is slowly wrapping up the day.
As he glances over the day’s performance metrics and reviews equipment status, he notices that the site’s monitoring system has detected a deviation on a gas turbine. Specifically, the vibration on the low-pressure rotor spiked from a typical 10 mm/sec to as high as 25 mm/sec.
Vibration is never an easy problem to troubleshoot and such an issue is certainly worrying.
Ankith opened Industrial Canvas in Cognite Data Fusion® and began his investigation into the root cause.
First, he pulled up a 3D model to verify the locations of the accelerometer probes. Then, he checked the OEM specifications in the O&M manuals. At 25 mm/sec, the vibration was dangerously close to the limit that could cause a forced outage. As each such incident can easily lead to more than $300,000 in costs and there are typically more than 10 of them in any given year, the ability to do root cause analysis for this small segment alone can easily provide more than $3 million in yearly savings.
Is something going on in the lube oil system? Is cooling not working properly? What about the chip detector? Within minutes, Ankith pulled the necessary P&ID diagram and displayed key operating parameters.
What about the latest maintenance reports? When was the last alignment done?
Is the oil in good condition?
A few clicks later, he had a complete picture of all drivers that were affecting the issue.
Good news: things looked OK.
Ankith’s last check was to compare the vibration on similar equipment and trend historical values in relation to power output. He quickly set up the necessary analysis in Charts and finally came to a resolution: a malfunctioning sensor was triggering the alarm! The exact same situation had happened a month earlier at a nearby facility.
Ankith issued a corrective work order and called it a day. Within 45 minutes, he had completed every step: pulling the relevant data together, performing a thorough analysis, finding the root cause, and determining the corrective action.
But it wasn’t always like that.
As an industry veteran, Ankith had a career that spanned all the stages of digital transformation, from the advent of the cloud to new IoT sensors and instrumentation to machine learning models that could predict certain aspects of the future. He cut his teeth trying to digitalize his plant’s operations and maintenance workflows.
Ankith’s team had run a number of proofs of concept and digitalization initiatives in recent years. They had worked with OEMs and AI startups, and had even built a decent-sized data science team to develop their own algorithms. But the results were underwhelming: they now have several models doing forecasting for critical equipment like turbines, compressors, and pumps, as well as a simple BI dashboard to display high-level plant KPIs. But did it bring the change they hoped for?
Well, not really.
Sure, some models are useful and can be helpful indicators to consider in daily O&M activities. Yet, two major issues persist:
- First, these models work on a small subset of data that usually comes from a historian database. Maintenance reports and inspection data are still outside of the decision-making loop.
- Second, the inner workings of these models are hard to grasp even for skilled engineers with deep experience in the field. There were cases where algorithms showed a warning and recommended replacing an electric motor that was still operating properly. In other cases, the algorithm missed certain indicators that were much more urgent.
How the industry focused on the wrong side of the O&M workflow
Overall, there are two main approaches to Predictive Maintenance in heavy asset industries.
Original Equipment Manufacturer (OEM) Approach:
With the advent of Big Data and cloud computing, industrial OEMs (Siemens, GE, Honeywell) were the first to spot the opportunity ahead of them. With direct access to all the data from the equipment and thousands of subject matter experts in-house, OEMs were in a good position to offer a best-in-class predictive maintenance solution.
As Jeff Immelt, former CEO of GE, famously said, "If you went to bed last night as an industrial company you're going to wake up a software & analytics company."
Spoiler alert: This didn’t solve the problem. Despite billions of dollars invested, none of the OEMs can boast a leading Predictive Maintenance offering. They all revamped their platforms, went through a few M&As, and eventually shifted the focus.
What was the approach and why didn't it work as expected?
OEMs know the full breadth of failure modes and the expected life of the equipment they manufacture. Their tactic was to provide comprehensive failure mode monitoring that would indicate the likelihood of a given failure. Take the aforementioned vibration alarm on the gas turbine: here, the OEM would build a catalog of such issues and proactively monitor the drivers leading to them. They even coupled that with a set of performance upgrades to indicate when an upgrade would be needed.
But this approach didn’t work out as planned, mainly for two reasons:
- Even OEMs struggled to access and incorporate the other relevant data sets used in decision-making. Every project meant point-to-point integration with ERP systems, various customer databases, and, most importantly, other OEMs’ data. That wasn’t scalable (or profitable) without solving the fundamental industrial data problem.
- Building scalable software is hard in itself. Recruiting talent, defining go-to-market, developing and maintaining software products all turned out to be more difficult endeavors than initially thought.
Digital Software Startups Approach:
Seeing how industrial OEMs struggled to become software companies, many startups spotted a chance to capture the market. Experienced with recommendation algorithms for Netflix or fraud detection cases in the banking sector, they rushed into the world of heavy asset industries.
The approach was simple:
1. Gather as much data as possible from historians and various IoT sensors
2. Define the healthy operating condition of a given machine
3. Perform a statistical analysis on the past data in order to determine factors leading to failures or abnormal operation
4. Combine steps 2 and 3 into a forecasting model (often based on linear regression) that would detect deviations from the steady state and estimate time to failure
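To make the approach above concrete, here is a minimal Python sketch of the core idea: summarize a healthy baseline, flag readings that deviate from it, and fit a straight line to recent readings to extrapolate when a limit would be reached. Everything in it is an assumption for illustration: the function names, the sample vibration values, the three-standard-deviation threshold, and the 30 mm/sec limit are hypothetical, and it uses a single signal with ordinary least squares rather than any particular vendor's model. The statistical analysis of failure drivers is not covered here.

```python
from statistics import mean, stdev


def healthy_baseline(readings):
    """Summarize the healthy operating condition as a mean and a spread."""
    return mean(readings), stdev(readings)


def find_deviations(readings, baseline_mean, baseline_std, k=3.0):
    """Return indices of readings more than k standard deviations above baseline."""
    threshold = baseline_mean + k * baseline_std
    return [i for i, value in enumerate(readings) if value > threshold]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def hours_to_limit(hours, readings, limit):
    """Extrapolate the fitted trend to the limit.

    Returns None when the trend is flat or falling (no crossing predicted).
    """
    slope, intercept = fit_line(hours, readings)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - hours[-1])


# Hypothetical hourly vibration readings in mm/sec (made-up values).
healthy = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9]
recent_hours = [0, 1, 2, 3, 4, 5]
recent = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

base_mean, base_std = healthy_baseline(healthy)
flagged = find_deviations(recent, base_mean, base_std)
# The 30 mm/sec limit is an assumed value, not an OEM specification.
remaining = hours_to_limit(recent_hours, recent, limit=30.0)

print(f"baseline: {base_mean:.2f} +/- {base_std:.2f} mm/sec")
print(f"deviating samples: {flagged}")
print(f"estimated hours until limit: {remaining}")
```

Note that a sketch like this sees only the time series. It has no way to tell a genuine mechanical problem from a malfunctioning sensor, which is exactly the kind of context that maintenance reports and peer-equipment comparisons provide.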
But how did this perform? While there were a couple of mostly isolated (one piece of equipment, one customer) success stories, there is no real winner in this space, as nobody has deployed a fully scaled solution.
Reasons?
- Purely data-driven methods cannot fully reflect the intrinsic complexities of heavy machinery, especially since time series data, although powerful, cannot be the only source of truth for many industrial processes.
- Lack of training data, especially for failures, led to imprecise results. While the consumer sector provides large datasets to work with, there is no rich and extensive equivalent in the energy or manufacturing industries.
- Scalability. The effort to convert a single model into a repeatable solution with proper data pipelines and hygiene turned out to be too high.
The solution
While Predictive Maintenance has received all the attention from OEMs and software startups alike, little attention has been paid to helping Ankith and his colleagues when actual alerts or failures occur. Every day, there are a number of alarms and corrective maintenance jobs to be done at every industrial facility in the world, each needing triage, problem understanding, and resolution.
The status quo, in which engineers spend hours just collecting the relevant information and trying to understand the situation, can’t continue. Data must be treated as a first-class citizen and be available to support daily operations.
So what did Ankith do?
After trying many digital initiatives and scouting the market, Ankith came across Cognite Data Fusion®, the leading industrial data and AI platform. With Cognite’s QuickStart offering, he was able to get a gas turbine fully set up for RCA and corrective maintenance in 4 weeks, and then to scale to 5 data sources and thousands of pieces of equipment across the site in just 6 weeks. Within 8 weeks, he had onboarded users and was realizing value through improved productivity and smarter decision-making.
Interested in learning more?
- Visit cognite.com
- Check out our guided demo
- Contact Us today