We often – narrow-mindedly – tend to restrict the AI use case discussion in maintenance to the more sophisticated maintenance strategies (Predictive (PdM) or Prescriptive (RxM)), with a focus on predicting equipment failure. Yet, there are other ways AI can benefit maintenance objectives even under more basic maintenance models (reactive and preventive).
While Part 1 of this blog series looked at how more sophisticated maintenance strategies powered by AI can create substantial value for asset-intensive businesses, in this second part I will review the constraints of the advanced maintenance strategies and highlight some pragmatic ways in which traditional maintenance models can still benefit from the use of AI (and digital tools).
In the third part of this blog series, I will review some common problem definitions and ML techniques used in advanced PdM, and give some tips on how to best get started with AI.
Managing Director, Prosperitree Consulting
June 2020
Recap of Advanced PdM
As shown in Part 1, predictive maintenance keeps the maintenance frequency as low as possible while still preventing unplanned reactive maintenance, without incurring the costs of doing too much preventive maintenance. It uses condition-based indicators and alerts to surface maintenance needs only when machines are at risk of breaking down, optimizing maintenance cadence and maximizing machine availability. It draws on data from various sources, such as historical maintenance records, sensor data from machines, and weather data, to determine when a machine will need to be serviced. By combining real-time asset data with historical data, operators can make more informed decisions about when a machine will need a repair.
Decisions on when to intervene and perform a maintenance activity based on the condition of the equipment can be rule-based or AI-powered, the latter typically denoted as advanced PdM. In the latter case, AI algorithms learn a machine’s normal data behaviour and use this as a baseline to identify and alert to deviations in real time. The system can then predict when a breakdown is likely to occur and can run automated root cause analysis to help improve operating and preventive maintenance activities for the future.
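To make the idea of learning a baseline and alerting on deviations more concrete, here is a minimal Python sketch. It assumes a single sensor signal and a plain mean and standard-deviation baseline, which is a deliberately simple stand-in for the more advanced algorithms an actual PdM system would use. The function names, the readings, and the 3-sigma threshold are illustrative assumptions.

```python
from statistics import mean, stdev

def learn_baseline(normal_readings):
    """Summarize normal behaviour as (mean, standard deviation)."""
    return mean(normal_readings), stdev(normal_readings)

def is_deviation(reading, baseline, z_threshold=3.0):
    """Flag a reading whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        return reading != mu
    return abs(reading - mu) / sigma > z_threshold

# Hypothetical vibration readings (mm/s) recorded while the machine was healthy.
healthy = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.1, 2.0]
baseline = learn_baseline(healthy)

# New readings arriving in real time; the last one drifts far from normal.
for value in [2.1, 1.9, 3.5]:
    if is_deviation(value, baseline):
        print(f"ALERT: reading {value} deviates from the learned baseline")
```

In practice the baseline would typically cover many signals and operating modes, but the alerting logic follows the same pattern: compare live data with what was learned as normal.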
Predictive maintenance often allows for the detection of impending failures that could never be detected by human eyes – take, for example, imaging that looks for microcracks in heavy machinery, even while in use. [1]
Advanced PdM is not for everyone
When evaluating whether advanced PdM is the right maintenance strategy for a particular asset class or business, there are four areas of consideration that need to be investigated beforehand:
- Business rationale
- Predictability and data availability
- Operational deployability
- Human mindset & behaviour
These factors may either limit the relevance of advanced predictive maintenance for a particular business or asset class, or call for specific design choices and transformation planning to make sure the benefits are realized.
1) Business rationale
Predictive maintenance is an investment: establishing the needed processes initially creates costs. Assets and components need to be equipped with the appropriate sensor technology. Especially in remote areas, this might be time-consuming to set up and require significant investments.
Data from various sources must be integrated and transformed so that it can be made available on a suitable platform. Dashboards and email or SMS warning systems must be put in place to coordinate the necessary maintenance efforts. The knowledge of process experts and data scientists is needed to build and maintain a functioning predictive model. Also, personnel need to be trained to handle the information inflow and interpret alerts correctly. [2]
The approach has the most potential for assets with well-documented failure modes and a high associated downtime impact, for example a critical machine on a larger production line. It also works well when it can be applied at scale to a large fleet of identical assets where there is sufficient reliability history and where the development and management costs of PdM can be spread across the fleet, as in offshore wind farms or fleets of locomotives. [3]
Where a machine is prone to a narrow range of well-understood failure modes, it is often possible to address a potential problem in a simpler way, for example by monitoring the temperature or vibration of a component against a set threshold, or by consistently and rigorously applying data-driven reliability analysis techniques to address the root causes of failure modes.
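The simpler, rule-based alternative described above can be sketched in a few lines of Python. The signal names and alarm limits below are hypothetical; real limits would come from the manufacturer or from reliability engineers.

```python
# Hypothetical alarm limits for one monitored component.
LIMITS = {"temperature_c": 85.0, "vibration_mm_s": 7.1}

def breached_limits(reading, limits=LIMITS):
    """Return the names of all monitored signals that exceed their limit."""
    return [name for name, limit in limits.items()
            if reading.get(name, float("-inf")) > limit]

# One hypothetical reading from the component.
reading = {"temperature_c": 91.5, "vibration_mm_s": 4.2}
breaches = breached_limits(reading)
if breaches:
    print("Inspect component, limits exceeded for:", ", ".join(breaches))
```

No model training is involved here, which is exactly why this approach is often sufficient for assets with a narrow range of well-understood failure modes.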
The impact of PdM may often turn out to be low because plants operate critical assets with a high degree of redundancy and few single points of failure. If a pump stops unexpectedly, operators can often switch to a backup unit with little impact on production. [4]
2) Predictability and data availability
The problem has to be predictive in nature; that is, there should be a target or an outcome to predict. The problem should also have a clear path of action to prevent failures when they are detected. Where a machine can suffer hundreds or thousands of different kinds of failures (some of them very rare), it can be impractical to create sufficient models of high-enough quality to adequately predict them all.
The problem should have a record of the operational history of the equipment that contains both good and bad outcomes. The set of actions taken to mitigate bad outcomes should also be available as part of these records. Error reports, logs of performance degradation, and repair and replacement records are also important. [5]
Without failure data, unsupervised machine learning techniques can be used to identify normal and faulty behaviour. For example, data could be collected from several sensors on an aircraft engine. A dimensionality reduction technique such as principal component analysis (PCA) could then be used to reduce the sensor data into a low-dimensional representation for visualization and analysis. In this representation, healthy equipment data may be centered around a normal operating point, while unhealthy equipment may be seen as moving away from normal conditions.
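As a rough illustration of the PCA idea, the sketch below fits the first principal component on healthy data only and then measures how far a new reading sits from the normal operating point in that reduced representation. It uses only the Python standard library and a simple power iteration; a production pipeline would more likely rely on a numerical library and would standardize the sensor scales first. The three sensors, all readings, and the helper names are hypothetical.

```python
import math

def fit_first_component(rows, iterations=100):
    """Return (means, unit vector) for the direction of largest variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Sample covariance matrix of the healthy data (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the dominant eigenvector of the covariance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def project(row, means, component):
    """Coordinate of one reading along the principal component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Hypothetical healthy readings: [temperature, vibration, pressure].
healthy = [
    [70.0, 2.00, 5.00], [71.0, 2.10, 5.10], [69.0, 1.90, 4.90],
    [70.5, 2.05, 5.05], [69.5, 1.95, 4.95], [72.0, 2.20, 5.20],
    [68.0, 1.80, 4.80],
]
means, component = fit_first_component(healthy)
healthy_coords = [project(r, means, component) for r in healthy]
# Spread of healthy data along the component (coordinates are already centered).
spread = math.sqrt(sum(c * c for c in healthy_coords) / (len(healthy) - 1))

def distance_from_normal(row):
    """Distance from the normal operating point, in units of healthy spread."""
    return abs(project(row, means, component)) / spread

suspect = [80.0, 3.5, 5.0]
print(f"suspect reading: {distance_from_normal(suspect):.1f} spreads from normal")
```

A single component is kept here for brevity. Keeping two or three components would allow the kind of visual inspection described above, and tracking the reconstruction error would additionally catch deviations that lie outside the retained components.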
Also, the recorded history should be reflected in relevant data of sufficient quality to support the use case.
Some real-time data from IoT sensors is the bare minimum and an essential component of true predictive maintenance because, unlike the other types of data listed below, it directly monitors asset conditions and allows for up-to-the-minute predictions. But data from IoT devices on the assets themselves is not the only source. The advantage of predictive maintenance is the ability to combine data from a large variety of sources for the most accurate predictions.
If a sensor-based condition-monitoring maintenance strategy is not feasible, external monitoring means such as drones, thermographic cameras, smart pipeline inspection gauges, or measuring trains could help improve maintenance and inspection frequency. These devices are usually not fixed to the assets and thus provide information on asset conditions and/or failure type “from the outside,” e.g., through cameras or infrared sensors. [6]
Remote external monitoring may be used to replace or complement manual asset inspection. Additionally, receiving information on asset conditions might reduce unnecessary trips, e.g. due to missing tools or spare parts, further improving labor productivity thanks to reduced inspection time.
Essentially, any source of data on an asset can augment IoT data and be used to build and test a predictive maintenance algorithm. From here, you can determine which data sources are the best indicators of failure, wear, or breakdown and add or remove features to refine the final model. [7] Examples of such additional data sources include:
- Data from programmable controllers
- Geographical data
- Manual data from human inspection
- Manufacturing execution systems
- Equipment usage history data
- Static data, like manufacturer service recommendations for each asset
- Building management systems
- Parts composition
- External data from APIs, like weather
Establishing a robust data backbone, however, is a fundamental enabler for advanced maintenance. Most organizations already have systems in place to record maintenance- and reliability-related data, but the effectiveness of such systems can be undermined by poor housekeeping. The same assets or issues may be described in different ways in different systems, for example, making integration difficult. Companies may use free-text fields to record issues or maintenance actions, making automated search or data analysis harder. Or critical data may be inaccessible, locked away in spreadsheets or on paper notes.
Fixing these challenges often depends not on investment in new technology but on the adoption of more rigorous standards for asset identification and data recording. Artificial-intelligence techniques, such as natural-language processing, can help organizations transform poorly organized historical data into a form more suitable for automated analysis. [3]
3) Operational deployability
Even when it is possible to create models with predictive power, they often work over time horizons that are too short to be useful in the specific asset operation environment. Predicting that a part will fail in two days or two weeks is useful in a truck or machine tool, but it may not help in a plant where shutdowns take several days and maintenance teams require months to plan interventions and source spare parts. [6]
Another angle is the integration of model outputs into day-to-day decision-making and activity routing, also known as deployment. One of the biggest stumbling blocks to predictive maintenance is making data flow smoothly from machines to ERP or CMMS systems while maintaining a high level of security and reliability and a low level of latency. [8]
The use of descriptive analytics and data visualizations to provide a real-time view of asset health and reliability performance is equally important. Digital performance management automates the generation and presentation of the key metrics and qualitative information that companies use in their reliability programs, such as overall equipment effectiveness (OEE) data or loss reasons. This kind of automation is a surprisingly powerful improvement lever, freeing maintenance staff from the time-consuming and error-prone process of data collection and analysis. And it supports rapid trend identification, fact-based decision-making, and timely intervention, as well as changes in equipment investment, processes, and policies. [3]
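As an example of a metric that such digital performance management can compute automatically, the sketch below uses the commonly used definition of OEE as the product of availability, performance, and quality. The shift figures are hypothetical, and individual companies may define the underlying time and count categories differently.

```python
def oee(planned_min, downtime_min, ideal_cycle_min, total_units, good_units):
    """Return (availability, performance, quality, OEE) for one period."""
    run_min = planned_min - downtime_min
    availability = run_min / planned_min                     # share of planned time running
    performance = (ideal_cycle_min * total_units) / run_min  # actual vs. ideal speed
    quality = good_units / total_units                       # share of good output
    return availability, performance, quality, availability * performance * quality

# Hypothetical shift: 480 planned minutes, 60 minutes of downtime,
# an ideal cycle time of 1 minute per unit, 378 units made, 360 of them good.
a, p, q, result = oee(480, 60, 1.0, 378, 360)
print(f"availability={a:.1%} performance={p:.1%} quality={q:.1%} OEE={result:.1%}")
```

Breaking OEE into its three factors is what makes loss reasons visible: the same overall figure can hide very different availability, speed, or quality problems.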
The advantage of predictive maintenance over preventive maintenance is tempered in operating environments where replacing or repairing a piece of equipment requires the shutdown of a substantial part of the operations, leading to significant business opportunity cost. Alternatively, some repair activities may incur high fixed costs (e.g. the safety preparation for repairing a pothole on a road costs 5-10x more than the repair itself). In such circumstances, even if failures are predicted at the equipment level, the maintenance strategy will require system-wide maintenance scheduling, resulting in a rather preventive (scheduled maintenance) approach at the system level.
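A short worked example shows why high fixed costs push towards grouped, scheduled interventions. The cost figures are hypothetical; the only input taken from the text is that the fixed preparation cost is a multiple (here 5x) of the repair cost itself.

```python
def total_cost(n_repairs, repair_cost, fixed_cost, grouped):
    """Total cost when the fixed setup is paid once (grouped) or per repair."""
    setups = min(1, n_repairs) if grouped else n_repairs
    return setups * fixed_cost + n_repairs * repair_cost

# Hypothetical figures: each repair costs 1,000 and each safety setup 5,000.
individual = total_cost(4, 1_000, 5_000, grouped=False)
bundled = total_cost(4, 1_000, 5_000, grouped=True)
print(f"four separate interventions: {individual}, one grouped intervention: {bundled}")
```

With these assumptions, four separate interventions cost 24,000 while one grouped intervention costs 9,000, so waiting for a scheduled window can be the cheaper option even when individual failures are predicted accurately.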
4) Human mindset & behaviour
Even with all the business and technological prerequisites in place, successfully setting up an advanced predictive maintenance strategy is an organizational challenge.
New data analytics, management, and interpretation skills are needed within the entire maintenance organization: every maintenance leader and worker needs to be able to interpret the results in order to adjust their way of working accordingly. New skills related to evaluating measurements from external monitoring devices are also required in the organization.
Additionally, maintenance strategies and manual inspection schedules have to be adapted according to the monitoring results. [3]
Maintenance technicians have been following schedule-based maintenance for more than 30 years. So, there is a routine. And, it is very uncomfortable for them to switch to a scheme that is not predictable as to when the work will be done. They need to develop a comfort level that they can trust the data and that the equipment will not break down early.
Technicians need training as they start to use instruments and diagnostic tools to determine when equipment is not functioning as expected. That requires a higher level of overall knowledge and tools. “When you move into predictive, you are incorporating other features into the system with a level of complexity significantly above that of wrenches and screwdrivers. They need to have good computer skills and proficiency with a range of instruments. Getting there is a big challenge.” [9]
The Business and Operations sides of the organization should also have domain experts who have a clear understanding of the problem. They should be aware of the internal processes and practices to be able to help the analyst understand and interpret the data. They should also be able to make the necessary changes to existing business processes to help collect the right data for the problems, if needed.
AI beyond failure prediction
We often – narrow-mindedly – tend to restrict the AI use case to predicting equipment failure. Yet AI, and more specifically machine learning (ML), can be used in a variety of other application areas within maintenance. These additional AI use cases either complement the predictive maintenance strategy or can be used on their own, independent of the type of maintenance strategy applied. Hence, even businesses with a substantial, high-impact maintenance function that cannot justify adopting an advanced predictive maintenance strategy should consider introducing AI capabilities.
Here are a few examples of additional use cases for AI in Maintenance:
1. Maintenance strategy optimization
Building on a large data set of historic asset performance and operational modes, maintenance activities, and the consequent business and operational impact (time, cost, quality, value), ML models can be used to identify the combination of maintenance strategies for different assets and operation modes that optimizes reliability, operational schedule, and maintenance costs, potentially even on an asset-by-asset basis.
2. Maintenance budget and resource planning
Better data means better investment decisions, especially when it comes to allocating sustaining capex, or avoiding equipment failure by making the right, risk-informed capex decisions. Most asset-intensive companies struggle to set the right level of sustaining capex, as they find it difficult to allocate funds across multiple plants and disparate asset types. Better data enables a fact-based, data-driven discussion about risk and trade-offs, which can lead to spending less overall and to managing what is spent more wisely. ML techniques can help to forecast maintenance demand and the corresponding costs and resource requirements.
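As a minimal illustration of demand forecasting, the sketch below fits a straight-line trend to monthly work-order counts with ordinary least squares and extrapolates it. A real forecast would usually account for seasonality, asset age, and planned shutdowns; the history and the average cost per work order are hypothetical.

```python
def fit_trend(history):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def forecast(history, periods_ahead):
    """Extrapolate the fitted trend for the next periods."""
    slope, intercept = fit_trend(history)
    n = len(history)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]

# Hypothetical work orders per month and average cost per work order.
work_orders = [120, 126, 131, 138, 142, 150]
avg_cost_per_order = 800

next_months = forecast(work_orders, 3)
for month, demand in enumerate(next_months, start=1):
    print(f"month +{month}: ~{demand:.0f} work orders, "
          f"budget ~{demand * avg_cost_per_order:,.0f}")
```

The same forecast can be translated into technician hours or spare-part volumes by swapping the cost factor for the relevant resource rate.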
3. Work-flow automation
Workflow automation paves the way for artificial intelligence (AI) and self-maintenance by optimizing (and automating) the immediate next steps once predictive systems or reactive environments point to imminent failure, whether that means automatically triggering a work order, notifying a technician or a certain team, or placing an order for a replacement part.
For example, a machine could sense that a drill bit is wearing out, or an equipment failure could trigger a signal to the system about a pump leakage. In either case, the system can be set up to identify the incident based on a history of similar signals and can automatically order a new spare part, alert the technical service department to send a field service representative, and forward the purchase request for the new part to the ERP system, all without any human interaction or human-set rules. By automating manual, error-prone, labor-intensive administrative functions, manufacturers can gain an additional level of efficiency.
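The flow described above can be sketched as follows. The incident is identified from a small history of similar signals, and the follow-up actions are written to in-memory lists that stand in for a CMMS, a notification service, and an ERP system. All signal codes, part names, and class names are hypothetical, and a frequency count is used here as a very simple substitute for a trained model.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical history of (signal code, diagnosed incident, spare part used).
HISTORY = [
    ("VIB_HIGH", "drill bit wear", "drill-bit-8mm"),
    ("VIB_HIGH", "drill bit wear", "drill-bit-8mm"),
    ("VIB_HIGH", "loose mounting", None),
    ("PRESSURE_DROP", "pump leakage", "pump-seal-kit"),
]

@dataclass
class Outbox:
    """Stand-in for the downstream systems that receive the automated actions."""
    work_orders: list = field(default_factory=list)
    notifications: list = field(default_factory=list)
    purchase_requests: list = field(default_factory=list)

def most_likely_incident(signal, history=HISTORY):
    """Pick the (incident, part) pair seen most often for this signal."""
    matches = Counter((inc, part) for sig, inc, part in history if sig == signal)
    return matches.most_common(1)[0][0] if matches else None

def handle_signal(asset_id, signal, outbox):
    """Trigger work order, notification, and purchase request for a signal."""
    diagnosis = most_likely_incident(signal)
    if diagnosis is None:
        outbox.notifications.append(f"{asset_id}: unknown signal {signal}, manual review")
        return
    incident, part = diagnosis
    outbox.work_orders.append({"asset": asset_id, "incident": incident})
    outbox.notifications.append(f"{asset_id}: {incident}, field service requested")
    if part is not None:
        outbox.purchase_requests.append({"asset": asset_id, "part": part})

outbox = Outbox()
handle_signal("drill-07", "VIB_HIGH", outbox)
handle_signal("pump-02", "PRESSURE_DROP", outbox)
print(outbox.work_orders)
print(outbox.purchase_requests)
```

Unknown signals fall back to a manual review notification, which keeps a human in the loop for cases the history does not cover.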
4. Activity optimization (asset-based)
AI can be used to identify how to best execute necessary repairs. This means having a process in place to determine the best time to actually remove the asset from service and which additional repairs – if any – should be conducted simultaneously to minimize the cost of having to remove the asset again for a different failure within a short window.
One way of teaching the machine to recommend the best repair activities is similar to how recommendation algorithms suggest on-demand movies to viewers or how CRM systems recommend the best next action to sales agents in other industry verticals. Having ML algorithms learn patterns of successful past repair activities from the maintenance service database will result in an AI application that can recommend optimal reactive actions to maintenance experts and dispatchers under different prioritization settings.
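A very small version of such a recommender can be sketched with a similarity-weighted vote over past cases. The sketch assumes each historical record holds a set of observed symptoms, the repair action taken, and whether it was successful; the records, symptom names, and the choice of Jaccard similarity are illustrative assumptions rather than a prescribed method.

```python
from collections import defaultdict

# Hypothetical records: (observed symptoms, repair action, was it successful?)
PAST_REPAIRS = [
    ({"overheating", "noise"}, "replace bearing", True),
    ({"overheating", "noise", "vibration"}, "replace bearing", True),
    ({"overheating"}, "clean cooling fins", True),
    ({"noise", "vibration"}, "realign shaft", True),
    ({"overheating", "noise"}, "realign shaft", False),
]

def jaccard(a, b):
    """Overlap of two symptom sets, between 0 (disjoint) and 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(symptoms, records=PAST_REPAIRS, top_n=2):
    """Rank actions by how similar their successful past cases are."""
    scores = defaultdict(float)
    for past_symptoms, action, success in records:
        if success:
            scores[action] += jaccard(symptoms, past_symptoms)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [action for action, score in ranked[:top_n] if score > 0]

print(recommend({"overheating", "noise"}))
```

Different prioritization settings could be reflected by weighting each past case by cost, repair duration, or time to the next failure instead of counting all successful cases equally.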
Ultimately, the goal is to determine a plan of action for exactly when the asset should be taken out of service so as to minimize disruption and loss (both imminent and future) and maximize resources. [7]
Outputs of an ML model may include:
- Estimated time out of service for assets with similar maintenance issues;
- The likelihood that the asset’s part fails before or after N days;
- Other maintenance tasks that make financial sense to tackle simultaneously while the asset is already out of service;
- Optimal logistics for completing the maintenance.
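One of the outputs listed above, the likelihood that a part fails within the next N days, can be approximated directly from historical lifetimes. The sketch below is a simple empirical estimate, assuming complete (uncensored) lifetime records for comparable parts; the lifetimes are hypothetical, and a fuller treatment would use survival analysis.

```python
def prob_fail_within(lifetimes_days, age_days, horizon_days):
    """Empirical P(failure within horizon | part has survived to age)."""
    survivors = [t for t in lifetimes_days if t > age_days]
    if not survivors:
        return None  # no comparable part lasted this long; no estimate possible
    failed_in_window = [t for t in survivors if t <= age_days + horizon_days]
    return len(failed_in_window) / len(survivors)

# Hypothetical lifetimes (days in service until failure) of comparable parts.
lifetimes = [200, 340, 365, 410, 450, 480, 520, 600, 610, 700]

p = prob_fail_within(lifetimes, age_days=400, horizon_days=100)
print(f"estimated probability of failure in the next 100 days: {p:.0%}")
```

An estimate like this can then be weighed against the cost of taking the asset out of service now versus later.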
5. Automated and dynamic planning (resource-based)
Automated and dynamic planning, scheduling, dispatching, and routing is a means to minimize extra, unexpected trips. The idea is to use smart algorithms to provide each maintenance worker with only one job at a time, chosen according to their location, their skills, the urgency of the job, the time required to finish it, and so on. On top of optimizing maintenance workers’ routes this way, the amount of manual planning, scheduling, and dispatching is also reduced. Ideally, the corresponding algorithms are even integrated into the organization’s maintenance workflow support system. [3]
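A greedy version of this dispatching logic is sketched below: jobs are handled in order of urgency, and each job goes to the nearest free technician who has the required skill, so that every worker holds only one job at a time. The names, coordinates, and skills are hypothetical, and real schedulers would typically use road travel times and an optimization solver rather than straight-line distance and a greedy rule.

```python
import math
from dataclasses import dataclass

@dataclass
class Technician:
    name: str
    skills: set
    location: tuple  # (x, y) in km, hypothetical planar coordinates
    busy: bool = False

@dataclass
class Job:
    job_id: str
    required_skill: str
    location: tuple
    urgency: int  # higher means more urgent

def dispatch(jobs, technicians):
    """Assign each job, most urgent first, to the nearest free qualified technician."""
    assignments = {}
    for job in sorted(jobs, key=lambda j: j.urgency, reverse=True):
        candidates = [t for t in technicians
                      if not t.busy and job.required_skill in t.skills]
        if not candidates:
            continue  # job stays in the queue until someone becomes free
        best = min(candidates, key=lambda t: math.dist(t.location, job.location))
        best.busy = True  # one job at a time per technician
        assignments[job.job_id] = best.name
    return assignments

technicians = [
    Technician("Ana", {"electrical", "hydraulic"}, (0.0, 0.0)),
    Technician("Ben", {"electrical"}, (10.0, 0.0)),
]
jobs = [
    Job("J1", "electrical", (9.0, 1.0), urgency=2),
    Job("J2", "hydraulic", (1.0, 1.0), urgency=5),
    Job("J3", "electrical", (0.0, 1.0), urgency=1),
]
assignments = dispatch(jobs, technicians)
print(assignments)
```

Re-running the dispatch whenever a job is completed or a new one arrives is what makes the planning dynamic rather than fixed at the start of the day.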
6. Root cause analysis
Better data also means better root-cause problem-solving. That helps companies to prevent the recurrence of failures, to improve their failure-modes-and-effects analysis (FMEA) processes, and to optimize preventive maintenance plans. ML models learn statistical relationships between hundreds of feature variables and the target. With suitable interpretation techniques, these models can indicate which factors contributed most to the prediction of a specific equipment failure, and can also give a probabilistic view of how those factors may contribute to a failure in the future. Such associations are a starting point for root cause analysis rather than proof of causation.
7. Failure detection
For geographically spread-out asset pools (e.g. power distribution networks, road infrastructure) where installing sensors is not possible or economically viable, failures (e.g. loose power lines, potholes) are often detected by humans, either directly during regular field visits or by reviewing large volumes of video footage (from cameras or drones). With the use of visual AI, where ML models have been trained to identify failures from images, these detection tasks can be automated, leading to substantial cost savings, faster and more frequent assessment cycles, and often better accuracy.
In essence, any business with substantial maintenance activity and maintenance opex baseline should consider the use of artificial intelligence in their respective operations. Identifying the right AI use cases, designing the AI journey and setting the organization up for success in adopting AI is not that simple. Engaging experts in the field of Business AI may be your best short-term action.
[1] https://www.fiixsoftware.com/maintenance-strategies/predictive-maintenance/
[2] https://www2.deloitte.com/content/dam/Deloitte/de/Documents/deloitte-analytics/Deloitte_Predictive-Maintenance_PositionPaper.pdf
[3] https://www.mckinsey.com/business-functions/operations/our-insights/digitally-enabled-reliability-beyond-predictive-maintenance
[4] https://www.mckinsey.com/business-functions/operations/our-insights/predictive-maintenance-the-wrong-solution-to-the-right-problem-in-chemicals
[5] https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook
[6] https://www.mckinsey.com/business-functions/operations/our-insights/the-future-of-maintenance-for-distributed-fixed-assets
[7] https://content.dataiku.com/predictive-maintenance?__hstc=186155446.01bd737fd2c249756e22265a06c59784.1591921913595.1591921913595.1591921913595.1&__hssc=186155446.2.1591921913597&submissionGuid=5947c205-9727-4f55-929d-436cef858bf1
[8] https://www.infoq.com/articles/predictive-maintenance-industrial-iot/
[9] https://www.mmh.com/article/mro_the_challenges_of_moving_from_preventative_to_predictive_maintenance
About the Author
Dr. Adam Flesch is the Managing Director of Prosperitree Consulting, and a former McKinsey Jr. Partner. He has been in the management consulting space for more than 15 years serving Clients across a wide range of industries on strategy, risk management, and operation effectiveness topics. He is a strong advocate for the wider use of AI in business, from decision-making to front-line automation. He advised an integrated international Oil and Gas company on designing and implementing a lean transformation program across its refinery operations, including the entire maintenance function.
About Prosperitree
Prosperitree Consulting is a boutique strategy consultancy focusing on AI-centered business solutions and traditional management consulting. It helps businesses build future-proof strategies and establish smarter day-to-day decision-making routines while turning their data assets into actionable insights and enhancing their respective capabilities.