Fixed time tasks are often the problem, not the solution

One of the lessons that United Airlines learned in the 1960s was that failure rates that increase with age are unusual: their experience was that only about 11% of failure modes followed this age-related pattern. If these statistics are typical, then fixed-interval, time-based overhauls and repairs are the wrong solution for nearly nine out of every ten failure modes.

Here is the central problem. If failure rates do not increase with age, then traditional maintenance planning has no effective way of managing a large proportion of equipment failure modes. Designers, planners and maintainers had been misled for decades by the conviction that doing more maintenance, or doing it more often, was the solution to reliability issues.

Applying a hard-time maintenance policy to these failures doesn’t just waste time and effort: it takes away resources from maintenance that really needs to be done. In addition, because most of the maintenance is ineffective, it demotivates operators and maintainers; and invasive, ineffective maintenance can be a significant cause of breakdowns in its own right.

Our salvation is that fixed-time overhaul and replacement are not the only possible ways to maintain equipment function. RCM recognises a number of possible failure management policies:

  • Condition monitoring
  • Fixed time overhaul and replacement
  • Failure-finding
  • No scheduled maintenance
  • Redesign

What RCM provides is a logical method for deciding what maintenance policy is right for every failure mode.

Condition-based maintenance

Condition-based maintenance tasks look for signs of impending failure and schedule further action only if a developing failure (a potential failure) is detected. Here are a few examples of simple condition-based tasks:

  1. Monitor a bearing’s vibration and replace it only if it exceeds a set limit
  2. Monitor the composition of gearbox oil, and change it (and possibly the gearbox components) if the metal particle content exceeds an acceptable limit
  3. Check a valve gland for signs of leakage and tighten or replace it if necessary
  4. Use ultrasound to check a pressure vessel for cracks and repair or replace it if necessary

Condition-based tasks are versatile: an applicable task works for failure modes with any failure pattern. There is no need to know what the pattern of failure is, or to do controlled tests to find out a component’s life. As long as there is a definite, identifiable potential failure condition with a usable interval before failure, condition-based maintenance will work.
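
The two requirements above (an identifiable potential failure condition and a usable interval before failure) can be sketched in a few lines of Python. This is only an illustration: the function names, the vibration limit and the day counts are hypothetical, not values from any standard or from the RCM method itself.

```python
def check_is_frequent_enough(check_interval_days, warning_days):
    """Return True if checks recur within the usable interval before failure.

    warning_days is an assumed figure: the time between the point where
    deterioration becomes detectable and the point of failure.
    """
    # If checks are further apart than the warning period, a failure can
    # develop and complete between two consecutive checks.
    return check_interval_days < warning_days


def condition_based_action(reading, limit):
    """Act only when the reading shows a potential failure."""
    return "schedule replacement" if reading > limit else "no action"


# Hypothetical bearing: vibration limit of 7.1 (units assumed to be mm/s),
# about 30 days of warning, checked every 14 days.
print(check_is_frequent_enough(check_interval_days=14, warning_days=30))
print(condition_based_action(reading=4.2, limit=7.1))
print(condition_based_action(reading=9.0, limit=7.1))
```

Nothing in the sketch depends on the age of the bearing, which is why the same task applies whatever the pattern of failure.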

Time-based overhaul and replacement

These are sometimes called “hard time” tasks because they are carried out at fixed intervals, regardless of the equipment’s condition when maintenance is carried out. Hard time tasks can be justified if there is a sharp and predictable upturn in the chance of failure, but it still doesn’t mean that they are the best options. Condition-based maintenance can still be more effective than a hard time task for these reasons:

  1. If there is any chance of failure before the overhaul or replacement is due, some in-service failures will still occur under hard time maintenance, whereas a condition-based task should detect the initial deterioration and allow the failure to be prevented.
  2. The hard time task interval is determined by the point at which the chance of failure starts to increase, but it is rare for every failure to occur at the same time; in reality there is a wide spread of failure times. Because the interval has to be set near the minimum expected life, most parts are replaced while they are still functional and could have run for longer. In contrast, condition-based maintenance uses almost all of each part’s useful life, because a part is replaced only when it is known to be failing.
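
Both reasons can be illustrated with a small simulation. The sketch below rests on stated assumptions rather than real data: part lives are drawn from a normal distribution, the figures (mean life 1000 hours, spread 150 hours, hard time interval 700 hours, 50 hours of warning) are invented, and the condition-based task is idealised so that every developing failure is detected in time.

```python
import random


def compare_policies(n_parts=10_000, mean_life=1000.0, spread=150.0,
                     hard_time_interval=700.0, warning=50.0, seed=1):
    """Compare the life used per part under the two policies."""
    rng = random.Random(seed)
    hard_time_used = 0.0
    condition_used = 0.0
    in_service_failures = 0
    for _ in range(n_parts):
        # Assumption: lives are normally distributed (clipped at zero).
        life = max(rng.gauss(mean_life, spread), 0.0)

        # Hard time: replace at the fixed interval unless the part fails first.
        if life < hard_time_interval:
            in_service_failures += 1
            hard_time_used += life
        else:
            hard_time_used += hard_time_interval

        # Condition-based (idealised): the part is replaced `warning`
        # hours before the point at which it would have failed.
        condition_used += max(life - warning, 0.0)

    return {
        "hard_time_mean_life_used": hard_time_used / n_parts,
        "condition_based_mean_life_used": condition_used / n_parts,
        "hard_time_in_service_failures": in_service_failures,
    }


print(compare_policies())
```

Under these assumptions the hard time policy can never use more than 700 hours of any part and still suffers some in-service failures, while the condition-based policy gives up only the warning period of each part.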

Failure-finding

Failure-finding is a scheduled, regular check of a protective system such as an alarm or trip to find out whether it is working. This is different from condition-based maintenance, which replaces parts when they show signs of incipient failure. Failure-finding is used to detect failures that have already occurred, and therefore it is only applicable to one type of failure: that of protective devices and other systems that are not normally operational.

No scheduled maintenance

“No scheduled maintenance” is exactly what it says: do nothing at all to prevent failure, and clear up the mess if the failure occurs. In an RCM schedule, “no scheduled maintenance” doesn’t happen because someone forgot to include a maintenance task: it is a positive decision to do nothing.

Doing nothing is exactly the right response if the failure has no safety or environmental consequences and if there is no task that deals with the consequences of failure, or if the cost of doing the task is greater than the benefit of preventing the failure. RCM ensures that the decision to do nothing is based on hard evidence, and that the decision can be reviewed if circumstances change.

Redesign

Redesign may seem a strange thing to include in Reliability-centred Maintenance, but it is a key part of the risk management process because sometimes maintenance alone cannot deliver the required equipment performance and availability. The only alternative, No Scheduled Maintenance (doing nothing), is unacceptable if the failure has safety or environmental consequences, or if allowing it to happen would cause severe downtime.

In RCM terms, “Redesign” covers a wide range of one-off changes, including:

  • Changing the equipment, perhaps by adding a protective device or redundancy
  • Changing the way that the equipment is operated
  • Improving operator or maintainer training
  • Adding features that would make maintenance feasible or more economical

The advantage of using RCM is that it attempts to use maintenance to deliver the required reliability before resorting to redesign. Since maintenance is usually the more economical option, the resources needed for redesign are directed at the failures where no other option is available.
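
The way the five policies fit together can be sketched as a simple selection function. This is a deliberately simplified illustration of the rules described in this article, not the full RCM decision logic; the function, its parameters and the single cost comparison are all hypothetical.

```python
from enum import Enum


class Policy(Enum):
    CONDITION_BASED = "condition-based maintenance"
    HARD_TIME = "fixed-time overhaul or replacement"
    FAILURE_FINDING = "failure-finding"
    NO_SCHEDULED = "no scheduled maintenance"
    REDESIGN = "redesign"


def choose_policy(*, has_warning_condition, age_related, protective_device,
                  failure_unacceptable, task_cost, failure_cost):
    """Pick a policy for one failure mode (simplified sketch).

    failure_unacceptable stands for safety or environmental consequences,
    or severe downtime. The costs are assumed to cover the same period.
    """
    # Look for a maintenance task first: condition-based if there is a
    # usable warning, hard time if failure is age-related, failure-finding
    # if the item is a protective device.
    if has_warning_condition:
        task = Policy.CONDITION_BASED
    elif age_related:
        task = Policy.HARD_TIME
    elif protective_device:
        task = Policy.FAILURE_FINDING
    else:
        task = None

    if failure_unacceptable:
        # Doing nothing is not an option: maintain if possible, else redesign.
        return task if task is not None else Policy.REDESIGN

    # Otherwise a task is only worth doing if it costs less than the failure.
    if task is not None and task_cost < failure_cost:
        return task
    return Policy.NO_SCHEDULED


print(choose_policy(has_warning_condition=False, age_related=False,
                    protective_device=False, failure_unacceptable=True,
                    task_cost=0, failure_cost=0).value)
```

Redesign appears only in the branch where no maintenance task is available and the failure cannot be tolerated, which mirrors the point that maintenance is tried first.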