About RCM

Imagine a world in which you know exactly how long every part will last before it fails. When you buy an item of equipment, it arrives with a list of possible failures that tells you how long it will be before each of them happens. Armed with this information, your life as a maintenance planner is a joy: you order the right parts in good time and schedule a maintenance crew to replace each part before it fails.

You sleep well every night and go to work in the morning knowing that there will be no surprises. Your day will go just as you planned it. Every day.

That probably doesn’t sound much like your life, and it’s definitely not like mine. But during the first half of the twentieth century the core principle for constructing maintenance schedules was this: find out the item’s life, and make sure that you overhaul or replace it before it reaches that age. Simple, regular, time-based maintenance would prevent failures and deliver high availability.

When more complex equipment came into use, the inadequacies of this model became obvious: although some failures did occur after a predictable life, others could happen at any time. A maintenance planner’s life was not a peaceful one.

In the late 1950s, civil aviation realised that the limitations of the age-related failure model were a serious threat to the industry. Real-life studies, such as the one on the Wright Aero R-3350 TC18 engine, had shown that more maintenance, or shorter maintenance intervals, did not always improve equipment reliability. In fact, more maintenance often resulted in more failures and lower availability. As the 1961 FAA and Industry Reliability Program reported:

“In the past, a great deal of emphasis has been placed on the control of overhaul periods to provide a satisfactory level of reliability. After careful study, the Committee is convinced that reliability and overhaul time control are not necessarily directly associated topics.”

Equipment maintenance was in crisis. If more intervention was not the answer, then what was?

By the mid-1960s there was a general recognition that different management policies were needed for different failures, and decision diagram methods were developed to select the right task for each failure mode. With the advent of the large and complex Boeing 747 aircraft, the requirement for a robust technique for generating maintenance schedules became critical. A group of airline representatives on the 747 Maintenance Steering Group developed a document called MSG-1: this represented the first attempt to apply RCM concepts.

The Boeing 747 maintenance programme was the first to be developed under this new reliability-centred approach. The results were impressive. For example, United Airlines used 66,000 man-hours on 747 structural inspections before reaching the 20,000-operating-hour main inspection interval, while the simpler Douglas DC-8 aircraft had needed over 4 million man-hours to reach the same point. The traditional DC-8 maintenance programme specified scheduled overhaul for 339 items, while the MSG-based DC-10 programme included only seven. The elimination of wasteful time-based maintenance, along with the reduction of unnecessary inventory, had obvious economic benefits. But if costs were reduced, what happened to operational reliability? Perhaps surprisingly, it increased. Far less was spent on maintenance, yet reliability improved because the right maintenance was being done.

The principles learned from the airline industry through MSG-1 and the updated MSG-2 and MSG-3 were brought together by Stan Nowlan and Howard Heap in the book Reliability-Centered Maintenance, published in 1978. This is the foundation stone of RCM.