The ultimate goal of anyone involved in industrial maintenance should be to achieve the optimal, full service life of our machine assets. To do this, we need to change our current maintenance processes, or at least change the way many plants work.
Many companies have some form of condition-based maintenance program, yet they are confused as to why machine failures still occur. There’s nothing wrong with performing condition-based maintenance, but on its own it doesn’t stop your machines from breaking down. First, let me explain why condition monitoring works.
The premise behind condition-based maintenance is that most failures provide some warning of the fact that they are about to occur.
[Figure: the P-F interval]
This warning is called a potential failure, which is defined as an identifiable physical condition that indicates that a functional failure is imminent or is already in the process of occurring.
Functional failure is defined as the failure of an item to meet specified performance standards.
There are many different techniques for measuring and detecting potential failures. You can choose the technique that works best for you and your machine. For example, if you have a slow-turning gearbox, you could use oil analysis. Common techniques for detecting potential failures are vibration, ultrasound, oil analysis and temperature measurement, but there are many more.
The earlier a potential failure is detected, the longer the P-F interval. A longer P-F interval means inspections can be less frequent and, more importantly, more time is available to take any necessary action to avoid the consequences of a failure.
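The arithmetic behind this can be sketched in a few lines of Python. The function name and the numbers are illustrative assumptions, not figures from any particular plant. The sketch assumes inspections at a fixed interval: in the worst case a potential failure becomes detectable just after one inspection and is found at the next, so the time left to act is the P-F interval minus the inspection interval.

```python
def worst_case_time_to_act(pf_interval_days: float,
                           inspection_interval_days: float) -> float:
    """Return the minimum warning time (in days) left after detection.

    Assumes fixed-interval inspections and that the chosen technique
    detects the potential failure at the first inspection after point P.
    """
    if pf_interval_days <= 0 or inspection_interval_days <= 0:
        raise ValueError("intervals must be positive")
    # If inspections are spaced wider than the P-F interval, the failure
    # can occur between two inspections, so no warning time is guaranteed.
    return max(float(pf_interval_days - inspection_interval_days), 0.0)


# Illustrative numbers only (assumptions, not measured data)
early_detection = worst_case_time_to_act(pf_interval_days=90,
                                         inspection_interval_days=30)
late_detection = worst_case_time_to_act(pf_interval_days=14,
                                        inspection_interval_days=30)
print(early_detection)  # 60.0
print(late_detection)   # 0.0
```

With the longer P-F interval there are at least 60 days to plan the repair. With the shorter one, monthly inspections give no guaranteed warning at all.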
Is this type of condition-based maintenance, or condition monitoring, effective?
Yes, because it lets you avoid unplanned downtime and save money.
Failure comes at us in many ways, and we have many ways to combat it. If you detect a potential failure early enough, you can avoid the functional failure. You can schedule downtime for the repair or maintenance. That is not a breakdown: the machine did not stop on its own, and the shutdown is not unplanned. This is cost avoidance, and the plant avoids the production losses that unplanned downtime would cause. You avoid unplanned downtime, control the downtime you do take and schedule the maintenance work. This is a victory.
Think about secondary damage. Say a seal in a gearbox is failing. Replacing it early is the least expensive repair. If you don’t catch it and the bearings become contaminated, the gearbox has to be overhauled. And if a bearing seizes on the shaft, now you have to replace the shaft, maybe more.
The cost of secondary damage can be huge, so condition monitoring does work, and done well, it saves you a lot of time and money. However, there is a problem with condition monitoring, and the same goes for predictive maintenance. We still have machine breakdowns.
Root cause analysis and defect elimination are a must
The definition of insanity is doing the same thing over and over again and expecting different results. Are we crazy if we just keep replacing bearings without finding the cause of the failure?
Are we fixing the effect without finding the cause? Fixing only the fault or its effect is reactive maintenance. A condition-based maintenance program, or any maintenance program, requires a defect elimination process. This is typically done through root cause analysis, which is the process of defining, understanding and solving a problem.
This fishbone diagram (see image below) is a basic tool for root cause analysis. In our day, it was called failure analysis. We know that the “effect” is machine downtime, but what is the real “cause” of the failure?
The process is to build a cross-functional team so we can brainstorm the reasons for the failure. This is a good idea, but you have to make sure the team includes someone with direct knowledge of the process being investigated, not people who merely represent a department. We then follow a step-by-step process to find the real cause of the failure.
But this is just one tool we use. We also use my favorite, the “5 Whys” method. It’s simply about asking “why” enough times until you get to the root cause of the problem. Of course, you don’t have to limit yourself to just five questions; you can ask as many as you need.
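The 5 Whys chain can be written down as a simple structure, sketched here in Python. The function name and the example answers are hypothetical and chosen only to illustrate the method; a real investigation would record the answers the team actually establishes.

```python
def five_whys(problem, answers):
    """Pair each answer with the 'why' question it responds to.

    Returns (chain, root_cause). The root cause is simply the last
    answer given, so the chain is only as good as the answers in it.
    """
    if not answers:
        raise ValueError("at least one answer is needed")
    chain = []
    current = problem
    for answer in answers:
        chain.append((f"Why: {current}?", answer))
        current = answer  # the next question asks why this answer happened
    return chain, current


# Hypothetical chain, for illustration only
chain, root_cause = five_whys(
    "the bearing failed",
    [
        "the lubricant was contaminated",
        "the seal was leaking",
        "the seal was damaged during installation",
        "no installation procedure was followed",
    ],
)
for question, answer in chain:
    print(question, "->", answer)
print("Root cause:", root_cause)
```

Note that this chain stops after four questions. The number five is a guide, not a rule.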
These are just two of the tools available. There are other methods such as Failure Mode and Effects Analysis (FMEA). No matter what you use, the key is that you need to eliminate defects as part of your maintenance process.
Eliminating defects eliminates the cause, which will make your machine assets last longer. The idea is to “fix it forever, not forever fix it.” So when something breaks, you want to make sure it doesn’t happen again, so that over time you reduce the number of failures and increase uptime.
After the defect is eliminated, whether through overhaul, repair or redesign, the machine needs to be reinstalled. To do this, precision maintenance skills and techniques are required.
Precision maintenance
Precision maintenance is simple: it means working to an accepted standard, a tolerance level that you and your team agree on. The tighter the tolerance, the better the result. But you can’t have a tolerance that you cannot measure.
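Working to an agreed tolerance comes down to a measurable pass/fail check, as in the following minimal Python sketch. The tolerance value, the reading names and the readings themselves are hypothetical; your own standard would supply the real figures.

```python
def within_tolerance(measured: float, target: float, tolerance: float) -> bool:
    """True if the measured value is within +/- tolerance of the target."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(measured - target) <= tolerance


# Hypothetical alignment offsets in millimetres (assumed values)
readings_mm = {"vertical offset": 0.03, "horizontal offset": 0.08}
agreed_tolerance_mm = 0.05  # assumed team standard, not an OEM figure

results = {
    name: within_tolerance(value, target=0.0, tolerance=agreed_tolerance_mm)
    for name, value in readings_mm.items()
}
print(results)  # {'vertical offset': True, 'horizontal offset': False}
```

The check only works because every quantity in it can be measured, which is the point of agreeing on a tolerance in the first place.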
Precision maintenance means “upskilling.” It’s not just about having the right tools, but also the right training. It covers mechanical acceptance criteria, precision balancing, alignment, base flatness criteria, machine stress removal, etc.
Controlling Factors in Machine Life
- Design
The design of the machine will have an impact on the life of the machine. However, in maintenance we often have to accept the design as it is given. Take a pump that is under-designed for the application: that pump begins service in a failed state, because it cannot meet the requirements. So obviously, the design has to be right, or a redesign becomes inevitable. In any failure analysis, the machine design must be reviewed.
- Overhaul
Machines undergo multiple overhauls throughout their life cycle. It is extremely important to do this correctly. Many companies outsource this work because they don’t have the facilities, and one of the biggest problems during an overhaul is contamination. When a machine undergoes an overhaul, the most important aspect is to maintain the OEM specifications for machine fits. The goal is to make it like new again.
- Installation
Installation is key. This is the most critical thing about any machine. A well-designed machine or a well-maintained machine can be ruined by poor installation.
- Commissioning
Commissioning is really a continuation of installation. In fact, you should start by checking the installation documentation. I think it should be done by another group, not the reliability group. Every machine is different, so we cannot publish a single checklist, but all OEM operating procedures should be followed. This is also where you should measure thermal expansion once the machine has been started, so we know whether any corrections are needed before putting the machine into service.
While the machine is online, different parameters such as temperature, sound and vibration should be measured as part of your condition-based maintenance plan. These measurements are the baseline against which you will compare new measurements taken throughout the life of the machine. Changes in these results mean the machine is degrading. However, if you understand root causes well and use precision maintenance techniques in the areas you can control, that degradation should come only from normal wear, after the machine has had a good long life.
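Comparing new measurements against the commissioning baseline can be sketched in Python as follows. The parameter names, the readings and the 25 percent alert threshold are assumptions made for illustration; real alert levels should come from your own standards or the OEM.

```python
def percent_change(baseline: float, current: float) -> float:
    """Change from baseline, as a percentage of the baseline value."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / abs(baseline) * 100.0


def flag_changes(baseline: dict, current: dict,
                 alert_percent: float = 25.0) -> dict:
    """Return the parameters whose change from baseline exceeds the threshold.

    alert_percent is an assumed, illustrative threshold.
    """
    flagged = {}
    for name, base_value in baseline.items():
        if name not in current:
            continue  # parameter not measured this round
        change = percent_change(base_value, current[name])
        if abs(change) > alert_percent:
            flagged[name] = round(change, 1)
    return flagged


# Hypothetical commissioning baseline and a later reading
baseline = {"vibration_mm_s": 2.0, "bearing_temp_c": 60.0}
latest = {"vibration_mm_s": 3.0, "bearing_temp_c": 63.0}
print(flag_changes(baseline, latest))  # {'vibration_mm_s': 50.0}
```

Here vibration has risen 50 percent over baseline and is flagged, while the 5 percent temperature rise is not.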