What is too little or too much data to derive custom machine learning models for Engine Shop Visit scrap predictions: The Human to Machine Journey in simple words

KeepFlying

April 3, 2025

I am going to avoid aviation-specific terms in this part of the blog to keep the message simple. I’ll elaborate on specific parts and use cases by engine type in an ongoing blog series.

Let us begin.

When we want to automate a task, we expect computers to do it for us! Let’s take a mundane (perhaps even boring) task, say, guessing when a part needs replacement or is deemed beyond economic repair. Imagine a human doing this: he or she would take cues (attribute measurements such as time in operation, current consumed per minute, etc.) from observing the part as well as from experience. He or she then combines all this information about the specific part to answer our basic question: ‘how fit is the part to continue operating?’ Naturally, he or she becomes very good at this assessment over time, as one gathers more and more experience. We humans can also transfer our experience and skill to another person, sparing them the time it took us to master the skill. Eventually a generation of engineers can anticipate part replacements and scraps even before they examine the parts; knowledge gives way to intuition, and experience gets encoded as an algorithm.

Now let's come back to our original plot: automating this task with a computer program, and a smart one at that. How can we give it the cues (they now become data features) that detail the part’s circumstances, so that the computer understands? How do we transfer our experience to the machine? This is what we call Artificial Intelligence (AI): a really smart computer program. But knowing what we want doesn’t tell us HOW TO DO IT! What enables us to implement such a program? One of the answers is obvious: Machine Learning (ML).

Figure 1: Cumulative failure probability distribution (KeepFlying). 'k' and 'lambda' are the parameters that determine the shape and scale (the range) of the distribution. These are ‘learned’ from data using regression.

ML lacks inherent intuition but can extract knowledge as effectively as humans can, and at times better. ML’s learning strategy is simple: make a rule for every decision case, such as “if the part operates at an EGT Margin deterioration slope of 3.145 degrees per X FC in a predominantly temperate environment for Y hours across Z thrust ratings, it then has an 87 percent chance of being scrapped during the next inspection”.

There is one catch though: not all the rules can be “programmed” by hand. Instead, they are drawn from a mathematical equation that can provide the rule for almost all use cases. How can this be? In this particular case the equation takes the form of a cumulative (probability) distribution function (F(t), read as ‘F’ of ‘t’), one that maps the attribute to the required probability of scrappage; all the model does is compute the value for the given time (time ‘t’ is a variable in the equation).

If I plot the probability of failure (replacement) of a part against its time in operation, I get the graphs in Fig 1. The x-axis is the operational time ‘t’ and the y-axis represents the probability.

Apart from ‘t’, there are two more variables, ‘k’ and ‘λ’ (read as ‘lambda’), that decide the shape and the scale (simply put, the range of the graph) respectively; clearly, the distribution takes a different shape for each value of k. Estimating the right values for ‘k’ and ‘λ’ is where the data helps us. This is how the ‘learning’ is done by this model. Fig 2 shows a sample Weibull distribution with a particular k and lambda learned from data. The cumulative distribution function (CDF, F(t)) gives the probability values that we see.
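To make F(t) concrete, here is a minimal Python sketch. It assumes the standard two-parameter Weibull form, F(t) = 1 - exp(-(t/λ)^k); the function name `weibull_cdf` and the sample times are illustrative only and are not taken from KeepFlying's implementation.

```python
import math


def weibull_cdf(t, k, lam):
    """Probability that a part has been replaced by operating time t.

    Assumes the standard two-parameter Weibull CDF with shape k and
    scale lam: F(t) = 1 - exp(-(t / lam) ** k).
    """
    if t <= 0:
        return 0.0  # nothing has failed before operation starts
    return 1.0 - math.exp(-((t / lam) ** k))


# Illustrative values only: shape and scale as in the Fig 2 caption.
for t in (0.5, 1.0, 1.5):
    print(t, round(weibull_cdf(t, k=2.4, lam=1.0), 3))
```

One handy property of this form: when t equals the scale λ, F(t) is 1 - 1/e (about 63%) whatever the shape k is, which is why λ is often read as a characteristic life.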

Figure 2: A Weibull distribution with shape, k = 2.4 and scale, lambda = 1 (KeepFlying). These parameters are learned from data using regression, a simple ML algorithm.
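The article does not spell out which regression is used, so the following is only one common way to do it, shown as a sketch. It relies on the fact that the Weibull CDF becomes a straight line after a transformation: ln(-ln(1 - F(t))) = k ln(t) - k ln(λ). Fitting a line to the transformed points gives k as the slope and λ from the intercept. The function name, the median-rank estimate of F (Benard's approximation) and the synthetic input are all assumptions made for illustration; the sketch also ignores parts that have not yet failed (censored data), which a real shop visit dataset would contain.

```python
import math


def fit_weibull_by_regression(failure_times):
    """Estimate (k, lam) with an ordinary least-squares straight-line fit.

    Linearisation used: ln(-ln(1 - F)) = k * ln(t) - k * ln(lam).
    F is estimated per observation with Benard's median-rank
    approximation, (i - 0.3) / (n + 0.4), which is an assumption here.
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2 or times[0] <= 0 or times[0] == times[-1]:
        raise ValueError("need at least two distinct positive failure times")

    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        f_hat = (i - 0.3) / (n + 0.4)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f_hat)))

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    k = sxy / sxx                     # slope of the fitted line is the shape
    intercept = mean_y - k * mean_x   # intercept equals -k * ln(lam)
    lam = math.exp(-intercept / k)
    return k, lam


# Synthetic sanity check (not real engine data): failure times placed
# exactly on the quantiles of a Weibull with k = 2.4 and lam = 1.0.
n = 20
synthetic_times = [
    (-math.log(1.0 - (i - 0.3) / (n + 0.4))) ** (1.0 / 2.4)
    for i in range(1, n + 1)
]
k_hat, lam_hat = fit_weibull_by_regression(synthetic_times)
print(round(k_hat, 3), round(lam_hat, 3))
```

Because the synthetic times sit exactly on the theoretical quantiles, the fit recovers the shape and scale it was built from; with real, noisy inspection logs the estimates would only approach the true values as more data comes in.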

ML uses data to capture and stand in for real human experience. Let's say our engineer kept a clear log of each instance in which he or she inspected a part, with the measurements observed and the decision taken. Data points such as these sit scattered between MS Excel spreadsheets and discrete systems.

Now the (ML) algorithm has a record of historic instances of parts being replaced in different environments under varying operational conditions, e.g., “Stage 4 HPT Guide Vane operating in a sandy environment was scrapped around 11375 FC”. ML tries to find patterns across lots of such cases. This allows the model to make predictions for unknown future cases.

How can we feel confident about an ML model? There are many metrics that tell you this in different capacities, but the basic one is model accuracy. It is objective and intuitive: a simple ratio of how many cases were correctly predicted out of the total. Say I have given the model 200 instances of the Stage 1 HPT Blades with their respective times in operation, and for each case my model responded with a probability value.

Now, how can I say that the value the model gave is correct? For this I compare the value with my historic data, as shown in Fig 3. If the value is within the tolerance, i.e., within the acceptable range for the part as observed historically, then the prediction is right. Accordingly, in our example, if the model gave ‘right’ probabilities for only 170 cases, the accuracy is the fraction 170/200, which is expressed as 85%.
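The accuracy check described above can be sketched in a few lines of Python. The function names, the historic range of 0.4 to 0.6 and the predicted probabilities are all made up for illustration; only the 170-out-of-200 split comes from the example in the text.

```python
def is_prediction_right(predicted, low, high):
    """A prediction counts as right if it falls inside the historic range."""
    return low <= predicted <= high


def accuracy(predictions, low, high):
    """Fraction of predictions that fall within the acceptable range."""
    right = sum(is_prediction_right(p, low, high) for p in predictions)
    return right / len(predictions)


# Hypothetical data: 170 predictions inside the assumed historic range
# (0.4 to 0.6) and 30 outside it, mirroring the 170/200 example.
predictions = [0.5] * 170 + [0.9] * 30
model_accuracy = accuracy(predictions, low=0.4, high=0.6)
print(f"{model_accuracy:.0%}")  # 85%
```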

Figure 3: Because the ML model simply gives us a probability, we decide objectively whether each prediction is right or wrong by filtering it against known historic statistics for the part's performance (KeepFlying).

The larger our sample space is, the more confidence we can place in the model; in general, the greater the volume of data, the better the accuracy. Fig 4 plots the model accuracy, computed as described above, against the number of data points used to estimate the model (distribution, in this case) parameters (shape and scale). Accuracy is relatively easy to increase when it is low, but the higher the value, the tougher it is to improve.

Figure 4: Part-level model accuracy increases as more data is pumped into training (parameter estimation). It becomes tougher to improve after a certain stage, called saturation (KeepFlying).

This AI engine uses one model for every part we need to predict for, each independent of the others. The overall model is a hybrid of general physics and data-driven parameter estimation. The system-level accuracy should be measured by observing its behaviour metric (also called a confusion matrix in ML parlance), as shown in Fig 5.

The example is trivial, as only nine instances are shown, but it helps to demonstrate the objective assessment. The overall accuracy is 6/9, or 66.67%, as the predictions are deemed correct on six occasions. At part level, however, accuracy varies, since the amount of data pertaining to each part may differ. ‘Part A’ has the best model, with an accuracy of 100%. ‘Part B’ has the worst model and is only one-third, i.e., 33.33 percent, accurate.

When it comes to ‘Part C’, we have just 2 instances to draw a conclusion from. Saying it is 50% accurate should not inspire confidence. The right response here is ‘not enough data’ or ‘too little data’ to compute accuracy from. Until we try out more cases with ‘Part C’, all we have is a coin-toss probability (heads or tails; one might think the model is actually tossing a coin to decide).
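The part-level assessment, including the ‘not enough data’ answer, can be sketched as below. The per-part counts (4 cases for Part A, 3 for Part B, 2 for Part C) are inferred from the percentages quoted in the text rather than read off Fig 5, and the minimum of 3 cases is a hypothetical threshold chosen only for this illustration.

```python
from collections import defaultdict


def part_level_accuracy(outcomes, min_cases=3):
    """Return accuracy per part, or None where there is too little data.

    outcomes:  iterable of (part_name, was_prediction_right) pairs.
    min_cases: hypothetical minimum number of cases before an accuracy
               figure is reported at all.
    """
    tallies = defaultdict(lambda: [0, 0])  # part -> [right, total]
    for part, was_right in outcomes:
        tallies[part][1] += 1
        if was_right:
            tallies[part][0] += 1

    report = {}
    for part, (right, total) in tallies.items():
        report[part] = right / total if total >= min_cases else None
    return report


# Nine instances, with counts inferred from the percentages in the text.
outcomes = (
    [("Part A", True)] * 4
    + [("Part B", True)] + [("Part B", False)] * 2
    + [("Part C", True), ("Part C", False)]
)

overall = sum(was_right for _, was_right in outcomes) / len(outcomes)
print(f"overall: {overall:.2%}")  # overall: 66.67%

for part, acc in part_level_accuracy(outcomes).items():
    print(part, "not enough data" if acc is None else f"{acc:.2%}")
```

Returning `None` instead of 50% for Part C keeps a coin-toss figure from being mistaken for a measured accuracy.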

Figure 5: System-level and part-level accuracy assessment (KeepFlying).
