Data Quality Part 5: An Additional Component - Sensitivity
Posted: August 30th, 2023. Authors: Eugene Y. and Aditya S.
We started this journey with the following thought: defining data quality and implementing a data quality program furthers the goal that the data collected serve their intended purpose, i.e., informed decision making. In this article, we’ll explore “sensitivity.” This one seems obvious: we want a measurement that provides meaningful data at the point that is meaningful to us. Ok, Gene, help me out again – what? Let’s do another anecdote and an analogy.
I have a friend who, in a previous life, was a plant manager at a methamphetamine manufacturing facility. Obviously, it is very important, and of great interest to lots of folks beyond the manufacturers, that they account for all the methamphetamine, all the time. They produce their product in 20-kilogram batches, and periodically remove a small sample (1-5 grams) for analysis. During a regulatory audit, the auditors wanted to know where the small sample came from. You can see where this is going: we took 1 gram out of the 20-kilogram barrel, the barrel still weighs 20 kilograms, so why doesn’t this add up? We know that this is all about measurement error; that 1 gram is undetectable with the tools that were used to weigh the 20-kilogram drum. There are two ways to look at this: 1) the 1-gram sample is negligible (within the measurement error) relative to the 20-kilogram batch, or 2) the method used to measure 20 kilograms is not sensitive enough to answer the question (which is inventory control to the nearest gram).

This story highlights the sensitivity of measurement, and the question of what a number means in terms of range, or imprecision, or whatever. We use a shortcut called “significant figures” to try to convey that information. This is a bit of a rabbit hole, and I’m not going there in detail, but let’s think about what “20 kilograms” means: Is it 20 but not 21? Is it 20.00, but not 20.01? As a shortcut to understanding and defining the statistical values, we use significant figures, and within that convention, “20” is different from “20.00.”
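To make the “what does 20 kilograms mean” question concrete, here’s a toy sketch (our illustration, not a formal significant-figures treatment): read the implied uncertainty of a reported value as plus-or-minus half the place value of the last digit shown. The function name and the half-digit rule are simplifying assumptions, not a standard.

```python
# Toy illustration: implied uncertainty of a reported value, taken as
# +/- half the place value of the last digit shown. Not a rigorous
# sig-fig analysis -- just enough to see why "20" differs from "20.00".
def implied_uncertainty(reported: str) -> float:
    """Half of the place value of the last digit in the reported string."""
    if "." in reported:
        decimals = len(reported.split(".")[1])
        return 0.5 * 10 ** (-decimals)
    # No decimal point: we read the last digit shown as the ones place.
    return 0.5

for value in ["20", "20.0", "20.00"]:
    u = implied_uncertainty(value)
    lo, hi = float(value) - u, float(value) + u
    print(f"'{value}' kg suggests somewhere in [{lo}, {hi}] kg")
```

Note that even the tightest of these ranges, “20.00” kg (±5 grams), still swallows the 1-gram sample from the anecdote: a scale reporting to two decimal places of a kilogram can’t do inventory control to the nearest gram.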
Another analogy: I buy a new power tool, which requires assembly, and I drop (and lose) one of the screws required to assemble it. Hi-ho, hi-ho, I’m off to Home Depot 🎵. But let’s look at this question in a little more detail:
- Where is my replacement screw: Home Depot
- Where is Home Depot: In the shopping center at the Braker exit from MoPac
- Where is Home Depot: At the north end of the shopping center
- Where is the door to Home Depot: South-facing side of the building
- Where are the screws: Aisle 17
- Where are the specialty screws: In the specialty screw drawers in Bays 8 and 10
- Where is my replacement screw: Second column from the right, third drawer down
- Where is my replacement screw: Far right side of the drawer, halfway back
It’s kind of a silly analogy, but it demonstrates the point of having a sensitive enough measurement. And of asking the questions within the correct range (doesn’t do me any good to look for a specialty screw in the plumbing aisle).
And if you’ve made it this far, here are a couple of cool links showing relative scales:
Let’s unpack that screw analogy a little bit. As we can see, the question “where is the screw?” has lots of correct answers. And each of them is useful at some point (other than maybe “lost under my workbench”). Let’s see if we can relate this to an environmental measurement question: “how much chromium is in the groundwater?” This question can likewise have lots of answers. If the true value is 21.4 ppm, then the following are all true as well: <100, 20±10, 21±1, >1. If they’re all correct, how do we choose? And remember, we’re not choosing from these answers; we’re choosing a set of measurement tools and data quality objectives. So we go back to the beginning: what’s the underlying question? We need to know the intended data use, define the measurement approach, and then define the quality activities overlaid on the measurements. If the question is “Is chromium below 150 ppm?”, then three of those results support the conclusion of “yes” and one doesn’t answer the question. If the question is “Is chromium below 150 ppb?”, three of those results support the conclusion of “no” and one doesn’t answer the question.
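The chromium reasoning above can be sketched in a few lines of Python (a hypothetical helper of our own, not part of any standard package): treat each reported answer as an interval, and say a “below the limit?” question is answered only when the whole interval lands on one side of the limit.

```python
# Sketch: a "below the limit?" question is answered only if the entire
# reported interval falls on one side of the limit.
def answers_question(low: float, high: float, limit: float) -> str:
    if high < limit:
        return "yes"           # entire range is below the limit
    if low > limit:
        return "no"            # entire range is above the limit
    return "inconclusive"      # the limit falls inside the reported range

# The four "correct" answers for a true value of 21.4 ppm, as intervals (ppm):
answers = {
    "<100":  (0.0, 100.0),
    "20±10": (10.0, 30.0),
    "21±1":  (20.0, 22.0),
    ">1":    (1.0, float("inf")),
}
for label, (lo, hi) in answers.items():
    print(label, "vs 150 ppm ->", answers_question(lo, hi, 150.0))
    print(label, "vs 150 ppb ->", answers_question(lo, hi, 0.150))  # 150 ppb = 0.15 ppm
```

Against 150 ppm this yields three “yes” and one “inconclusive” (>1); against 150 ppb, three “no” and one “inconclusive” (<100), matching the counts above.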
The end game for this quality component, and actually for the whole measurement system, is that the measurement needs to be sensitive enough to tell a “pass” from a “fail.” It follows that the measurement system needs to be most robust at the pass/fail point. And if the measurement is most robust there, it is necessarily less robust at values that don’t bear on pass/fail, so data generated away from that point are less certain. Said another way, properly executed measurements are valid and usable for their intended purpose, and may not be valid and usable for other purposes.
Let’s talk about “data mining.” Data mining is the idea of digging through previous measurements for a new purpose. And this gets real thick, real quick. So, one last analogy. Let’s think about my lost screw. Before I go to Home Depot, I’m going to check my handy-dandy collection of leftover screws. Maybe I find the exact thing I’m looking for (same thread, same head, same color). Maybe I find one that will work but isn’t quite right (longer, wrong color). And maybe I find one that’s actually a little bit poor (shorter, wrong head shape). And maybe I don’t find anything that works at all. When I “mined” for a screw, I didn’t have any control over what I might find; the only control I had was to understand the limitations relative to my final need.
Switching back to a measurement: presumably, any measurement in an existing data set is robust at the pass/fail point of the original investigation and, of course, less robust at any other point. If our new purpose is interested in a different level (either a different absolute value or a different level of data quality), then the existing (mined) data are less robust (and less useful) at this new level. The uncertainty in this mined data set is necessarily greater than it would be if we had defined a measurement program (as opposed to a mining program). And the mined data might be unacceptable (screw too short) or inappropriate (screw with the wrong head or wrong color) for the purpose, the intended use, the informed decision-making that triggered the data mining exercise in the first place. Bottom line: data mining has to be done carefully, with an eye toward both the original intended use and the new data use.
Next Time: Wrapping a big bow around data quality
Until then, feel free to contact either of us: