How to answer the question “What sample size do we need?” as an R&D scientist.

Before running an experiment, you or someone from Quality will inevitably ask the sample size question.

You could regurgitate this or use this calculator.

However, it’s still confusing.

Here’s how to answer it in plain language yet with “confidence”.

So how do **YOU** answer this question?

Transcript:

Here is the most popular question of all time: What should my sample size be? How many samples do I need? How many runs should I make? How can I be confident in my results so that the p-value is less than 0.05? Well, here’s some good news for you: here’s one way to address that question, whether it comes from Quality, QA, or anyone else, as a development scientist when you do not have all the answers. Let’s just say that the answer depends on certain assumptions. There have been a lot of attempts to use rules of thumb to arrive at a single answer, and in the past I’ve even created spreadsheets to answer this question, which turned out to be useless.

So here it is: what do we need to know to answer the sample size question, N? Here’s a one-tailed sample size equation, and these are the factors that go into it. So what do you need? First, you need to know the delta: the practical or physical difference in the mean response that counts as a meaningful difference to detect. At what point should I care about a change in the critical attributes or quality attributes? That difference, the delta in the Y, is what you need to define first. Then you need to know the population standard deviation, which is unknown because you have not yet run enough samples or studies. So that’s an assumption. How about Z alpha and Z beta? These come from the alpha risk and the beta risk, basically the risk that the patient is willing to take and the risk that the company is willing to take. These are usually 0.05 and 0.1, which is the industry standard, but those values are not fixed: cost is a managing factor, and they could be lower or higher depending on how much risk the patient or the company is willing to take. So in general, the smaller the delta, the larger N becomes, because delta is on the bottom of the equation. The larger the standard deviation, the larger the sample size becomes, because it is on the top.
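
The equation shown in the video is not reproduced in the transcript. As a minimal sketch, assuming the common one-sided, one-sample normal-approximation form n = ((z_alpha + z_beta) * sigma / delta)^2, which matches the description above (delta on the bottom, standard deviation on the top), the calculation could look like this in Python. The function name `one_tailed_sample_size` and the example values of delta and sigma are illustrative assumptions, not values from the talk.

```python
import math
from statistics import NormalDist


def one_tailed_sample_size(delta, sigma, alpha=0.05, beta=0.10):
    """Approximate N for a one-sided, one-sample comparison of a mean.

    Assumed formula (not taken from the video):
        n = ((z_alpha + z_beta) * sigma / delta) ** 2
    delta: smallest difference in the mean response worth detecting
    sigma: assumed population standard deviation
    alpha, beta: the two risks; 0.05 and 0.10 are the defaults cited in the talk
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    if not (0 < alpha < 1 and 0 < beta < 1):
        raise ValueError("alpha and beta must be between 0 and 1")

    # Standard normal quantiles for the chosen risks.
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)

    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole number of samples


if __name__ == "__main__":
    print(one_tailed_sample_size(delta=1.0, sigma=1.0))  # 9
    print(one_tailed_sample_size(delta=0.5, sigma=1.0))  # 35: halving delta roughly quadruples N
    print(one_tailed_sample_size(delta=1.0, sigma=2.0))  # 35: doubling sigma does the same
```

Because sigma is usually a guess at this stage, it may be worth running the function over a plausible range of sigma values rather than trusting a single number.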

If the process is out of control (remember, we talked about how we should test the stability of the process before we run any designed experiments), then the more out of control the process is, the larger the sample size you’ll need, because you’ll have more noise. The closer to the optimum, the larger the sample size N; the lower the risk, the larger the N; it keeps going up. So there you have it: if someone asks the sample size question, you tell them that you need to know the answers to these questions first. The best answer for me would be: the sample size depends on the answer from the scientist, who knows the implications of what that process parameter or attribute will do in the end. So, because the delta is so important, you need to define it before you run your experiments.
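
To make the "lower the risk, larger the N" and "more noise, larger the N" trends concrete, here is a small sketch that tightens the alpha and beta risks and increases the noise. It assumes a one-sided normal-approximation formula, n = ((z_alpha + z_beta) * sigma / delta)^2, which is a common textbook form and not necessarily the exact equation from the video. The helper name `n_for_risk`, the risk levels, and the sigma/delta ratios are all hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def n_for_risk(alpha, beta, sigma_over_delta=1.0):
    """N from the assumed formula n = ((z_alpha + z_beta) * sigma / delta)**2, rounded up."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    return math.ceil(((z(1 - alpha) + z(1 - beta)) * sigma_over_delta) ** 2)


# Hypothetical scenarios: looser to tighter risks, and a quiet versus a noisy process.
risk_levels = [(0.10, 0.20), (0.05, 0.10), (0.01, 0.05)]
noise_levels = [1.0, 2.0]  # sigma/delta ratios

table = {
    (alpha, beta, ratio): n_for_risk(alpha, beta, ratio)
    for alpha, beta in risk_levels
    for ratio in noise_levels
}

for (alpha, beta, ratio), n in table.items():
    print(f"alpha={alpha:.2f} beta={beta:.2f} sigma/delta={ratio:.1f} -> N={n}")
```

Reading down the printed rows, N rises as the risks shrink and rises again when the noise doubles, which is the pattern described above.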