Currently this page is for archiving common questions from my classes.
I will redesign this page for a better viewing experience soon. Thank you for your patience.
If I set the level (-, +) of one factor much wider than the rest of the factors, won’t the effect of that factor increase? When I don’t know the appropriate level setting in the beginning, what do I do?
Level settings should be representative and practical. Normalization to (-1, +1) takes care of this to some extent, but if the level setting for one factor is, in a relative sense, much bolder than for the other factors, that factor will appear more significant.

For screening designs, my advice has always been to err on the side of setting all factors as boldly as possible; this decreases the likelihood of imbalance in level settings across factors. One test of boldness: if, for a factor you predict will show up as insignificant, you would suggest setting the levels farther apart in the next study, then you should have set them farther apart originally. So predictions can help!

This also depends on what you are trying to do with the DOE. If you are trying to explain a phenomenon that is already occurring, then representative level settings are a good idea. If you are in early design or looking for solutions, representative settings are not necessary and the advice is "bold but reasonable."

I would caution, however, that if a level setting is really bold, you may run into nonlinearity, which can make a factor appear insignificant when it really is not. So the "reasonable" part of "bold but reasonable" is also important. That is where predictions are very useful, though tricky when prior knowledge is very low.
Lenth, R.V. (1989). "Quick and Easy Analysis of Unreplicated Factorials," Technometrics, 31, 469-473.
NIST/SEMATECH e-Handbook of Statistical Methods, "Half-Normal Probability Plot," http://www.itl.nist.gov/div898/handbook/pri/section5/pri598.htm
Daniel, C. (1959). "Use of Half-Normal Plots in Interpreting Factorial Two-Level Experiments," Technometrics, 1(4), 311-341.
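As a concrete illustration of the Lenth reference above, here is a minimal sketch of his pseudo standard error (PSE) for judging which effects in an unreplicated factorial are active. The effect values below are hypothetical, not from any real experiment.

```python
import numpy as np

def lenth_pse(effects):
    """Lenth's pseudo standard error (PSE) for unreplicated factorial effects."""
    abs_eff = np.abs(np.asarray(effects, dtype=float))
    s0 = 1.5 * np.median(abs_eff)          # initial robust scale estimate
    trimmed = abs_eff[abs_eff < 2.5 * s0]  # drop effects that look active
    return 1.5 * np.median(trimmed)        # re-estimate from the "inert" effects

# Hypothetical effect estimates from an unreplicated 2^3 design
effects = [21.5, 1.2, -0.8, 9.7, 0.4, -1.1, 0.6]
pse = lenth_pse(effects)
print(pse)  # effects much larger than the PSE are candidates for "active"
```

Effects are then compared against a margin of error built from the PSE, which is essentially what a half-normal plot shows graphically.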
- Experimental optimization is an iterative process; that is, the data collected in one set of experiments yield fitted models that indicate where to search for improved operating conditions in the next set of experiments. Thus, the coefficients in the fitted equations (or the form of the fitted equations themselves) may change during the optimization process. This is in contrast to classical optimization, in which the functions to be optimized are assumed fixed and given.
- The response models are fit from experimental data that usually contain random variability due to uncontrollable or unknown causes. This implies that an experiment, if repeated, will result in a different fitted response surface model that might lead to different optimal operating conditions. Therefore, sampling variability should be considered in experimental optimization. In contrast, in classical optimization techniques the functions are deterministic and given.
- The fitted responses are local approximations, implying that the optimization process requires input from the experimenter (a person familiar with the process). This is in contrast with classical optimization, which is typically automated in the form of a computer algorithm.
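One iteration of this process can be sketched in a few lines: fit a local first-order model to a small coded design, then move along the path of steepest ascent. The design and response values below are made up for illustration.

```python
import numpy as np

# Coded 2^2 factorial with one center run (hypothetical data)
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1], [0, 0]], dtype=float)
y = np.array([54.0, 60.0, 58.0, 66.0, 59.5])  # hypothetical responses

# Fit the local model y = b0 + b1*x1 + b2*x2 by least squares
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coef

# Path of steepest ascent: step in coded units proportional to (b1, b2)
direction = np.array([b1, b2]) / np.hypot(b1, b2)
print(b1, b2, direction)
```

The next set of experiments would be run along `direction`, refit, and the cycle repeated; with noisy data, the fitted coefficients (and hence the path) change from round to round, which is the point of the bullets above.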
Simply put: the number of values in a data set that are free to vary while the value of the statistic stays fixed.

For instance, if x + y + z = 10, you can set any two of the values at random, but then you have no choice (no freedom) about the third, because changing it would change the value of the statistic (Σ = 10). So in this case df = 2.
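The sum example above, as a trivial sketch in Python:

```python
# The statistic (here the sum) is fixed at 10, which constrains the last value:
total = 10
x, y = 3.0, 4.0      # chosen freely: two degrees of freedom
z = total - x - y    # forced by the constraint: no freedom left
print(z)             # 3.0
```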
A saturated model has as many estimated parameters as data points. By definition this leads to a perfect fit (R^2 = 1), but it is of little use statistically, because no degrees of freedom are left to estimate the error variance.
For example, if you have 8 data points and fit a 7th-order polynomial to the data, you would have a saturated model (one parameter for each of the 7 powers of your independent variable plus one for the constant term).
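The polynomial example can be checked directly: fitting a 7th-order polynomial to 8 arbitrary points interpolates them exactly, so the residuals are (numerically) zero and R^2 = 1. The data here are just random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(8, dtype=float)
y = rng.normal(size=8)             # 8 arbitrary data points

coef = np.polyfit(x, y, deg=7)     # 8 parameters: a saturated model
resid = y - np.polyval(coef, x)

# Perfect interpolation: residual sum of squares is numerically zero
ss_res = np.sum(resid**2)
ss_tot = np.sum((y - y.mean())**2)
print(1 - ss_res / ss_tot)         # R^2, essentially 1
```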
DOE and regression use the same math for the analysis, and the same diagnostics to check the residual assumptions. The key difference between DOE and generic regression is that DOE involves control of the factors that generate changes in the responses; regression is simply finding a mathematical relationship between variables. Ideally, in a DOE, one tries to keep the changes in the factors independent of each other. Regression methods without the benefit of design and forethought will have less power than a comparable controlled, designed experiment.
If you only have historical data and want to see CORRELATION, use Regression.
If you can change parameters and want to predict your process (by understanding CAUSATION), use DOE.
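To see why keeping factor changes independent buys power, compare the coefficient variances (the diagonal of (X'X)^-1, up to sigma^2) from an orthogonal 2^2 design with those from correlated "historical" settings. Both data sets below are made up.

```python
import numpy as np

# Columns: intercept, factor 1, factor 2
X_doe = np.array([[1, -1, -1], [1, 1, -1], [1, -1, 1], [1, 1, 1]], dtype=float)
# Historical data: the two factors moved together, so they are highly correlated
X_obs = np.array([[1, -1, -0.9], [1, 1, 0.8], [1, -1, -1.0], [1, 1, 1.1]])

# Coefficient variances are proportional to diag((X'X)^-1)
var_doe = np.diag(np.linalg.inv(X_doe.T @ X_doe))
var_obs = np.diag(np.linalg.inv(X_obs.T @ X_obs))
print(var_doe, var_obs)
```

With the same number of runs, the correlated data inflate the slope variances by well over an order of magnitude here, i.e., far less power to detect the same effect.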