RUN A FEW; PREDICT THE REST

Dave Doehlert

[---------------------------------------------------------------------]

Factorial designs got us started on design of experiments (DOE) in the 1920s. They are still valuable to us, but in many cases they are unnecessarily expensive.

Remember that the main idea when you do DOE on a process is: run a few; predict the rest.

The runs in an experiment are the expensive part; the predictions are done by electronic computing and so they are the inexpensive part. We want to minimize the number of runs and from the runs predict the performance of as many combinations of factors as we can, to get the most for our money.

Let's start small, where the classical experiment designs of the 1920s are economical.

For two factors, the factorial design is the corners of the square:

     o-----o
     |     |        We run a few: 4
     |     |        We predict the rest: all the points inside
     o-----o        the square.      

We generally don't try to predict outside the square, because predictions beyond the region of the data (extrapolations) are riskier.

The 4-point design above has big leverage: run 4 and predict at thousands of places inside the square.

How is that prediction done? By linear interpolation. That often works well, and you can check it in your own applications: run an extra point inside the square and see whether the prediction (by linear interpolation) at that point matches the data. If it does not, curvilinear interpolation is available; it was introduced by Box in 1951. Our comments in this short note apply to both, but for now we focus only on linear interpolation from corner data.

To do linear interpolation from the four corners, we can use:

     Y = (b0) + (b1)(x1) + (b2)(x2) + (b12)(x1)(x2)

where the b's are computed from the data so that the model reproduces the observed Y at each corner. With four coefficients and four corners, the fit is exact.
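As a minimal sketch of that computation, the Python below solves for the four b's from four corner responses and then predicts inside the square. It assumes the factors are coded so that the corners sit at -1 and +1, and the response values are made up for illustration; the names `model_matrix_2` and `predict_2` are hypothetical helpers, not part of any package.

```python
import numpy as np

# Corner settings in coded units (-1 = low level, +1 = high level).
corners = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)

# Hypothetical responses observed at the four corners (illustration only).
y_obs = np.array([10.0, 14.0, 12.0, 20.0])


def model_matrix_2(points):
    """Columns 1, x1, x2, x1*x2 for each row of `points`."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    x1, x2 = pts[:, 0], pts[:, 1]
    return np.column_stack([np.ones(len(pts)), x1, x2, x1 * x2])


# Four equations, four unknowns: solve for b0, b1, b2, b12 exactly.
b = np.linalg.solve(model_matrix_2(corners), y_obs)


def predict_2(points):
    """Interpolated Y at any point(s) inside the square."""
    return model_matrix_2(points) @ b


center_pred = predict_2([0.0, 0.0])[0]
print("coefficients b0, b1, b2, b12:", b)
print("prediction at the center:", center_pred)
```

Comparing `center_pred` with the result of an actual run at the center of the square is one way to carry out the check suggested earlier: a large disagreement hints that linear interpolation is not enough.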

Now step up to 3 factors and the 8 corners of the cube:

                          o----o
                         /|   /|
                        o----o |
                        | o--|-o
                        |/   |/
                        o----o      

Now the linear interpolator is:

     Y = (b0) + (b1)(x1) + (b2)(x2) + (b3)(x3)
         + (b12)(x1)(x2) + (b13)(x1)(x3) + (b23)(x2)(x3)
         + (b123)(x1)(x2)(x3)

Choose the b's (by regression analysis) so that the model reproduces the observed Y's at the 8 corners.
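The same calculation for the cube can be sketched as follows. Again the corners are assumed to be coded as -1 and +1, the eight responses are invented for illustration, and `full_model_matrix` is a hypothetical helper name. With 8 runs and 8 coefficients the regression reduces to solving a square linear system.

```python
from itertools import combinations, product

import numpy as np

# The 8 corners of the cube in coded units.
corners = np.array(list(product([-1.0, 1.0], repeat=3)))

# Hypothetical responses at the 8 corners (illustration only).
y_obs = np.array([10.0, 12.0, 11.0, 15.0, 13.0, 16.0, 14.0, 21.0])


def full_model_matrix(points):
    """Columns: 1, x1, x2, x3, x1x2, x1x3, x2x3, x1x2x3."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    cols = [np.ones(len(pts))]
    for order in (1, 2, 3):
        for idx in combinations(range(3), order):
            # Product of the chosen factor columns gives one model term.
            cols.append(np.prod(pts[:, list(idx)], axis=1))
    return np.column_stack(cols)


X = full_model_matrix(corners)   # 8 runs by 8 coefficients
b = np.linalg.solve(X, y_obs)    # exact fit at every corner

fitted = X @ b
center_pred = full_model_matrix([0.0, 0.0, 0.0])[0] @ b
print("coefficients:", np.round(b, 3))
print("prediction at the center of the cube:", center_pred)
```

At the center of the cube every term except b0 vanishes, so the prediction there is simply the average of the eight corner responses.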

At this point budget considerations step in. Do we really need the complication of that 8th coefficient (b123)? Maybe not. In most industrial problems, experience shows that the three-factor interaction x1x2x3 is probably not needed in the model.

If we do without it, can we spend less money on this project? Yes.

Now (since 1991) we know how to place only 7 trials on the cube in a way that will get us good interpolations using the model above but without the last term.
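This note does not give the actual placement of those 7 trials, so the sketch below does not reproduce it. It only checks the counting argument: the model without the last term has 7 coefficients, so 7 well-placed runs can determine them. As a stand-in, it simply leaves out one corner of the cube; that choice is an assumption made for illustration, not the 1991 design, and the coefficient values are invented.

```python
from itertools import combinations, product

import numpy as np

corners = np.array(list(product([-1.0, 1.0], repeat=3)))
runs = corners[:-1]        # illustrative choice: skip corner (+1, +1, +1)
left_out = corners[-1]


def reduced_model_matrix(points):
    """Columns: 1, x1, x2, x3, x1x2, x1x3, x2x3 (no x1x2x3 term)."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    cols = [np.ones(len(pts))]
    for order in (1, 2):
        for idx in combinations(range(3), order):
            cols.append(np.prod(pts[:, list(idx)], axis=1))
    return np.column_stack(cols)


X = reduced_model_matrix(runs)          # 7 runs by 7 coefficients
rank = np.linalg.matrix_rank(X)

# Hypothetical "true" coefficients, used only to generate test responses.
true_b = np.array([14.0, 3.0, 2.0, -1.0, 1.0, 0.5, 0.0])
y_obs = X @ true_b

b = np.linalg.solve(X, y_obs)           # recover the 7 coefficients
pred_left_out = reduced_model_matrix(left_out)[0] @ b
print("rank of the 7-run model matrix:", rank)
print("prediction at the corner that was not run:", pred_left_out)
```

If the three-factor interaction really is negligible, the 7 runs recover the coefficients and predict the corner that was never run. A purpose-built design would also aim for good prediction precision, which this sketch does not address.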

Dropping from 8 to 7 runs can be a big savings if runs take three months each. But often this savings is trivial and experimenters will go for all 8.

The savings get bigger when you go to more factors:

k factors     All corners     Just enough points*     Savings %
     3              8                  7                 12%
     4             16                 11                 31%
     5             32                 16                 50%
     6             64                 22                 66%
     7            128                 29                 77%
     8            256                 37                 86%
     9            512                 46                 91%
    10           1024                 56                 95%
    11           2048                 67                 97%
    12           4096                 79                 98%      

*Just enough points means the number of points in the design is just equal to the number of coefficients in the interaction model. Fewer points and you could not compute the coefficients at all.
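The counts in the table can be reproduced with a few lines of Python. The sketch assumes the interaction model consists of one constant, k main effects, and k(k-1)/2 two-factor interactions, which is how the "just enough points" column adds up; `run_counts` is a hypothetical helper name.

```python
from math import comb


def run_counts(k):
    """Return (all corners, just enough points, savings in percent)."""
    all_corners = 2 ** k
    # One constant, k main effects, and one term per pair of factors.
    just_enough = 1 + k + comb(k, 2)
    savings_pct = 100.0 * (1 - just_enough / all_corners)
    return all_corners, just_enough, savings_pct


print("k factors  All corners  Just enough points  Savings %")
for k in range(3, 13):
    all_corners, just_enough, savings_pct = run_counts(k)
    print(f"{k:9d}  {all_corners:11d}  {just_enough:18d}  {savings_pct:8.1f}%")
```

The full factorial doubles with every added factor, while the number of coefficients grows only with the square of k, which is why the savings climb so quickly.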

This table is for models with a constant b0 and terms like (bi)(xi) and (bij)(xi)(xj). The latter are two-factor interactions (2FIs), with coefficients such as b12. If your problems might need 3FIs (like b123), 4FIs, etc., then a design can be computed for you.
