TRADE-OFF STUDY SAMPLE SIZE: HOW LOW CAN WE GO?
By Dick McCullough
Table 1.
| | Attributes | Levels | Parameters | Tasks | df | Sample size |
|---|---|---|---|---|---|---|
| CBC/HB | | | | | | |
| Data Set 1 | 4 | 14 | 11 | 8 | -3 | 612 |
| Data Set 2 | 6 | 17 | 12 | 18 | +6 | 422 |
| Data Set 3 | 5 | 25 | 21 | 12 | -9 | 444 |
| CVA, HB-Reg | | | | | | |
| Data Set 1 | 6 | 24 | 19 | 30 | +11 | 2,400 |
| Data Set 2 | 4 | 9 | 6 | 10 | +4 | 431 |
| Data Set 3 | 6 | 13 | 8 | 16 | +8 | 867 |
| ACA, ACA/HB | | | | | | |
| Data Set 1 | 25 | 78 | 54 | | | 782 |
| Data Set 2 | 5 | 24 | 20 | | | 500 |
| Data Set 3 | 17 | 63 | 47 | | | 808 |
Notice in Table 1 above that the number of parameters and the number of tasks are somewhat correlated with trade-off technique. CBC/HB data sets tended to have fewer degrees of freedom (number of tasks minus number of parameters) than CVA data sets. ACA data sets had a much greater number of parameters than either CBC/HB or CVA data sets. These correlations occur quite naturally in the commercial sector. Historically, choice models have been estimated at the aggregate level, while CVA models have been estimated at the individual level. By aggregating across respondents, choice study designers could afford to use fewer tasks than would be necessary for estimating individual level conjoint models. Hierarchical Bayes methods allow for the estimation of individual level choice models without making any additional demands on the study's experimental design. A major benefit of ACA is its ability to accommodate a large number of parameters.
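As a quick check on the df column, the following minimal sketch recomputes degrees of freedom (tasks minus parameters) from the Table 1 figures. The dictionary layout and the function name `degrees_of_freedom` are illustrative assumptions, not part of the original study; the ACA rows are omitted because Table 1 lists no task counts for them.

```python
# (technique, data set number) -> (parameters, tasks), copied from Table 1.
# The data structure itself is an illustrative assumption.
TABLE_1 = {
    ("CBC/HB", 1): (11, 8),
    ("CBC/HB", 2): (12, 18),
    ("CBC/HB", 3): (21, 12),
    ("CVA, HB-Reg", 1): (19, 30),
    ("CVA, HB-Reg", 2): (6, 10),
    ("CVA, HB-Reg", 3): (8, 16),
}


def degrees_of_freedom(parameters, tasks):
    """Degrees of freedom as defined in the text: tasks minus parameters."""
    return tasks - parameters


df_by_data_set = {
    key: degrees_of_freedom(parameters, tasks)
    for key, (parameters, tasks) in TABLE_1.items()
}

for (technique, data_set), df in df_by_data_set.items():
    print(f"{technique} Data Set {data_set}: df = {df:+d}")
```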
For each data set, models were estimated using randomly drawn subsets of the total sample, at sample sizes of 200, 100, 50 and 30. In the cases of ACA and CVA, no new utility estimation was required, since each respondent's utilities are a function of just that respondent. However, for CBC/HB, HB-Reg and ACA/HB, new utility estimations occurred for each draw, since each respondent's utilities are a function of not only that respondent, but also the "total" sample. For each sample size, random draws were replicated up to 30 times, with the number of replicates increasing as sample size decreased: 5 replicates for n=200, 10 for n=100, 20 for n=50 and 30 for n=30. The intent here was to stabilize the estimates to get a true sense of the accuracy of models at that sample size.
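The replication scheme described above can be sketched as follows. This is only an illustration under stated assumptions: respondents are identified by integer IDs, each draw is taken without replacement from the total sample, and the names `REPLICATES` and `draw_subsamples` are hypothetical. The paper does not describe the actual sampling code.

```python
import random

# Replicates per sub-sample size, as stated in the text.
REPLICATES = {200: 5, 100: 10, 50: 20, 30: 30}


def draw_subsamples(respondent_ids, replicates=REPLICATES, seed=0):
    """Return {sample size: list of random respondent subsets}.

    Assumption: each replicate is an independent draw, without
    replacement, from the full list of respondent IDs.
    """
    rng = random.Random(seed)
    draws = {}
    for size, n_replicates in replicates.items():
        draws[size] = [rng.sample(respondent_ids, size) for _ in range(n_replicates)]
    return draws


# Example using the total sample size of CBC/HB Data Set 1 from Table 1.
respondent_ids = list(range(612))
draws = draw_subsamples(respondent_ids)
total_draws = sum(len(subsets) for subsets in draws.values())
print(total_draws)  # number of sub-sample draws for one data set
```

For HB-based techniques, each of these draws would then be re-estimated, whereas for ACA and CVA the already estimated individual utilities could simply be subset.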
Since it was anticipated that many, if not all, of the commercial data sets analyzed in this paper would not contain holdout choice tasks, models derived from reduced samples were compared to models derived from the total sample. That is, to evaluate how well a smaller sample size performed, 10 first choice simulations were run for both the total sample model and each of the reduced sample models, with the total sample model serving to generate surrogate holdout tasks. Mean absolute errors (MAEs) were thus the measure by which models were evaluated, each sub-sample model being compared to the total sample model. In all, 990 models (5 techniques x 3 data sets x 66 sample size/replicate combinations) were estimated and evaluated, and 9,900 simulations (990 models x 10 simulations) were run as the basis for the MAE estimates.
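To make the error measure concrete, here is a minimal sketch of an MAE calculation between the simulated shares from a sub-sample model and those from the total sample model. The share values are invented for illustration, and the function name `mean_absolute_error` is an assumption; the paper does not specify how the simulator output was stored.

```python
def mean_absolute_error(sub_sample_shares, total_sample_shares):
    """Average absolute difference between two sets of simulated shares."""
    if len(sub_sample_shares) != len(total_sample_shares):
        raise ValueError("share vectors must have the same length")
    differences = [
        abs(sub - total)
        for sub, total in zip(sub_sample_shares, total_sample_shares)
    ]
    return sum(differences) / len(differences)


# Hypothetical first choice shares (percent) for one simulation with four products.
total_sample_shares = [40.0, 30.0, 20.0, 10.0]  # surrogate holdout
sub_sample_shares = [43.0, 28.0, 18.0, 11.0]    # reduced sample model

mae = mean_absolute_error(sub_sample_shares, total_sample_shares)
print(mae)  # (3 + 2 + 2 + 1) / 4 = 2.0
```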
Additionally, correlations were run, at the aggregate level, between the mean utilities from each of the sub-sample models and those from the total sample model. Correlation results are reported in the form 100 * (1 - r-squared) and referred to, for the remainder of this paper, as mean percentage of error (MPE).
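The MPE measure can be illustrated with a short sketch. The utility values below are made up, and the helper names `pearson_r` and `mean_percentage_error` are assumptions chosen for this example; only the formula 100 * (1 - r squared) comes from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


def mean_percentage_error(sub_sample_utils, total_sample_utils):
    """MPE as defined in the text: 100 * (1 - r squared)."""
    r = pearson_r(sub_sample_utils, total_sample_utils)
    return 100.0 * (1.0 - r * r)


# Hypothetical mean utilities for five attribute levels.
total_sample_utils = [0.8, -0.2, -0.6, 0.5, -0.5]
sub_sample_utils = [0.7, -0.1, -0.6, 0.6, -0.6]

mpe = mean_percentage_error(sub_sample_utils, total_sample_utils)
print(round(mpe, 2))
```

An MPE of 0 means the sub-sample mean utilities are perfectly linearly related to the total sample mean utilities; larger values indicate weaker agreement.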
It should be noted that there is an indeterminacy inherent in conjoint utility scaling that makes correlation analysis potentially meaningless. Therefore, all utilities were scaled so that the levels within each attribute summed to zero (effects coding). This allowed for meaningful correlation analysis to occur.
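The rescaling step can be sketched as below. The nested dictionary layout, the attribute and level names, and the function name `center_within_attribute` are all illustrative assumptions. The example shows two utility sets that differ only by an arbitrary constant per attribute becoming identical once each attribute's levels are centered to sum to zero.

```python
def center_within_attribute(utilities):
    """Shift each attribute's level utilities so that they sum to zero."""
    centered = {}
    for attribute, levels in utilities.items():
        mean = sum(levels.values()) / len(levels)
        centered[attribute] = {level: u - mean for level, u in levels.items()}
    return centered


# Hypothetical utilities: model_b equals model_a plus a constant per attribute.
model_a = {
    "brand": {"A": 1.0, "B": 3.0, "C": 2.0},
    "price": {"low": 4.0, "high": 0.0},
}
model_b = {
    "brand": {"A": 11.0, "B": 13.0, "C": 12.0},
    "price": {"low": -1.0, "high": -5.0},
}

centered_a = center_within_attribute(model_a)
centered_b = center_within_attribute(model_b)
print(centered_a["brand"])  # levels now sum to zero within the attribute
```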
Sample Bias Analysis
Because each sub-sample was compared to a larger sample of which it was itself a part, there was a sample bias inherent in the calculation of the error terms.
Several studies using synthetic data were conducted to determine the magnitude of the sample bias and develop correction factors to adjust the raw error terms for sample bias.
Sample Bias Study 1
For each of four different scenarios, random numbers between 1 and 20 were generated 10 times for two data sets of sample size 200. In the first scenario, the first 100 data points were identical for the two data sets and the last 100 were independent of one another. In the second scenario, the first 75 data points were identical for the two data sets and the last 125 were independent of one another. In the third scenario, the first 50 data points were identical for the two data sets and the last 150 were independent of one another. And in the last scenario, the first 25 data points were identical for the two data sets and the last 175 were independent of one another.
The correlation between the two data sets, r, approximately equals the degree of overlap, n/N, between the two data sets (Table 2).
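The design of Sample Bias Study 1 can be approximated with a small simulation. This is a sketch under assumptions, not the original procedure: the random numbers are taken to be uniform integers from 1 to 20, "generated 10 times" is read as 10 replications whose correlations are averaged, and the function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


def overlap_correlation(n_shared, n_total=200, replications=10, seed=0):
    """Mean correlation between two data sets sharing their first n_shared points."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(replications):
        shared = [rng.randint(1, 20) for _ in range(n_shared)]
        # The remaining points are drawn independently for each data set.
        first = shared + [rng.randint(1, 20) for _ in range(n_total - n_shared)]
        second = shared + [rng.randint(1, 20) for _ in range(n_total - n_shared)]
        correlations.append(pearson_r(first, second))
    return sum(correlations) / len(correlations)


# The four scenarios from the text: 100, 75, 50 and 25 shared data points.
for n_shared in (100, 75, 50, 25):
    r = overlap_correlation(n_shared)
    print(f"overlap n/N = {n_shared / 200:.3f}, mean r = {r:.3f}")
```

With only 10 replications the simulated correlations fluctuate around n/N; increasing `replications` tightens the agreement.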
MACRO CONSULTING
Intelligent Decisions