Disagreement about iterations
The number of iterations in a bootstrap test refers to the number (n) of times we resample with replacement to calculate the bootstrap average for each individual. The number of iterations can range from 10 to 10,000 or more, but, generally speaking, a larger number of iterations produces more reliable results. Rosenfeld et al. (2017b) provided evidence that a relatively small number of iterations (e.g., 100) yields sufficiently accurate diagnoses when bootstrapping the P300 ERP. The authors concluded that a small number of iterations was effective because the P300 is a robust ERP with a large effect size. However, they also noted that this may not hold in experiments using other ERPs with smaller effect sizes (e.g., the N400). Rosenfeld et al. (2017b) therefore suggest that the appropriate number of iterations may vary across applications and may not be “one size fits all”.

Understandably, concerns have been raised about whether 100 iterations are truly sufficient for accurate diagnosis, even with the P300 (e.g., Zoumpalaki et al., 2015). Although the evidence presented in Rosenfeld et al. (2017b) was suggestive, large correlations between low- and high-iteration tests are not strong evidence of adequate precision in diagnostic tests.

In our view, the concern about a low-iteration bootstrap test centers on the reliability and repeatability of its classification results, and the most direct way to evaluate this is simply to repeat the test many times, producing many diagnostic results per individual. Fortunately, repeating a bootstrap-based test is straightforward and requires only an investment of time and computational resources. Moreover, we believe there are several statistical, methodological, and diagnostic advantages to repeating a bootstrap test many times, which is the focus of this paper.
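The repeat-the-test idea can be sketched in code. The following is a minimal illustration, not the exact procedure of Rosenfeld et al. (2017b): it assumes a simplified bootstrap test that resamples single-trial amplitudes, counts how often the resampled probe mean exceeds the resampled irrelevant mean, and classifies against a fixed criterion; the function names, the 0.9 criterion, and the simulated data are all our own hypothetical choices.

```python
import numpy as np

def bootstrap_test(probe, irrelevant, n_iterations=100, criterion=0.9, rng=None):
    """One bootstrap test: resample trials with replacement n_iterations times,
    count how often the resampled probe mean exceeds the resampled irrelevant
    mean, and classify as 'knowledgeable' if that proportion meets criterion.
    (Simplified stand-in for a P300-based bootstrap classification.)"""
    rng = np.random.default_rng() if rng is None else rng
    hits = 0
    for _ in range(n_iterations):
        p = rng.choice(probe, size=probe.size, replace=True).mean()
        i = rng.choice(irrelevant, size=irrelevant.size, replace=True).mean()
        hits += p > i
    return hits / n_iterations >= criterion

def repeated_diagnosis(probe, irrelevant, n_repeats=1000, **kwargs):
    """Repeat the whole bootstrap test many times and return the proportion of
    'knowledgeable' classifications -- a direct estimate of repeatability."""
    rng = np.random.default_rng(0)
    results = [bootstrap_test(probe, irrelevant, rng=rng, **kwargs)
               for _ in range(n_repeats)]
    return float(np.mean(results))

# Simulated single-trial amplitudes (hypothetical data, arbitrary units)
rng = np.random.default_rng(1)
probe = rng.normal(6.0, 4.0, size=30)       # probe trials, larger response
irrelevant = rng.normal(3.0, 4.0, size=30)  # irrelevant trials
print(repeated_diagnosis(probe, irrelevant, n_iterations=100))
```

A classification that flips across repeats of the test signals that the chosen number of iterations is too low for a stable diagnosis, whereas a proportion near 0 or 1 indicates a repeatable result.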