TECHNICAL NOTE
Year : 2011  Volume : 2  Issue : 1  Page : 52

An open-source software program for performing Bonferroni and related corrections for multiple comparisons

Kyle Lesack^{1}, Christopher Naugler^{2}
^{1} Faculty of Medicine, Bachelor of Health Sciences Program, Room G503, O'Brien Centre for the BHSc, 3330 Hospital Drive N.W. Calgary, Alberta T2N 4N1, 2, Canada
^{2} Departments of Pathology and Laboratory Medicine, University of Calgary and Calgary Laboratory Services, C414, Diagnostic and Scientific Centre, 9, 3535 Research Road NW, Calgary AB Canada T2L 2K8, Canada

Increased type I error resulting from multiple statistical comparisons remains a common problem in the scientific literature. This may result in the reporting and promulgation of spurious findings. One approach to this problem is to correct groups of P values for "familywide significance" using a Bonferroni correction or the less conservative Bonferroni-Holm correction, or to correct for the "false discovery rate" with a Benjamini-Hochberg correction. Although several commercially available software packages can perform these corrections, there are no widely available, easy-to-use open-source programs for these calculations. In this paper we present an open-source program written in Python 3.2 that performs the standard Bonferroni, Bonferroni-Holm, and Benjamini-Hochberg corrections.
Background

When multiple hypotheses are tested in a single experiment, the risk of type I error is increased and with it the risk of promulgating spurious "significant" findings. [1],[2],[3] The likelihood of obtaining a false positive result increases with the number of tests performed. For example, the probability of obtaining at least one false positive result when performing 10 tests is given by [INLINE:1], where P(A) is the confidence level of the test. Although the problems associated with multiple testing are well known, numerous studies still fail to correct their reported P values. For instance, Bennett et al. found that only between 60% and 74% of the neuroimaging articles published in several major journals corrected for multiple comparisons. [4] Similarly, a study performed by Austin et al. demonstrated that the failure to account for multiple testing resulted in statistically significant, yet implausible results. [5] In both cases the results were no longer significant after correcting for multiple testing.

The lack of attention paid to this problem in the pathology literature stands in stark contrast to its recognition in other fields such as ecology, where there has been intense interest for over two decades since the seminal publication by Rice. [6] That being said, even within the field of ecology this topic still engenders debate. [7] A systematic exploration of this problem in the pathology literature has not been undertaken; however, we have previously reported on a convenience sample of 800 publications from the pathology literature in 2003, of which 37 presented multiple comparisons. Twenty-one of these 37 did not attempt to control for increased type I error due to multiple comparisons. [8]

One means of reducing the type I error from multiple testing is the Bonferroni correction, which controls the familywise error rate (FWER). The FWER is the probability of type I error among the entire set of hypotheses.
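As a rough illustration of how quickly the FWER grows, the short sketch below computes the probability of at least one false positive among m tests. It assumes the tests are independent and that each is run at the same significance level alpha (one minus the confidence level); the function name `familywise_error_rate` and its parameters are illustrative only.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one type I error among m tests.

    Assumption: the m tests are independent and each uses the same
    significance level alpha, so the chance that none of them gives a
    false positive is (1 - alpha) ** m.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if m < 0:
        raise ValueError("m must be non-negative")
    return 1.0 - (1.0 - alpha) ** m


# Show how the uncorrected error rate grows with the number of tests.
for m in (1, 5, 10, 20):
    print(f"{m:2d} tests: FWER = {familywise_error_rate(0.05, m):.3f}")
```

Under these assumptions, 10 tests at alpha = 0.05 give a probability of roughly 0.40 of at least one false positive.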
The Bonferroni correction is calculated as follows: [INLINE:2] where n is the number of hypotheses tested. There is a lack of consensus as to what actually represents a "family" of statistical tests; however, it has been suggested that if it is appropriate to place multiple P values in the same table, it may be appropriate to correct all values in that table for multiple comparisons. [6]

Because the Bonferroni correction is conservative with regard to statistical power, other methods of correcting for multiple testing have been developed. Another method that controls the FWER is the Bonferroni-Holm correction. [9] The Bonferroni-Holm correction is calculated as follows: [INLINE:3] where n is the number of hypotheses tested, and k is the ordered rank of the uncorrected P values (from smallest P value to largest P value).

Rather than controlling the probability of one or more type I errors in the entire experiment, some of the more recent approaches to the multiple testing problem have focused on controlling the false discovery rate (FDR) in the experiment. Controlling the proportion of type I errors among the rejected hypotheses, rather than the probability of any type I error, further increases statistical power and is especially suitable when conducting numerous hypothesis tests. [10],[11] The Benjamini-Hochberg method [12] is a commonly used way to control the FDR of an experiment. It is calculated as follows: [INLINE:4] where n is the number of hypotheses tested, and k is the rank of the uncorrected P value.

Several commercial statistical software packages, and at least one open-source program (GNU R), are capable of performing one or more of these corrections; however, the cost of the commercial packages and the learning curves involved may discourage researchers from using these programs.
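The three corrections described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the source code of the program presented in this paper. It uses the widely used adjusted-P-value forms: P × n for Bonferroni, P × (n − k + 1) for Bonferroni-Holm, and P × n / k for Benjamini-Hochberg, each capped at 1. The step that keeps the adjusted values monotone with rank is a common refinement (it mirrors the behaviour of `p.adjust` in GNU R) and is an assumption here; all function names are illustrative.

```python
from typing import List, Sequence


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each P value by the number of tests n, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


def bonferroni_holm(p_values: Sequence[float]) -> List[float]:
    """Step-down correction: the P value of rank k is scaled by (n - k + 1)."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # smallest first
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        value = min(1.0, p_values[idx] * (n - rank + 1))
        # Assumed refinement: adjusted values never decrease with rank.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Step-up FDR correction: the P value of rank k is scaled by n / k."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # smallest first
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest P value down
        idx = order[rank - 1]
        value = min(1.0, p_values[idx] * n / rank)
        # Assumed refinement: adjusted values never increase as rank falls.
        running_min = min(running_min, value)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.005]  # made-up P values for illustration
    for correct in (bonferroni, bonferroni_holm, benjamini_hochberg):
        print(correct.__name__, [round(p, 4) for p in correct(example)])
```

Each function returns the corrected P values in the same order as the input, so a hypothesis is declared significant when its corrected value is at or below the chosen alpha level.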
Online tools are also available (e.g., http://www.quantitativeskills.com/sisa/calculations/bonfer.htm) but are limited in scope and available options, and they rely on continued access to the publisher's website.

"Bonferroni Calculator" software

Using the open-source programming language Python v 3.2, we developed a program capable of performing Bonferroni, Bonferroni-Holm, and Benjamini-Hochberg corrections for any number of P values. The user is prompted for a set of P values and the desired significance (alpha) level. From the main menu the user may choose to display the results of the desired correction on the screen, or to export the corrected P values to the hard disk (text and csv file types).

The source code is available free as a supplementary file to this article (which may serve as a literature reference for the program). A copy of the source code may also be obtained by email from the corresponding author. The program requires the free programming language Python 3.2, which runs on Microsoft Windows, Mac OS, and Linux/Unix operating systems. It may be downloaded from http://www.python.org/getit/releases/3.2/. The program is available for free by emailing the senior author at christopher.naugler@cls.ab.ca. Detailed instructions and a FAQ are available at https://sites.google.com/site/christophernaugler/.

To use the Bonferroni Calculator software, place the files "Bonferroni Calculator.py" and "Lesack and Naugler.txt" in a folder on your hard drive. In Windows, the program will run from the command line when the "Bonferroni Calculator.py" icon is double-clicked; however, the preferred method is to right-click on the icon and select "Edit with IDLE" from the drop-down list. Press F5 to run the software, and then maximize the size of the window. Follow the instructions on the screen. If the option to save the results to files is selected, the files will be found in the same folder as the "Bonferroni Calculator.py" icon.
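For readers who want to reproduce the csv export step in their own scripts, it can be approximated with Python's standard `csv` module. The sketch below is not taken from the program's source code: the function name `export_corrected`, the column headers, and the file layout are all hypothetical.

```python
import csv
from typing import Sequence


def export_corrected(path: str,
                     p_values: Sequence[float],
                     corrected: Sequence[float],
                     alpha: float) -> None:
    """Write raw and corrected P values to a csv file, one row per test.

    Hypothetical layout: the third column records whether the corrected
    P value is at or below the chosen alpha level.
    """
    if len(p_values) != len(corrected):
        raise ValueError("p_values and corrected must have the same length")
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["p_value", "corrected_p_value", "significant"])
        for raw, adjusted in zip(p_values, corrected):
            writer.writerow([raw, adjusted, adjusted <= alpha])
```

Calling `export_corrected("corrected_p_values.csv", [0.01, 0.04], [0.02, 0.08], 0.05)` would write a header row followed by one row per P value to the named file in the current working folder.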
The program is also available from the authors as a standalone executable file.

References


