Controlling the average false discovery in large-scale multiple testing

In this paper, we consider multiple testing procedures in which we simultaneously test a large number m of null hypotheses H_1, \ldots, H_m using the test statistics T_1, \ldots, T_m. The currently used procedure for controlling the false discovery rate (FDR) at a specified level requires that the statistics T_1, \ldots, T_m be either independently distributed or positively related. In practice, the T_i's are rarely independent, and it is not known how to ascertain the positive relationship between the
T_i's. In this paper, we propose to control the expected value of the Average False
Discovery (AFD) at some specified level. This AFD procedure controls its level
at the specified value regardless of how the T_i's are related. This specified value
can be chosen to control k-FWER or \gamma-FWER, and even FDR, at their respective
specified levels. Using simulation, we compare our proposed AFD procedure with
the FDR procedure. In terms of power and stability, the proposed AFD procedure
has an edge over the FDR procedure, as well as over the k-FWER procedure. Two
illustrative examples are given to compare the number of differentially expressed
genes obtained by the two methods.
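
For orientation, the FDR baseline referred to above can be sketched in a few lines. The abstract does not name the FDR procedure, so the sketch below assumes it is the standard Benjamini-Hochberg step-up procedure applied to p-values derived from T_1, \ldots, T_m. The function name benjamini_hochberg and the parameter q are illustrative choices, not notation from the paper, and the sketch does not implement the proposed AFD procedure.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR baseline).

    p_values: list of p-values, one per null hypothesis H_1, ..., H_m.
    q: target FDR level (illustrative name, not from the paper).
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank

    # Step-up rule: reject every hypothesis ranked at or below k_max,
    # even if some of them individually exceed their own threshold.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject
```

Calling benjamini_hochberg(p, q=0.05) on a list p of p-values returns the rejection decisions in the original order of the hypotheses. The guarantee that this rule controls FDR at level q relies on the independence or positive-dependence condition discussed above, which is the restriction the AFD procedure is intended to avoid.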