
**************************************************** REPRODUCIBILITY FILES FOR THE PAPER *****************************************************

"Pitfalls of performing hypothesis tests and model selection on bootstrap samples: causes and consequences in biometrical applications" (2014)



by S. Janitza*, H. Binder and A.-L. Boulesteix






**********************************************************************************************************************************************

* Corresponding author: Silke Janitza <janitza@ibe.med.uni-muenchen.de>







********* CONTENTS *********


  
  * folder 'NHANES_data': contains all R-code for reproducing results based on the NHANES data
  * folder 'Simulations': contains all R-code for reproducing results based on simulated data
  * folder 'R_Objects': contains the relevant R-objects that were produced for our analyses and that readers can reproduce
                        (each R-file ending in "_plot.R" loads objects contained in this folder)





********* NOTES BEFORE YOU START *********



R version 3.0.1 was used for all analyses.

Relevant R-objects are available in the folder 'R_Objects'. To use them, place them in the working directory and load them from there.

To reproduce the analyses, the user has to obtain the NHANES data from the website http://www.cdc.gov/nchs/nhanes.htm. The data are not needed if only the figures are to be created.

Objects that are created within an R file are stored in the current working directory.
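
The notes on the working directory can be sketched in R as follows. This is a minimal sketch and not part of the original files: the function name 'use_r_objects_dir' and the argument 'project_root' (the location where the files were unpacked) are hypothetical, and it assumes that the R files read and write objects relative to the current working directory.

```r
# Sketch only: 'project_root' is a hypothetical path to the unpacked files.
use_r_objects_dir <- function(project_root) {
  r_objects_dir <- file.path(project_root, "R_Objects")

  # file.exists() also returns TRUE for directories
  if (!file.exists(r_objects_dir)) {
    stop("Folder not found: ", r_objects_dir)
  }

  # Make 'R_Objects' the working directory so that objects are loaded
  # from (and newly created objects are stored in) this folder.
  setwd(r_objects_dir)
  invisible(getwd())
}

# Example call (the path is an assumption, adapt it to your system):
# use_r_objects_dir("path/to/reproducible_files")
```

Note that any R file run afterwards also stores its newly created objects in this folder, so keep a copy of the provided objects if they should not be overwritten.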






********* INSTRUCTIONS *********





REQUIRED R PACKAGES
-------------------

   

  R package mboost (version 2.2-3) is required for running R file 'tuning_parameter_selection_AIC_create_Robjects.R' in folder 'NHANES_data'

  R package snowfall (version 1.84-4) is required for running R file 'bootstrapped_p_values_create_Robjects.R' in folder 'NHANES_data'

  R packages snowfall (version 1.84-4) and mboost (version 2.2-3) are required for running R file 'tuning_parameter_selection_AIC_create_Robjects.R' in folder 'Simulations'
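
Before running the files, it may help to compare the installed package versions with the ones listed above. The following is a minimal sketch under stated assumptions: the names 'required' and 'check_versions' are hypothetical and not part of the original files, and the code only reports the status of each package without installing anything.

```r
# Versions as listed in this README
required <- c(mboost = "2.2-3", snowfall = "1.84-4")

# Hypothetical helper: report whether each package is installed and
# whether its version equals the version used for the analyses.
check_versions <- function(required) {
  installed <- rownames(installed.packages())
  vapply(names(required), function(pkg) {
    if (!(pkg %in% installed)) {
      return("not installed")
    }
    found <- packageVersion(pkg)
    if (found == package_version(required[[pkg]])) {
      "matches"
    } else {
      paste("differs:", as.character(found))
    }
  }, character(1))
}

print(check_versions(required))
```

A differing version does not necessarily prevent the files from running, but results may then deviate from those reported in the paper.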



SECTION 3
---------



  * for reproducing Figure 1 run file 'teststat_LR_test.R' in folder 'Simulations'






SECTION 4
---------



  4.1 Bootstrapping p-values

    
    * for reproducing Figures 2-6 or Table 2 run files 'bootstrapped_p_values_create_Robjects.R' and 'bootstrapped_p_values_plot.R' in folder 'NHANES_data'




  4.2 Bootstrapped p-values for Assessing the Variability of p-values


    * for reproducing Figures 7 and 8 run files 'p_value_variability_Z_test.R' and 'p_value_variability_LR_test.R' in folder 'Simulations'




  4.3 Bootstrapping Information Criteria


    * for reproducing Figures 9 and 10 run files 'bootstrapped_AIC_create_Robjects.R' and 'bootstrapped_AIC_plot.R' in folder 'NHANES_data'




  4.4 Application of Bootstrapped Information Criteria for Model Selection


    * for reproducing Figures 11 and 12 run files 'tuning_parameter_selection_AIC_create_Robjects.R' and 'tuning_parameter_selection_AIC_plot.R' in folder 'NHANES_data'

    * for reproducing Figure 13 run files 'tuning_parameter_selection_AIC_create_Robjects.R' and 'tuning_parameter_selection_AIC_plot.R' in folder 'Simulations'
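
Several steps in Section 4 pair a file ending in "_create_Robjects.R" with a file ending in "_plot.R". Running such a pair in order can be sketched as follows. This is a minimal sketch and not part of the original files: the helper 'run_pair' and its arguments are hypothetical, and it assumes that 'folder' is a valid path from the current working directory and that the objects are read from and written to that working directory.

```r
# Hypothetical helper: run '<stem>_create_Robjects.R' and then
# '<stem>_plot.R' from the given folder. With create = FALSE only the
# plot file is run, e.g. when the provided R-objects are used instead.
run_pair <- function(folder, stem, create = TRUE) {
  create_file <- file.path(folder, paste0(stem, "_create_Robjects.R"))
  plot_file <- file.path(folder, paste0(stem, "_plot.R"))
  to_run <- if (create) c(create_file, plot_file) else plot_file

  # Stop early if a file is missing, before anything is run
  missing_files <- to_run[!file.exists(to_run)]
  if (length(missing_files) > 0) {
    stop("Script not found: ", paste(missing_files, collapse = ", "))
  }

  for (f in to_run) {
    source(f)
  }
  invisible(to_run)
}

# Example calls (paths are assumptions, adapt them to your system):
# run_pair("NHANES_data", "bootstrapped_p_values")
# run_pair("NHANES_data", "bootstrapped_AIC", create = FALSE)
```

Setting 'create = FALSE' reflects the note that the objects in folder 'R_Objects' are sufficient for creating the figures.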



APPENDIX
--------



  * for reproducing Figure A1 run file 'teststat_Z_test.R' in folder 'Simulations'

  * for reproducing Figure A2 run files 'bootstrapped_AIC_create_Robjects.R' and 'bootstrapped_AIC_plot.R' in folder 'NHANES_data'

  * for reproducing results in Table A2 run files 'bootstrapped_p_values_create_Robjects.R' and 'bootstrapped_p_values_plot.R' in folder 'NHANES_data'

  * for reproducing results in Table A3 run files 'bootstrapped_AIC_create_Robjects.R' and 'bootstrapped_AIC_plot.R' in folder 'NHANES_data'
