Over-optimism in bioinformatics: an illustration

M. Jelizarow, V. Guillemot, A. Tenenhaus, K. Strimmer, A.-L. Boulesteix, submitted 2010

Contact: Monika Jelizarow, Anne-Laure Boulesteix

  • Additional File 1, which gives more detailed information on linear discriminant analysis and on the shrinkage estimator, including the choice of the covariance target and the analytical determination of the optimal shrinkage intensity (Schäfer and Strimmer, 2005), is available here.
  • Additional File 2, which provides an extensive overview of the error rates, sensitivities and specificities obtained for all methods discussed in this paper, is available here.
  • The ZIP file containing the pre-processed Singh (prostate cancer) and Wang (breast cancer) data sets used in this paper can be downloaded here. The raw data are available from GEO.
  • The ZIP file containing the R scripts (e.g. the R code performing the LDA) and the further functions needed to reproduce the results presented in the paper (separately for each data set) can be downloaded here. Note that each data-specific R script loads the necessary functions itself, so the user does not have to source them beforehand.
  • The version of the R package CMA used in the paper is available here: CMA_1.5.4.tar.gz (Linux/Unix), CMA_1.5.4.zip (Windows).
  • The version of the R package SHIP used in the paper is available here: SHIP_1.0.1.tar.gz (Linux/Unix), SHIP_1.0.1.zip (Windows).