IBE - Institut für medizinische Informationsverarbeitung, Biometrie und Epidemiologie



Prof. Dr. HDR Anne-Laure Boulesteix                       

Associate professor


   Marchioninistr. 15, 81377 Munich (Germany)
   Tel: +49 89 4400-77598
   Fax: +49 89 4400-77491
   E-Mail: boulesteix@ibe.med.uni-muenchen.de




2011: French "Habilitation à diriger les recherches", University of Evry Val d'Essonne (near Paris, France)

2005: PhD in statistics at the University of Munich (Germany)

2001: Diploma in Mathematics at the University of Stuttgart

2001: M.Sc. in Engineering at the Ecole Centrale Paris (diplôme d'ingénieur)

1979: Born in Paris (France)


Since May 2012: Associate professor at the Department of Medical Informatics, Biometry and Epidemiology of the Faculty of Medicine, University of Munich (Germany)

2009-2012: Assistant professor at the Department of Medical Informatics, Biometry and Epidemiology of the Faculty of Medicine, University of Munich (Germany)

2008: Visiting professor of biostatistics at the Department of Statistics, University of Munich (Germany)

2007-2008: Statistician at the Sylvia Lawry Centre for Multiple Sclerosis Research e.V. in Munich (Germany)

2005-2007: Post-doc at the University Hospital „Klinikum Rechts der Isar“ of the Technical University of Munich (Germany)

2002-2006: Researcher at the Department of Statistics, University of Munich (Germany)


Statistics in bioinformatics and biomedicine including but not limited to:

  • Statistical consulting in medical research
  • Validation of research results
  • Publication bias, false research findings, over-optimism
  • Model selection for multivariable regression
  • Regularization techniques for high-dimensional "omics" data
  • Added predictive value, relative importance of predictors
  • Stability of statistical analyses, resampling techniques
  • Recursive partitioning, random forests
  • Dimension reduction (especially Partial Least Squares)
  • Cross-validation, prediction error estimation

Editorial activities


Submitted methodological papers/technical reports - statistics (5)

  1. M. Fuchs, X. Jiang, A.-L. Boulesteix, 2016. The computationally optimal test set size in simulation studies on supervised learning. Technical Report 189, Department of Statistics, LMU.
  2. H. Seibold, C. Bernau, A.-L. Boulesteix, R. De Bin, 2015. On the choice and influence of the number of boosting steps. Technical Report 188, Department of Statistics, LMU.
  3. A.-L. Boulesteix, R. De Bin, X. Jiang, M. Fuchs, 2015. IPF-LASSO: integrative L1-penalized regression with penalty factors for prediction based on multi-omics data. Technical Report 187, Department of Statistics, LMU.
  4. S. Janitza, E. Celik, A.-L. Boulesteix, 2015. A computationally fast variable importance test for random forests for high-dimensional data. Technical Report 185, Department of Statistics, LMU.
  5. M. Fuchs, R. Hornung, R. De Bin, A.-L. Boulesteix, 2013. A U-statistic estimator for the variance of resampling-based error estimators. Technical Report 148, Department of Statistics, LMU. In revision.

Refereed articles (84: in international journals: 79, in international proceedings: 5)

  1. M.E. Dolch, S. Janitza, A.-L. Boulesteix, C. Grassmann-Lichtenauer, S. Praun, W. Denzer, G. Schelling, S. Schubert, 2016. Gram-negative and -positive bacteria differentiation in blood culture samples by headspace volatile compound analysis. Journal of Biological Research - Thessaloniki 23:3.
  2. B. Müller, A. Wilcke, A.-L. Boulesteix, J. Brauer, E. Passarge, J. Boltze, H. Kirsten, 2016. Improved prediction of complex diseases by common genetic markers: state of the art and further perspectives. Human Genetics doi: 10.1007/s00439-016-1636-z.
  3. R. Hornung, A.-L. Boulesteix, D. Causeur, 2016. Combining location-and-scale batch effect adjustment with data cleaning by latent factor adjustment. BMC Bioinformatics 17:27.
  4. A.-L. Boulesteix, R. Hornung, W. Sauerbrei, 2016. On fishing for significance and statistician’s degree of freedom in the era of big molecular data (accepted).
  5. A.-L. Boulesteix, 2016. Which resampling-based error estimator for benchmark studies? A power analysis with application to PLS-LDA. In: "The Multiple Facets of Partial Least Squares Methods", H. Abdi, V. Esposito Vinzi, G. Russolillo, G. Saporta, L. Trinchera (Eds.). Springer. Companion Website.
  6. G. Casalicchio, B. Bischl, A.-L. Boulesteix, M. Schmid, 2016. The residual-based predictiveness curve – A visual tool to assess the performance of prediction models. Biometrics DOI: 10.1111/biom.12455. Technical Report 178, Department of Statistics, LMU.
  7. S. Janitza, G. Tutz, A.-L. Boulesteix, 2016. Random forest for ordinal response data: prediction and variable selection. Computational Statistics & Data Analysis DOI: 10.1016/j.csda.2015.10.005. Technical Report 174, Department of Statistics, LMU.
  8. S. Rospleszcz*, S. Janitza*, A.-L. Boulesteix, 2016. Categorical variables with many categories are preferentially selected in model selection procedures for multivariable regression models on bootstrap samples. Biometrical Journal 58:652-673. *both authors contributed equally to the paper.
  9. S. Janitza, H. Binder, A.-L. Boulesteix, 2016. Pitfalls of hypothesis tests and model selection on bootstrap samples: causes and consequences in biometrical applications. Biometrical Journal 58:447-473.
  10. R. De Bin, S. Janitza, W. Sauerbrei, A.-L. Boulesteix, 2016. Subsampling versus bootstrapping in resampling-based model selection for multivariable regression. Biometrics 72:272-280.
  11. R. Hornung, C. Bernau, C. Truntzer, T. Stadler, A.-L. Boulesteix, 2015. Full versus incomplete cross-validation: measuring the impact of imperfect separation between training and test sets in prediction error estimation. BMC Medical Research Methodology 15:95.
  12. A.-L. Boulesteix, V. Stierle, A. Hapfelmeier, 2015. Publication bias in methodological computational research. Cancer Informatics Suppl. 5:11-19.
  13. A.-L. Boulesteix, R. Hable, S. Lauer, M. Eugster, 2015. A statistical framework for hypothesis testing in real data comparison studies. The American Statistician 69:201-212. Previous version available at: Technical Report 136, Department of Statistics, LMU.
  14. W. Sauerbrei, A. Buchholz, A.-L. Boulesteix, H. Binder, 2015. On stability issues in deriving multivariable regression models. Biometrical Journal 57:531-555.
  15. A.-L. Boulesteix, 2015. Ten simple rules for reducing overoptimistic reporting in methodological computational research. PLoS Computational Biology 11(4): e1004191.
  16. B.M. Böll, F. Vogt, A.-L. Boulesteix, C. Schmitz, 2015. Gender mismatch in allograft aortic valve surgery. Interactive Cardiovascular and Thoracic Surgery pii: ivv151 (Epub ahead of print)
  17. D. Schwilling, M. Vogeser, F. Kirchhoff, F. Schwaiblmair, A.-L. Boulesteix, A. Schulze, A.W. Flemmer, 2015. Live music reduces stress levels in very low-birthweight infants. Acta Paediatrica 104:360-367.
  18. A.-L. Boulesteix, 2015. On reviews and papers on new methods (letter to the editors). Briefings in Bioinformatics 16:365-366.
  19. A.-L. Boulesteix, S. Janitza, A. Hapfelmeier, K. van Steen, C. Strobl, 2015. On the term "interaction" and related phrases in the literature on random forests. Briefings in Bioinformatics 16:338-345. (open-access pdf).
  20. R. De Bin, T. Herold, A.-L. Boulesteix, 2014. Added predictive value of omics data: specific issues related to validation illustrated by two case studies. BMC Medical Research Methodology 14:117.
  21. R. De Bin, W. Sauerbrei, A.-L. Boulesteix, 2014. Investigating the prediction ability of survival models based on both clinical and omics data: two case studies. Statistics in Medicine 33:5310-5329. Technical Report 153, Department of Statistics, LMU.
  22. A.-L. Boulesteix, M. Schmid, 2014. Discussion: Machine learning versus statistical modelling. Biometrical Journal 56:588-593.
  23. C. Bernau, M. Riester, A.-L. Boulesteix, G. Parmigiani, C. Huttenhower, L. Waldron, L. Trippa, 2014. Cross-study validation for the assessment of prediction algorithms. Bioinformatics 30:i105-i112.
  24. M. Kebschull, P. Guarnieri, R.T. Demmer, A.-L. Boulesteix, P. Pavlidis, P.N. Papapanou, 2013. Molecular differences between chronic and aggressive periodontitis. Journal of Dental Research 92:1081-1088.
  25. A.-L. Boulesteix, 2013. On representative and illustrative comparisons with real data in bioinformatics: comment on the letter to the editor by Smith et al. Bioinformatics 29:2664-6.
  26. C. Bernau, T. Augustin, A.-L. Boulesteix, 2013. Correcting the optimally selected resampling-based error rate: A smooth analytical alternative to nested cross-validation. pdf of an older version (technical report). Biometrics 69:693-702. Companion Website.
  27. S. Pildner von Steinburg, A.-L. Boulesteix, C. Lederer, S. Grunow, S. Schiermeier, W. Hatzmann, K.-T. M. Schneider, M. Daumer, 2013. What is the “normal” fetal heart rate? PeerJ 1:e82.
  28. A.-L. Boulesteix, A. Richter, C. Bernau, 2013. Complexity selection with cross-validation for lasso and sparse partial least squares using high-dimensional data. In: Algorithms from and for Nature and Life. Springer, pp. 261–268. Companion Website.
  29. A.-L. Boulesteix*, S. Lauer*, M. Eugster, 2013. A plea for neutral comparison studies in computational sciences. PLoS One 8(4):e61562. * both authors contributed equally to this work. Companion Website.
  30. V. Guillemot, A. Bender, A.-L. Boulesteix, 2013. Iterative reconstruction of high-dimensional Gaussian graphical models based on a new method to estimate partial correlations under constraints. PLoS One 8(4): e60536. Companion Website.
  31. S. Janitza, C. Strobl, A.-L. Boulesteix, 2013. An AUC-based Permutation Variable Importance Measure for Random Forests. BMC Bioinformatics 14:119. (Highly accessed.)
  32. F. Vogt, B.M. Böll, A.-L. Boulesteix, E. Kilian, G. Santarpino, B. Reichart, C. Schmitz, 2013. Homografts in aortic position: does blood group incompatibility have an impact on patient outcomes? Interactive Cardiovascular and Thoracic Surgery 16:619-24.
  33. M. Oelker, A.-L. Boulesteix, 2013. On the simultaneous analysis of clinical and omics data - a comparison of globalboosttest and pre-validation techniques. pdf (technical report). Proceedings of the 8th Scientific Meeting of the Classification and Data Analysis Group of the Italian Statistical Society, 259-268. Eds.: Giudici, P., Ingrassia, S., Vichi, M. ISBN 978-3-319-00031-2. Companion Website.
  34. C. Hornuss, A. Zagler, M.E. Dolch, D. Wiepcke, S. Praun, A.-L. Boulesteix, F. Weis, C. C. Apfel, G. Schelling, 2012. Breath isoprene concentrations in persons undergoing general anesthesia and in healthy volunteers. Journal of Breath Research 6(4):046004.
  35. A.-L. Boulesteix, S. Janitza, J. Kruppa, I. König, 2012. Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. pdf (technical report). Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2:497-503.
  36. C. Bernau, A.-L. Boulesteix, J. Knaus, 2012. Application of microarray analysis on computer cluster and cloud platforms. Methods of Information in Medicine 52:65-71.
  37. A.-L. Boulesteix, A. Bender, J. Lorenzo Bermejo, C. Strobl, 2012. Random forest Gini importance favors SNPs with large minor allele frequency. Technical Report 106, Department of Statistics, LMU. Briefings in Bioinformatics 13: 292-304. Companion Website.
  38. V. Guillemot, M. Jelizarow, A. Tenenhaus, A.-L. Boulesteix, 2011. SHrinkage covariance estimation incorporating prior biological knowledge with applications to high-dimensional data. Technical Report 107, Department of Statistics, LMU. Proceedings of the 58th ISI World Statistics Congress. Companion Website.
  39. W. Sauerbrei, A.-L. Boulesteix, H. Binder, 2011. Stability investigations of multivariable regression models derived from low and high dimensional data. Journal of Biopharmaceutical Statistics 21:1206-1231.
  40. A.C. Pickhard, J. Margraf, A. Knopf, T. Stark, G. Piontek, C. Beck, A.-L. Boulesteix, E.Q. Scherer, S. Pigorsch, J. Schlegel, W. Arnold, R. Reiter, 2011. Inhibition of Radiation Induced Migration of Human Head and Neck Squamous Cell Carcinoma by Blocking EGF Receptor Pathways. BMC Cancer 11:388.
  41. M. Calle, V. Urrea, A.-L. Boulesteix, N. Malats, 2011. AUC-RF: A new strategy for genomic profiling with Random Forest. Human Heredity 72:121-132.
  42. T. Herold, V. Jurinovic, K.H. Metzeler, A.-L. Boulesteix, M. Bergmann, T. Seiler, M. Mulaw, S. Thoene, A. Dufour, Z. Pasalic, M. Schmidberger, M. Schmidt, S. Schneider, P.M. Kakadia, M. Feuring-Buske, J. Braess, K. Spiekermann, U. Mansmann, W. Hiddemann, C. Buske, S.K. Bohlander, 2011. An eight-gene expression signature for the prediction of survival and time to treatment in chronic lymphocytic leukemia. Leukemia 25:1639-1645.
  43. A.-L. Boulesteix, V. Guillemot, W. Sauerbrei, 2011. Use of pre-transformation to cope with extreme values in important candidate features. Biometrical Journal 53:673–688. Companion Website.
  44. A.-L. Boulesteix, 2011. Editorial. Briefings in Bioinformatics 12:187-188.
  45. A.-L. Boulesteix, W. Sauerbrei, 2011. Added predictive value of high-throughput molecular data to clinical data, and its validation. Briefings in Bioinformatics 12:215-229.
  46. M. Jelizarow, V. Guillemot, A. Tenenhaus, K. Strimmer, A.-L. Boulesteix, 2010. Over-optimism in bioinformatics: an illustration. pdf. Bioinformatics 26:1990-1998. Companion Website.
  47. C. Bernau, A.-L. Boulesteix, 2010. Variable selection and parameter tuning in high-dimensional prediction. Technical Report 76, Department of Statistics, LMU. COMPSTAT 2010.
  48. A.-L. Boulesteix, T. Hothorn, 2010. Testing the additional predictive value of high-dimensional data. BMC Bioinformatics 11:78. (Highly accessed.)
  49. S. Eifert, H. Mair, A.-L. Boulesteix, E. Kilian, M. Adamczak, B. Reichart, P. Lamm, 2010. Mid term outcomes of patients with PCI prior to CABG in comparison to patients with primary CABG. Vascular Health and Risk Management 6:495-501.
  50. A.-L. Boulesteix, 2010. Over-optimism in bioinformatics research. pdf. Bioinformatics 26:437-439.
  51. D. Doll, L. Keller, M. Maak, A.-L. Boulesteix, J. Siewert, B. Holzmann, K.-P. Janssen, 2010. Differential expression of the chemokines GRO-2, GRO-3 and Interleukin-8 in colon cancer and their impact on metastatic disease and survival. International Journal of Colorectal Disease 25:573-581.
  52. A.-L. Boulesteix, C. Strobl, 2009. Optimal classifier selection and negative bias in error rate estimation: An empirical study on high-dimensional prediction. BMC Medical Research Methodology 9:85. Companion Website.
  53. N. Krämer, J. Schäfer, A.-L. Boulesteix, 2009. Regularized estimation of large-scale gene association networks using graphical Gaussian models. BMC Bioinformatics 10:384.
  54. A.-L. Boulesteix, M. Slawski, 2009. Stability and aggregation of ranked gene lists. pdf. Briefings in Bioinformatics 10:556-568.
  55. W. van Wieringen, D. Kun, R. Hampel, A.-L. Boulesteix, 2009. Survival prediction using gene expression data: a review and comparison. pdf. Computational Statistics and Data Analysis 53:1590-1603.
  56. M. Slawski, M. Daumer, A.-L. Boulesteix, 2008. CMA - A comprehensive Bioconductor package for supervised classification with high dimensional data. pdf. BMC Bioinformatics 9:439. (Highly accessed.)
  57. A.-L. Boulesteix, C. Porzelius, M. Daumer, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. pdf. reprint. Bioinformatics 24:1698-1706.
  58. C. Strobl, A.-L. Boulesteix, T. Kneib, T. Augustin, A. Zeileis, 2008. Conditional variable importance for random forests. pdf. BMC Bioinformatics 9:307. (Highly accessed.)
  59. N. Krämer, A.-L. Boulesteix and G. Tutz, 2008. Penalized partial least squares with applications to B-splines and functional data. pdf. Chemometrics and Intelligent Laboratory Systems 94:60-69.
  60. R. Reiter, P. Gais, M.K. Steuer-Vogt, A.-L. Boulesteix, T. Deutschle, R. Hampel, S. Wagenpfeil, S. Rausen, A. Walch, K. Bink, U. Jutting, F. Neff, W. Arnold, H. Hofler, A. Pickhard, 2009. Centrosome abnormalities in head and neck squamous cell carcinoma (HNSCC). Acta Otolaryngologica 129:205-213.
  61. A.-L. Boulesteix, A. Kondylis, N. Krämer, 2008. Invited comment on: "Augmenting the bootstrap to analyze high dimensional genomic data" by Tyekucheva and Chiaromonte. pdf. TEST 17:31-35.
  62. T. Wichard, S. Poulet, A.-L. Boulesteix, J.-B. Ledoux, B. Lebreton, J. Marchetti, G. Pohnert, 2008. Influence of diatoms on copepod reproduction. II. Uncorrelated effects of diatom-derived alpha, beta, gamma, delta-unsaturated aldehydes and polyunsaturated fatty acids on Calanus helgolandicus in the field. Progress in Oceanography 77:30-44.
  63. A.-L. Boulesteix, C. Strobl, T. Augustin, M. Daumer, 2008. Evaluating microarray-based classifiers: an overview. pdf. Cancer Informatics 6:77-97.
  64. C. Rimkus, J. Friederichs, A.-L. Boulesteix, J. Mages, K. Becker, H. Nekarda, R. Rosenberg, K.P. Janssen, J.R. Siewert, 2008. Microarray-based prediction of tumor response to neoadjuvant radiochemotherapy of patients with locally advanced rectal cancer. Clinical Gastroenterology and Hepatology 6:53-61.
  65. D. Doll, J. Friederichs, A.-L. Boulesteix, W. Düsel, F. Fend, S. Petersen, 2008. Surgery for asymptomatic pilonidal sinus disease. International Journal of Colorectal Disease 23:839-844.
  66. D. Doll, J. Friederichs, H. Dettmann, A.-L. Boulesteix, W. Düsel, S. Petersen, 2008. Time and rate of sinus formation in pilonidal sinus disease. International Journal of Colorectal Disease 23:359-364.
  67. D. Doll, A. Novotny, R. Rothe, K. Wietelmann, A.-L. Boulesteix, W. Düsel, S. Petersen, 2008. Methylene Blue halves the long term recurrence rate in acute pilonidal sinus disease. International Journal of Colorectal Disease 23:181-187.
  68. W. Dietrich, R. Busley, A.-L. Boulesteix, 2008. Effects of aprotinin dosage on renal function: an analysis of 8,548 cardiac surgical patients treated with different dosages of aprotinin. Anesthesiology 108:189-198.
  69. A.-L. Boulesteix, C. Strobl, S. Weidinger, H.E. Wichmann, S. Wagenpfeil, 2007. Multiple testing for SNP-SNP interactions. pdf. Statistical Applications in Genetics and Molecular Biology 6(1):37.
  70. A.-L. Boulesteix and C. Strobl, 2007. Maximally selected chi-square statistics and non-monotonic associations: an exact approach based on two cutpoints. pdf (preprint), Computational Statistics and Data Analysis 51(12):6295-6306.
  71. C. Strobl, A.-L. Boulesteix and T. Augustin, 2007. Unbiased split selection for classification trees based on the Gini index. pdf. Computational Statistics and Data Analysis 52:483-501.
  72. A.-L. Boulesteix and K. Strimmer, 2007. Partial Least Squares: A versatile tool for the analysis of high-dimensional genomic data. pdf. Briefings in Bioinformatics 8(1):32-44.
  73. A.-L. Boulesteix, 2007. WilcoxCV: An R package for fast variable selection in cross-validation. pdf (preprint), Bioinformatics 23: 1702-1704.
  74. W. Dietrich, A. Ebell, R. Busley, A.-L. Boulesteix, 2007. Aprotinin and Anaphylaxis - Analysis of 12403 Exposures to Aprotinin in Cardiac Surgery. The Annals of Thoracic Surgery 84:1144-1150.
  75. R. Napieralski, K. Ott, M. Kremer, K. Becker, A.-L. Boulesteix, F. Lordick, J.R. Siewert, H. Höfler, G. Keller, 2007. Methylation of Tumor-Related Genes in Neoadjuvant-Treated Gastric Cancer: Relation to Therapy Response and Clinicopathologic and Molecular Features. Clinical Cancer Research 13(17):5095-5102.
  76. C. Strobl, A.-L. Boulesteix, A. Zeileis and T. Hothorn, 2007. Bias in random forest variable importance measures: Illustrations, sources and a solution. pdf. BMC Bioinformatics 8:25. (Highly accessed.)
  77. A.-L. Boulesteix, 2006. Reader's reaction to 'Dimension reduction for classification with gene expression microarray data' by Dai et al (2006), Statistical Applications in Genetics and Molecular Biology 5(1), Article 16, pdf.
  78. A.-L. Boulesteix, 2006. Maximally selected chi-square statistics and binary splits of nominal variables, pdf, Biometrical Journal 48:838-848.
  79. A.-L. Boulesteix, 2006. Maximally selected chi-square statistics for ordinal variables, pdf, Biometrical Journal 48:451-462.
  80. A.-L. Boulesteix and G. Tutz, 2006. Identification of Interaction Patterns and Classification with Applications to Microarray Data, pdf (preprint), Computational Statistics and Data Analysis 50: 783-802.
  81. A.-L. Boulesteix and K. Strimmer, 2005. Predicting Transcription Factor Activities from Combined Analysis of Microarray and ChIP Data: A Partial Least Squares Approach, Theoretical Biology and Medical Modelling 2:23, pdf.
  82. S. Hiendleder, S. Bauersachs, A.-L. Boulesteix, H. Blum, G. Arnold, T. Fröhlich and E. Wolf, 2005. Functional genomics: tools for improving farm animal health and welfare, Rev Sci Tech. 24(1): 354-377, "Biotechnology and animal health", pdf.
  83. A.-L. Boulesteix, 2004. PLS Dimension Reduction for Classification with Microarray Data. pdf (preprint). Statistical Applications in Genetics and Molecular Biology 3(1), Article 33.
  84. A.-L. Boulesteix, G. Tutz and K. Strimmer, 2003. A CART-based approach to discover emerging patterns in microarray data. Bioinformatics 19: 2465-2472, pdf.

Other publications

  1. S. Pildner von Steinburg, D. Chronas, A.-L. Boulesteix, N. Lack, K.T.M. Schneider, 2007. Risikofaktoren für Frühgeburtlichkeit - eine multivariate Analyse der bayrischen Perinataldaten aus 10 Jahren. Zeitschrift für Geburtshilfe und Neonatologie:211.
  2. M. Daumer, M. Scholz, A.-L. Boulesteix, S. Pildner von Steinburg, S. Schiermeier, W. Hatzmann, K.T.M. Schneider, 2007. The normal fetal heart rate study: Analysis plan. Nature Precedings, doi:10.1038/npre.2007.980.1.
  3. N. Henningsen, K. Ott, K. Becker, A.-L. Boulesteix, F. Lordick, J.R. Siewert, H. Hofler, G. Keller, 2007. Polymorphisms in the nucleotide excision repair genes ERCC1 and ERCC2 and association with response and prognosis in neoadjuvant treated gastric cancer patients. Pathology Research and Practice 203:254-259.
  4. A.-L. Boulesteix, V. Hösel and V. Liebscher, 2007. Stochastic Modeling for the COMET-assay. Journal of Concrete and Applicable Mathematics 5(1):53-75.
  5. A.-L. Boulesteix, 2005. A note on between-group PCA, International Journal of Pure and Applied Mathematics 19: 359-366.


  1. A.-L. Boulesteix, T. Hothorn, globalboosttest R package. Testing the additional predictive value of high-dimensional data.
  2. A.-L. Boulesteix, MAclinical R package. Class prediction based on microarray data and clinical parameters.
  3. M. Slawski and A.-L. Boulesteix, The GeneSelector Bioconductor package. The term 'GeneSelector' refers to a filter selecting those genes which are consistently identified as differentially expressed using various statistical procedures. 'Selected' genes are those present at the top of the list in various featured ranking methods (currently 15). In addition, the stability of the findings can be taken into account in the final ranking by examining perturbed versions of the original data set, e.g. by leaving out samples, swapping class labels, generating bootstrap replicates or adding noise.
  4. M. Slawski, C. Bernau and A.-L. Boulesteix, The CMA Bioconductor package. This package provides a comprehensive collection of various microarray-based classification algorithms both from machine learning and statistics. Variable selection, hyperparameter tuning, evaluation and comparison can be performed in a combined manner or stepwise in a user-friendly environment.
  5. N. Kraemer and A.-L. Boulesteix, ppls R package. Penalized partial least squares.
  6. A.-L. Boulesteix, WilcoxCV R package. Wilcoxon-based variable selection in cross-validation.
  7. A.-L. Boulesteix, SNPmaxsel R package. Maximally selected statistics for SNP data.
  8. A.-L. Boulesteix, exactmaxsel R package. Maximally selected statistics for binary response variables - Exact methods.
  9. A.-L. Boulesteix, Sophie Lambert-Lacroix, Julie Peyre and Korbinian Strimmer, plsgenomics R package. PLS analyses for genomics.


A.-L. Boulesteix, 2011. Habilitation, Université Evry-Val d'Essonne, France.

A.-L. Boulesteix, 2005. PhD thesis: Dimension Reduction and Classification with high-dimensional microarray data, pdf.

A.-L. Boulesteix, 2001. Mathematische Modellierung und Statistik für das COMET-assay (Master thesis at the University of Stuttgart, Department of Mathematics A)



  • 2014: "Comparison of the prediction accuracy of two alternative random forest variants in the case of unbalanced response classes" (bachelor thesis in statistics, supervised by Silke Janitza)
  • 2014: Bachelor thesis in statistics.
  • 2014: Bachelor thesis in statistics.
  • 2014: Bachelor thesis in statistics.
  • 2014: "Investigation of the correctness of statistical analyses in medical journals: the example of statistical tests of independence for contingency tables" (bachelor thesis in statistics)
  • 2014: "Joint analysis of different types of high-dimensional molecular data from leukemia patients" (master thesis in statistics for economics and social science, cooperation with Tobias Herold)
  • 2014: "Evolution of statistical analyses in sport physical therapy studies from 2006 to 2013: systematic review of original research from the Journal of Sports Physical Therapy" (bachelor thesis in statistics)
  • 2014: "Comparison of statistical methods for the analysis of benchmark experiments in relation to datasets' characteristics" (master thesis in statistics for economics and social science).
  • 2014: "Efficient computation of unconditional error rate estimators for learning algorithms and an application to a biomedical data set" (master thesis in statistics, supervised by Mathias Fuchs).
  • 2014: "Pre-validation for assessing the added predictive value of high-dimensional molecular data in binary classification" (master thesis in biostatistics, co-supervised by Riccardo De Bin).
  • 2014: "A literature overview on data preparation and tuning procedures for lasso and PLS in scientific articles" (bachelor thesis in statistics, supervised by Roman Hornung).
  • 2014: "Influential observations and resampling analyses: an empirical study with high-dimensional survival data" (bachelor thesis in statistics, supervised by Riccardo De Bin)
  • 2014: "Influence of the number of folds and iterations in benchmarking with cross-validation: example of classification with high-dimensional gene expression data" (bachelor thesis in statistics)
  • 2014: "Random Forest and interactions: an example of publication bias?" (master thesis in statistics)
  • 2013: "Impacts of centering and scaling of the predictors on the misclassification rate in the Lasso model" (bachelor thesis in statistics, supervised by Roman Hornung).
  • 2013: "Comparison of different procedures for multiple testing in the analysis of volatile organic compounds from different Gram-negative and -positive bacteria and fungi" (bachelor thesis in statistics, cooperation with Michael Dolch)
  • 2013: "Comparison of different procedures for the construction of prediction rules in the analysis of volatile organic compounds from different Gram-negative and -positive bacteria and fungi" (bachelor thesis in statistics, cooperation with Michael Dolch)
  • 2013 (ongoing): "Random Forest and interactions: an example of publication bias?" (master thesis in statistics)
  • 2013: "Real data examples in biostatistics and bioinformatics journals: a survey" (bachelor thesis in statistics)
  • 2013: "The effects of bootstrapping on model selection for multivariate regression in epidemiology" (master thesis in epidemiology, co-supervised by Silke Janitza)
  • 2013: "Maximally selected chi-2 statistics for the analysis of zero-inflated variables" (bachelor thesis in statistics)
  • 2013: "A regression model for the analysis of the effects of datasets' characteristics on the relative accuracy of prediction algorithms with application to 50 microarray datasets" (bachelor thesis in statistics)
  • 2013: "Comparison of integrated prediction error curves and C-index as error measures in permutation variable importance and for model selection with random survival forests" (diploma thesis in statistics, co-supervised by Roman Hornung)
  • 2012: "On the power of resampling-based comparisons of classifiers with real data" (master thesis in biostatistics)
  • 2012: "An AUC-based permutation variable importance measure for random forests for unbalanced data" (master thesis in epidemiology)
  • 2012: "A random forest approach for the identification of predictors of the daily number of births" (master thesis in biostatistics)
  • 2012: "Validation of the added predictive value of high-dimensional data" (master thesis in statistics)
  • 2012: "Classification using between-class differences of correlation structure" (master thesis in biostatistics)
  • 2011: "Resampling-based variable and object influence in cluster analysis" (master thesis in statistics)
  • 2011:  "Einfluss von Feiertagen auf Geburtenhäufigkeit: Vergleich von verschiedenen statistischen Verfahren" (bachelor thesis in statistics)
  • 2011: "Comparing clusterings of features in two groups of patients" (bachelor thesis in statistics)
  • 2011: "Complexity selection in sparse Partial Least Squares for high dimensional data" (master thesis in biostatistics, co-supervised by Christoph Bernau)
  • 2010: "Variable Importance in Random Forest und Random Jungle mit Anwendung auf SNP Daten" (bachelor thesis in statistics)
  • 2009:  "Regularized discriminant analysis incorporating prior knowledge on gene functional groups" (diploma thesis in statistics)
  • 2009: "Variable selection and parameter tuning in high-dimensional prediction" (diploma thesis in statistics)
  • 2007: "Microarray-based prediction with emphasis on clinical parameters" (diploma thesis in statistics) 
  • 2007: "Instability of methods for the analysis of differential gene expression" (diploma thesis in statistics)  
  • 2007: "Survival analysis with microarray data" (diploma thesis in statistics)  
  • 2006: "Statistical modelling of gene-gene interactions in case-control association studies" (master thesis in statistics)  
  • 2005: "Variable selection bias in classification trees" (master thesis in statistics)


  • Silke Janitza, 2011: Random forests and SNP-SNP interactions (internship epidemiology)
  • Anne Gasnier, 2008: "Modelling and statistical analysis of fetal heart decelerations" (student research project in cooperation with Ecole Centrale Paris, applied mathematics)
  • Martin Slawski, 2007-2008: "GeneSelector and CMA packages" (internship)

