Books

Probably the most valuable tip when learning something new is the answer to “what to read?” Here is a (partial) list of the books I found most valuable in my research.

  1. Cox, D. R. Principles of Statistical Inference. A high-level, almost philosophical text on the basic ideas in statistics. Possibly the most important book in this list, but make sure you read it last.

  2. Wasserman, Larry. All of Statistics: A Concise Course in Statistical Inference. A one-stop shop for almost all of the fundamental ideas in statistics.

  3. van der Vaart, A. W. Asymptotic Statistics. Much, if not most, of classical statistics is asymptotic. This book can serve both as an introduction and as a technical reference for the vast realm of asymptotic statistics.

  4. Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning. A one-stop shop for machine learning, cast in statistical terminology.

  5. Hodges, J.S. Richly Parameterized Linear Models: Additive, Time Series, and Spatial Models Using Random Effects. Links random effects, Bayesian statistics, geostatistics, and time series. I wish someone had told me of this book sooner!

  6. Tsybakov, A. B. Introduction to Nonparametric Estimation. My favorite reference on nonparametric statistics.

  7. Greene, William H. Econometric Analysis. My favorite reference on the analysis of linear models. The economic orientation adds very useful applied motivation, and an emphasis on causal inference rarely seen in the statistical literature.

  8. Grimmett, Geoffrey R., and David R. Stirzaker. Probability and Random Processes. An excellent introductory textbook on probability and stochastic processes.

  9. Guttorp, Peter, and Vladimir N. Minin. Stochastic Modeling of Scientific Data. Excellent reference on modeling strategies and stochastic processes.

  10. Robert M. Gray. Entropy and Information Theory. As the name suggests, your one-stop shop for information theory. I like it better than Cover’s book.

  11. Wilcox, Rand R. Introduction to Robust Estimation and Hypothesis Testing. A very well written exposition of the vast world of robust statistics. A must-read for anyone who works with actual data, and not only theory.

  12. Robert E. Weiss. Modelling Longitudinal Data. My favorite reference on the topic.

  13. Venables, W.N. and Ripley, B.D. Modern Applied Statistics with S. Everything you need to actually analyze data in R.

  14. Rasmussen, C.E. and Williams, C.K.I. Gaussian Processes for Machine Learning. The first time I felt I understood RKHS.

  15. Bai, Z., and Silverstein, J.W. Spectral Analysis of Large Dimensional Random Matrices. For random matrix theory.

  16. Chilès, J.P. and Delfiner, P. Geostatistics: Modeling Spatial Uncertainty. As the name suggests.

  17. Kenneth Lange. Numerical Analysis for Statisticians. An excellent reference on numerical analysis and algorithms.

  18. Solomon, Justin. Numerical Algorithms: Methods for Computer Vision, Machine Learning, and Graphics. A complete and well-written exposition of many numerical algorithms.

  19. Nahmias, S. and Cheng, Y. Production and Operations Analysis. A one-stop shop for industrial statistics and industrial engineering.

  20. Cano, E.L. and Moguerza, J.M. and Redchuk, A. Six Sigma with R: Statistical Engineering for Process Improvement. For statistical process control.

  21. Cox, D.R. and Reid, N. The Theory of the Design of Experiments. For DOE.

  22. Anderson, T.W. An Introduction to Multivariate Statistical Analysis.

  23. The Princeton Companion to Applied Mathematics. For short, readable, and authoritative explanations of EVERYTHING in applied mathematics.

  24. Bryant, R.E., and O’Hallaron, D.R. Computer Systems: A Programmer’s Perspective. If you want to know how a computer works for the purpose of programming it (not maintaining it).

  25. Brian Ward. How Linux Works: 2nd Edition. A SUPERB reference on Linux, networking, and how computers work. Superb, superb, superb.

  26. Raftery, Adrian E., Martin T. Wells, and Martin A. Tanner. Statistics in the 21st Century. An excellent overview of the state of statistical research at the end of the 20th century, including excellent topic overviews and open research questions.

  27. Peter Godfrey-Smith. Theory and Reality. An excellent introduction to the philosophy of science.

  28. Steven Smith. Digital Signal Processing: A Practical Guide for Engineers and Scientists. A comprehensive and application-oriented presentation of DSP (turns out it is more than nonparametric regression with complex numbers!).

  29. Peter J. Schreier and Louis L. Scharf. Statistical Signal Processing of Complex-Valued Data: The Theory of Improper and Noncircular Signals. In case you are wondering, as I was, “why are so many scientists obsessed with complex numbers?”

  30. Jake VanderPlas. Python Data Science Handbook. Best reference for learning data-oriented Python.

Feel free to use this list and to recommend more entries.

Some other book recommendations worth following: