Rob is Professor of Quantitative Analytics at Wesleyan University. I created this website both for current R users and for experienced users of other statistical packages (e.g., SAS, SPSS, Stata) who would like to transition to R. My goal is to help you quickly put this language to work. I assume that you are already familiar with the statistical methods covered; instead, I provide a roadmap and the code necessary to get started quickly and orient yourself for future learning. I designed this website to be an easily accessible reference. Look at the sitemap to get an overview.
The 1999 edition now looks antiquated, as software and data have evolved so quickly in recent years; this 2013 edition is still concerned only with classical statistics. Although many graduate students and researchers have had coursework in statistics, they sometimes find themselves stumped when proceeding with a particular data analysis question. In fact, statistics is often taught as a lesson in mathematics rather than as a strategy for answering questions about the real world, leaving beginning researchers at a loss for how to proceed. In these situations, it is common to turn to a statistical expert, the "go to" person when questions about appropriate data analysis emerge.
Remember the ASA statement on p-values from last year? The profession is getting together today and tomorrow (in October). John Ioannidis, who has published widely on the reproducibility crisis in research, said this morning that "we are drowning in a sea of statistical significance" and "p-values have become a boring nuisance." Too many researchers, under career pressure to produce publishable results, are chasing too much data with too much analysis in pursuit of significant results. The p-value has become a standard that can be gamed ("p-hacking"), opening the door to publication. P-hacking is quite common: the increasing availability of datasets, including big data, means the number of potentially "significant" relationships that can be hunted is increasing exponentially. And researchers rarely report how many analyses they ran before finding something that rises to the level of (supposed) statistical significance. So more and more artefacts of random chance are getting passed off as something real.
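The mechanism behind this is easy to demonstrate. A minimal sketch in Python (standard library only, not from the post itself): run many two-sample tests on data where no real effect exists, and about 5% of them will come out "significant" at p < 0.05 by chance alone. The Welch t-test with a normal approximation for the p-value is my choice here purely for illustration.

```python
import math
import random

def welch_t_pvalue(x, y):
    """Two-sided Welch t-test p-value, using a normal approximation
    to the t distribution (reasonable for large samples)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    t = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return math.erfc(abs(t) / math.sqrt(2))  # = 2 * (1 - Phi(|t|))

random.seed(42)
n_tests, n_per_group = 1000, 100
false_positives = 0
for _ in range(n_tests):
    # Both groups drawn from the SAME distribution: no true effect.
    x = [random.gauss(0, 1) for _ in range(n_per_group)]
    y = [random.gauss(0, 1) for _ in range(n_per_group)]
    if welch_t_pvalue(x, y) < 0.05:
        false_positives += 1

# Roughly 5% of null comparisons clear the p < 0.05 bar anyway.
print(false_positives / n_tests)
```

A researcher who runs a thousand comparisons and reports only the fifty or so that cross the threshold is, in effect, publishing noise; this is why the post stresses that the number of analyses attempted matters as much as the p-values obtained.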
David A. Freedman presents here a definitive synthesis of his approach to causal inference in the social sciences. He explores the foundations and limitations of statistical modeling, illustrating basic arguments with examples from political science, public policy, law, and epidemiology. Freedman maintains that many new technical approaches to statistical modeling constitute not progress, but regress. Instead, he advocates a shoe leather methodology, which exploits natural variation to mitigate confounding and relies on intimate knowledge of the subject matter to develop meticulous research designs and eliminate rival explanations. When Freedman first enunciated this position, he was met with skepticism, in part because it was hard to believe that a mathematical statistician of his stature would favor low-tech approaches.