While R is the software of choice and the undisputed leader in many fields of statistics, this is not so in econometrics; yet its popularity is rising among researchers, in university classes, and among practitioners. From user feedback and citation information, we gather that the adoption rate of panel‐specific packages is even higher in research fields outside economics where econometric methods are used: finance, political science, regional science, ecology, epidemiology, forestry, agriculture, and fishing.
This is the first book entirely dedicated to the subject of doing panel data econometrics in R, written by the very people who wrote most of the software considered, so it should be naturally adopted by R users wanting to do panel data analysis within their preferred software environment. According to the best practices of the R community, every example is meant to be replicable (in the style of package vignettes); all code is available from the standard online sources, as are all datasets. Most of the latter are contained in a dedicated companion package, pder. The book is supposed to be both a reasonably comprehensive reference on R functionality in the field of panel data econometrics, illustrated by way of examples, and a primer on econometric methods for panel data in general.
While we have tried to cover the vast majority of basic methods and many of the more advanced ones (corresponding roughly to graduate and doctoral level university courses), the book is still less exhaustive than the main reference textbooks (one for all: Baltagi, 2013), the a priori being that the reader should be able to apply all the methods presented in the book through available R code from plm and related, more specialized packages.
One should note from the beginning that, from a computational viewpoint, the average R user tends to be more advanced than users of commercial statistical packages. R users will generally be interested in interactive statistical programming, whereby they can be in full control of the procedures they use, and will eventually look to write their own code or adapt existing code to their own purposes. All that said, despite its reputation, R lends itself nicely to standard statistical practice: issuing a command, reading output. Hence the potential readership spans an unusually broad spectrum and will be best identified by subject rather than by level of technical difficulty.
Examples are usually written without employing advanced features but still using a fair amount of syntax beyond what would be the plain vanilla “estimate, print summary” procedure sketched above; the reader replicating them will therefore be exposed to a number of simple but useful constructs—ranging from general purpose visualization to compact presentation of results—stemming from the fact that she is using a full‐featured programming language rather than a canned package.
The general level is introductory and aimed at both students and practitioners. Chapters 1–2, and to some extent 4–5, cover the basics of panel data econometrics as taught in undergraduate econometrics classes, if at all. With some overlap, the main body of the book (Ch. 3–6) covers the typical subjects of an advanced panel data econometrics course at graduate level. Nevertheless, the coverage of the later chapters (especially 7–10) spans fields typical of current applied research; therefore it should appeal particularly to graduate students and researchers. For these reasons, the book might play two main roles: companion to advanced textbooks for graduate students taking a panel data course, with Chapters 1–7 covering the course syllabus and 8–10 providing more cutting‐edge material for extensions; and reference text for practitioners or applied researchers in the field, covering most of the methods they are ever likely to use, with applied examples from recent literature. Its first half can nonetheless be used in an undergraduate course as well, especially considering the wealth of examples and the possibility of replicating all material. Symmetrically, the last chapters can appeal to researchers wanting to employ cutting‐edge methods (for which the only code available is usually rather unfriendly, written in matrix language by methodologists) with the relative user‐friendliness of R. As an example, Ch. 10 is based on the R tutorials one of the authors gives at the Spatial Econometrics Advanced Institute in Rome, the world‐leading graduate school in applied spatial econometrics.
Econometrics is a latecomer to the world of R, although of course much of basic econometrics employs standard statistical tools, which were present in base R. Typical functionality, addressing the emphasis on model assumptions and testing that is characteristic of the discipline, started to appear with the lmtest package and the accompanying paper of Zeileis & Hothorn (2002); a review paper on the use of R in econometrics, focused on teaching, was published at about the same time (Racine & Hyndman, 2002). This was followed by further dedicated packages extending the scope of specialized methods to structural equation modeling, time series, stability testing, and robust covariance estimation, to name a few. Despite the availability of some online tutorials, however, no dedicated book would appear in print until Kleiber & Zeileis (2008).
In the absence of any organized and comprehensive R package for panel data econometrics, Yves Croissant started developing plm in 2006, presenting one early version of the software at the 2006 useR! Conference in Vienna. Giovanni Millo joined the project as coauthor shortly thereafter. Two years later, an accompanying paper to plm (Croissant & Millo, 2008) featured prominently in the econometrics special issue of the Journal of Statistical Software, testifying to the improved availability of econometric methods in R and the increased relevance of the R project for the profession.
More recently, Kevin Tappe has become the third author. Liviu Andronic, Arne Henningsen, Christian Kleiber, Ott Toomet, and Achim Zeileis made important contributions to the package at various times. Countless users provided feedback, smart questions, bug reports, and, often, solutions.
Estimating the user base is no simple task, but the available evidence points to large and growing numbers. The 2008 paper describing an earlier version of the package has since been downloaded almost 100,000 times and peaked on Google Scholar's list as the 25th most cited paper in the Journal of Statistical Software, the leading outlet in the field, before hitting the five‐year reporting limit. At the time of writing, it counts over 400 citations on Google Scholar, despite the widespread bad habit of not citing software papers. The monthly number of package downloads from a leading mirror site has been recently estimated at 6,000.
Chapters 2, 3, 6, 7, and 8 have been written by Yves Croissant; 1, 5, 9 (except the first generation unit root testing section), and 10 by Giovanni Millo, chapter 4 being co‐written.
The book has been produced through Emacs+ESS (Rossini et al., 2004) and typeset in LaTeX using Sweave (Leisch, 2002) and later knitr (Xie, 2015). Plots have been made using ggplot2 (Wickham, 2009) and tikz (Tantau, 2013).
The companion package to this book is pder (Croissant & Millo, 2017); the methods described are mainly in the plm package (Croissant & Millo, 2008) but also in pglm (Croissant, 2017) and splm (Millo & Piras, 2012). General purpose test and diagnostic tools from the packages car (Fox & Weisberg, 2011), lmtest (Zeileis & Hothorn, 2002), sandwich (Zeileis, 2006b), and AER (Kleiber & Zeileis, 2008) have been used in the code, as have some more specialized tools available in MASS (Venables & Ripley, 2002), censReg (Henningsen, 2017), nlme (Pinheiro et al., 2017), survival (Therneau & Grambsch, 2000), truncreg (Croissant & Zeileis, 2016), pcse (Bailey & Katz, 2011), and msm (Jackson, 2011). dplyr (Wickham & Francois, 2016) has been used to work with data.frames and Formula with general formulas. stargazer (Hlavac, 2013) and texreg (Leifeld, 2013) were used to produce fancy tables, and the fiftystater package (Murphy, 2016) to plot a United States map. The packages presented and the example code are entirely cross‐platform, being part of the R project.