1

This chapter introduces R and the R Commander, explaining what they are and where they came from. The chapter also outlines the contents of the book, shows how to access the web site for the book, and describes the typographical conventions used in the text.

The R Commander—the subject of this book—is a point-and-click graphical user interface (GUI) for R, allowing you to use R statistical software through familiar menus and dialog boxes instead of by typing commands. Throughout the book, I assume that the statistical methods covered are familiar to you—or that you’re concurrently learning them in a statistics class or by independently reading a complementary statistics text. The object of the book is to show you how to perform data analysis with the R Commander employing common statistical methods, not to teach the statistical methods themselves.

An implication of this approach is that you should feel free to skip those parts of the book that take up statistical methods with which you’re unfamiliar. For example, most of the material in the chapter on working with statistical models in the R Commander is beyond the level of a typical basic statistics course. Sections that deal with relatively advanced or difficult material are marked with asterisks.

R is highly capable, free, open-source statistical software. Although it is hard to know with any certainty how many people use R, it is—for example, judging by Internet traffic—possibly the most popular statistical software in the world. R is, in any event, very widely used, and its use is growing rapidly!

R incorporates a programming language that is finely adapted to the development of statistical applications. R descends from the S programming language, originally developed in the 1980s by statisticians and computer scientists at Bell Labs, led by John Chambers (see, e.g., Becker et al., 1988). Indeed, R can be regarded as a dialect of S. Eventually incorporated in a commercial product called S-PLUS, S was popular among statisticians prior to the development of R. At present, the free, open-source R has entirely eclipsed its commercial cousin S-PLUS.

R is free software in Richard Stallman’s famously dual sense of the term (Stallman, 2002): It is free in the obvious sense of being costless, but also in the deeper sense that users may freely modify and distribute R. Moreover, R is licensed under the Free Software Foundation’s General Public License (GPL), a “viral copy-left” that prevents individuals or companies from restricting users’ freedom to modify further and redistribute R. Freedom in the second sense essentially presupposes that R is open source: that is, R is distributed not only as an executable program but also with its source code (written in a variety of programming languages, including R itself) available to interested users. For more information about R, visit the R Project web site. The R Commander is also free, open-source software distributed under the GPL.

Analyzing data with R doesn’t necessarily entail writing programs in the R language, because the basic R distribution comes with impressive built-in statistical functionality. The capabilities of the standard R distribution, however, are greatly extended by (as I write this) nearly 8000 user-contributed R add-on packages, freely available on the Internet through the Comprehensive R Archive Network (abbreviated CRAN, and alternatively pronounced as “kran” or “see-ran”). Moreover, roughly 1000 additional R packages are available through the closely associated Bioconductor Project, which develops software primarily for bioinformatics (genomics).

Whether you write your own R programs or use pre-packaged programs, standard data analysis in R consists of typing commands in the R language. As a simple example, to compute the mean of the variable income, you might type the command mean(income), invoking the standard R mean function (program). Similarly, to perform a linear least-squares regression of income on years of education and years of labor-force experience, you might issue the command lm(income ~ education + experience), employing the lm (linear-model) function. Learning to write R commands like these is an important skill and ultimately is the most efficient way to use R, but it can present a formidable obstacle to new, occasional, or casual users of R.
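As a small illustration of the two commands just mentioned, the following sketch builds a tiny data set and then computes the mean and fits the regression. The data values and the name Workers are invented purely for illustration; only mean() and lm() come from the text.

```r
# Hypothetical data: the values below are made up for illustration only.
Workers <- data.frame(
  income     = c(28, 35, 41, 52, 47, 60),  # annual income, in $1000s
  education  = c(10, 12, 12, 16, 14, 18),  # years of education
  experience = c( 5,  8, 15, 10, 20, 12)   # years of labor-force experience
)

# Mean of the variable income
mean_income <- mean(Workers$income)

# Linear least-squares regression of income on education and experience
fit <- lm(income ~ education + experience, data = Workers)
coef(fit)  # the intercept and the two regression slopes
```

Because the variables live inside a data frame here, the sketch refers to them via Workers$income and the data argument to lm(), rather than as free-standing variables.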

R began around 1990 as the personal project of Robert Gentleman and Ross Ihaka, two statisticians then at Auckland University in New Zealand (see Ihaka and Gentleman, 1996). Gentleman and Ihaka in effect grafted the syntax of the pre-existing statistical programming language S onto the Scheme dialect of Lisp, a programming language usually associated with work in artificial intelligence. This turned out to be a propitious choice, because, as mentioned, the S language was already widely used by statisticians.

Eventually Ihaka and Gentleman advertised their work on the Internet, attracting several other developers to the project, including John Chambers, the principal developer of S. Then, in 1997, the R Project for Statistical Computing was formalized, with a Core team of nine developers, a number that has since expanded to 20, many of whom are significant figures in the field of statistical computing. The R Core team is responsible for the continued development and maintenance of the basic R distribution.

As I explained, R is distributed under the free-software General Public License. The copyright to R is held by the R Foundation, which comprises the members of the R Core team along with about a dozen other individuals; I’m an elected member of the R Foundation.

The growth of R has been nothing short of amazing. Figure 1.1 shows, for example, the expansion of the CRAN R package archive over the 14-year period for which I was able to obtain data. The horizontal axes of the graph record R versions and corresponding dates, while the vertical axes show the number of CRAN packages on a logarithmic scale, so that a linear trend represents exponential growth. The line on the graph was fit by least-squares regression. You can see from Figure 1.1 that, while the growth of CRAN was originally approximately exponential, its rate of growth has more recently slowed down. The slope of the least-squares line suggests that, on average over this period, CRAN expanded at a rate of about 35 percent a year.

FIGURE 1.1: The growth of CRAN. The vertical axes show the number of CRAN packages on a log scale, while dates and corresponding minor R versions are shown at the bottom and top of the graph, respectively. The line on the graph was fit to the points by least-squares regression. Two R versions, 1.6 and 2.14, are omitted because their recorded dates were very close to the dates of the previous versions.

Source: Updated from Fox, “Aspects of the social organization and trajectory of the R Project,” The R Journal, 1(2): 5–13, 2009.

Source of Data: Downloaded from  on 2016-03-03.

As I mentioned in the Preface to this book, I began to work on the R Commander around 2002, and I contributed version 0.8-2 of the Rcmdr package to CRAN in May 2003. The first “non-beta” version, 1.0-0, appeared two years later, and was described in a paper in the Journal of Statistical Software (Fox, 2005), an on-line journal of the American Statistical Association. In March 2016, when I wrote the chapter you’re reading, that paper had been downloaded nearly 140,000 times—despite the fact that it was more than 10 years out of date!

I have continued to develop the R Commander in the intervening period: Version 1.1-1, which also appeared in 2005, introduced the capability to translate the R Commander interface into other languages, a feature supported by R itself, and there are now 18 such translations (counting Chinese and simplified Chinese as separate translations). In 2007,  Version 1.3-0 first made provision for R Commander plug-in packages, and there are currently about 40 such plug-ins on CRAN. In 2013, Milan Bouchet-Valat joined me as a developer of the R Commander, and version 2.0-0, released in that year, featured a revamped, more consistent interface—for example, featuring tabbed dialogs.

The next chapter describes how to download R from the Internet and install it and the R Commander on Windows, Mac OS X, and Linux/Unix systems. If you have already successfully installed R and the R Commander, then feel free to skip the chapter. There is, however, some troubleshooting information, to which you can refer if you experience a problem.

A later chapter introduces the R Commander graphical user interface (GUI) by demonstrating its use for a simple problem: constructing a contingency table to examine the relationship between two categorical variables. In developing the example, I explain how to start the R Commander, describe the structure of the R Commander interface, show how to read data into the R Commander, how to modify data to prepare them for analysis, how to draw a graph, how to compute numerical summaries of data, how to create a printed report of your work, how to edit and re-execute commands generated by the R Commander, and how to terminate your R and R Commander session: in short, the typical work flow of data analysis using the R Commander. I also explain how to customize the R Commander interface.

A later chapter shows how to get data into the R Commander from a variety of sources, including entering data directly at the keyboard, reading data from a plain-text file, accessing data stored in an R package, and importing data from an Excel or other spreadsheet, or from other statistical software. I also explain how to save and export R data sets from the R Commander, and how to modify data (for example, how to create new variables and how to subset the active data set).

A later chapter explains how to use the R Commander to compute simple numerical summaries of data, to construct and analyze contingency tables, and to draw common statistical graphs. Most of the statistical content of the chapter is covered in a typical basic statistics course, although a few topics, such as quantile-comparison plots and smoothing scatterplots, are somewhat more advanced.

A later chapter shows how to compute simple statistical hypothesis tests and confidence intervals for means, for proportions, and for variances, along with simple nonparametric tests, a test of normality, and correlation tests. Many of these tests are typically taken up in a basic statistics class, and, in particular, tests and confidence intervals for means and proportions are often employed to introduce statistical inference.

A later chapter explains how to fit linear and generalized linear regression models in the R Commander, and how to perform additional computations on regression models once they have been fit to data.

A later chapter explains how to use the R Commander to perform computations on probability distributions, to graph probability distributions, and to conduct simple random simulations.

A later chapter explains how to use R Commander plug-in packages. The capabilities of the R Commander are substantially augmented by the many plug-in packages for it that are available on CRAN. Plug-ins are R packages that add menus, menu items, and dialog boxes to the R Commander. I show you how to install plug-in packages, and illustrate the application of R Commander plug-ins by using the RcmdrPlugin.TeachingDemos package and the RcmdrPlugin.survival package as examples.

An appendix to the book displays the complete set of R Commander menus, along with cross-references to the text.

If you become a frequent user of R, you’ll likely graduate from the R Commander to writing your own R commands and possibly your own R programs. There are several reasons to employ the command-line interface to R in preference to a GUI like the R Commander:

•  The R Commander GUI provides access to only a small fraction of the capabilities of R and the many R packages available on CRAN. To take full advantage of R, therefore, you’ll have to learn to write commands.

•  Even if you limit yourself to the capabilities in the R Commander and its various plug-ins, frequent users of R find the command-line interface more efficient. Once you remember the various commands and their arguments, you’ll learn to work more quickly at the command line than in a GUI.

•  You’ll find that a little bit of programming goes a long way. Writing simple scripts and programs is often the quickest and most straightforward way to perform data management tasks, for example.

If you do decide to learn to use R via the command-line interface, there are many books and other resources to help you. For example, Sanford Weisberg and I have written a text (Fox and Weisberg, 2011) that introduces R, including R programming, in the context of applied regression analysis. See the Documentation links on the R home page for many alternative sources, including free resources.

The R Commander is designed to facilitate the transition to command-line use of R: The commands produced by the R Commander are visible in the R Script tab. The contents of the R Script tab may be saved to a file and reused, either in the R Commander or in an R programming editor. Similarly, the dynamic document produced in the R Commander R Markdown tab may be saved, edited, and executed independently of the R Commander. These features are briefly discussed later in the book.

Both the Windows and Mac OS X implementations of R come with simple programming editors, but I strongly recommend the RStudio interactive development environment (IDE) for command-line use of R. RStudio incorporates a powerful programming editor and is ideal both for routine data analysis in R and for R programming, including the development of R packages, and RStudio supports R Markdown documents. Like R and the R Commander, RStudio is free, open-source software: Visit the RStudio web site for details, including extensive documentation.

I have created a web site to support this book with a variety of resources, including:

•  all the data files used in examples that appear in the text

•  detailed (and potentially updated) installation instructions, including trouble-shooting information, beyond the instructions in the book

•  information about significant updates to the R Commander following the publication of this book, along with errata correcting errors in the book (as they, almost inevitably, reveal themselves)

•  a manual for authors of R Commander plug-in packages

A note about software versions: Although some of the “screenshots” and output in this book were produced with earlier versions, the book is current as of version 3.2.3 of R and version 2.1-4 of the Rcmdr package. Significant changes to the R Commander or changes to R that affect the R Commander will be addressed on the web site for the book.

Chapman and Hall maintains a link to the web site for the book at , which can also be accessed at .

Different typefaces and fonts are used to distinguish the following elements:

•  Computer software, such as operating systems (Windows, Mac OS X, Linux) and statistical software (R, the R Commander), is shown in a sans serif typeface.

•  Graphical user interface components, such as menus (the Edit menu) and windows (the R Console window, the Two-Way Table dialog box), are shown in an italic typeface.

•  Submenu and menu item selection is indicated by > (a greater than sign). Thus, for example, Statistics > Summaries > Numerical summaries… means “left-click on the Statistics menu, then on the Summaries submenu, and finally on the Numerical summaries… menu item.” Three dots following a menu item (as in Numerical summaries…) indicate that selecting the item leads to a dialog box rather than performing a direct action. I will usually omit the three dots from menu items, however.

•  Keys (e.g., Tab) and key combinations (Ctrl-c) are also shown in an italic typeface. The key combination Ctrl-c, for example, means “hold down the Ctrl key and press the c key.”

•  R packages (such as the Rcmdr and car packages) are shown in boldface.

•  Text meant to be typed directly (such as the R command library(Rcmdr), or text to be typed into an R Commander dialog box) is shown in a typewriter font, as is R output, and as are the names of R objects, such as functions (mean), data sets (States), and variables in data sets (pay). Generic text (e.g., variable-name) meant to be replaced with specific text (e.g., income) is shown in an italic typewriter font.

•  Files (GSS.csv) and file paths (C:\Program Files\R\R-x.y.z) are also shown in typewriter font, with generic text again in typewriter italics.

•  Internet URLs (addresses) are shown in a sans serif typeface (e.g., ).

•  When important, possibly unfamiliar, terms are introduced, they are set in italics (e.g., rectangular data set).

Richard Stallman is the founder of the Free Software Foundation, with which the R Project for Statistical Computing is associated.

This graph is updated from Fox (2009), where I discuss the social organization and trajectory of the R Project.

If you’re unfamiliar with logs, don’t be concerned: The essential point is that the scale gets more compressed as the number of packages grows, so that, for example, the distance between 100 and 200 packages on the log scale is the same as the distance between 200 and 400, and the same as the distance between 400 and 800—all of these equal distances represent doubling the number of packages.

Again, if you’re unfamiliar with the method of least squares, don’t worry: You’ll almost surely study the topic in your basic statistics course. The essential idea is that the line comes as close (in a sense) to the points on average as possible.
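The two notes above, on log scales and on least squares, can be sketched together in a few lines of R. The package counts below are hypothetical (they are generated to grow by exactly 35 percent a year and are not the real CRAN data); the sketch only illustrates why equal distances on a log scale represent doubling, and how a growth rate can be read off the slope of a least-squares line fit to logged counts.

```r
# Equal distances on a log scale correspond to equal ratios (here, doubling).
packages <- c(100, 200, 400, 800)
log_gaps <- diff(log10(packages))  # every gap equals log10(2)

# Hypothetical counts growing by exactly 35 percent a year (NOT the real CRAN data).
year  <- 0:13
count <- 100 * 1.35^year

# Least-squares line fit to the logged counts: a straight line on the log
# scale corresponds to exponential growth on the original scale.
fit <- lm(log(count) ~ year)

# Converting the slope back from the log scale gives the annual growth rate.
annual_growth <- exp(coef(fit)[["year"]]) - 1
```

With real data the points would scatter around the line, and the recovered rate would be an average over the period rather than an exact figure.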

This chapter describes how to download R from the Internet and install it and the R Commander on Windows, Mac OS X, and Linux/Unix systems. If you have already successfully installed R and the R Commander, then feel free to skip the chapter. There is, however, some troubleshooting information, to which you can make reference if you experience a problem.

More detailed (and potentially more up-to-date) information on installing R and the Rcmdr package appears on the web site for this book. Please consult the web site if the information provided here proves insufficient, or if you encounter difficulties not discussed here.

R and R packages, like the Rcmdr package, are available on the Internet from CRAN (the Comprehensive R Archive Network). It is best not to download R and R packages directly from the main CRAN web site, however, but rather to use a CRAN mirror site. A link to a list of CRAN mirrors appears at the upper left of the CRAN home page (the top of which is shown in Figure 2.1). I suggest that you use the first “0-Cloud” mirror, which is generally both reliable and fast.

Regardless of whether you are a Windows user, a Mac OS X user, or a Linux/Unix user, I recommend that you install the current version of R, say R version x.y.z. In this generic version number, “x” represents the major version, “y” the minor version, and “z” the patch version of R. A new minor version of R, x.y.0, is released by the R Core team each spring, and patch versions are released as needed, typically to fix bugs. Major versions appear infrequently, and only when substantial modifications are made to the base R software. As I’m writing this book, the current version of R is 3.2.3.
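To see how the x.y.z scheme applies to your own installation, you can query the running version from within R. The following is a minimal sketch using the base function getRversion(); the variable names are my own, and the comparison value "3.2.3" is simply the version mentioned in the text.

```r
# Version of the running R, as a version object that prints as x.y.z
installed <- getRversion()

# Split the version string into its major, minor, and patch components
parts <- as.integer(strsplit(as.character(installed), ".", fixed = TRUE)[[1]])
names(parts) <- c("major", "minor", "patch")
parts

# Version objects can be compared directly with a version string
at_least_book_version <- installed >= "3.2.3"
```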

There are (at most) five steps to installing R and the R Commander:

1.  Download and install R.

2.  On Mac OS X only, download and install the XQuartz windowing software.

3.  Start R and install the Rcmdr package.

4.  Load the Rcmdr package and, when asked, allow it to install additional packages.

5.  If desired, download and install Pandoc and LaTeX software for producing enhanced reports.

Specific instructions for Windows, Mac OS X, and Linux/Unix systems follow.

FIGURE 2.1: The top of the Comprehensive R Archive Network (CRAN) home page (on 2015-10-04).

From the home page of the mirror you selected, click on the link Download R for Windows, which appears near the top of the page. Then click on install R for the first time, and subsequently on Download R x.y.z for Windows.

Once it is downloaded, double-click on the R installer. On Windows 10, you will see a frightening message: “Windows protected your PC. Windows SmartScreen prevented an unrecognized app from starting. Running this app might put your PC at risk.” Click on More info and then on the Run anyway button. (Why not live dangerously?) Windows issues this warning if your user-account controls are at the default settings and you download software that’s not from the Windows App Store.

You may take all of the defaults in the R installer, but I suggest that you make the following modifications:

•  Instead of installing R in the standard location, typically C:\Program Files\R\R-x.y.z (for R version x.y.z), you may use C:\R\R-x.y.z. This will allow you to install packages in the main R package library on your computer without running R with administrator privileges.

If you install R into C:\Program Files\R\R-x.y.z, and you run R without administrator privileges, you will see a message like the following:

Warning in install.packages("Rcmdr") : 'lib = "C:/Program Files/R/R-x.y.z/library"' is not writable

and R will ask to install packages into a personal library under your Windows user account. That works too, but the installed packages will be available only to your account.

•  On R for Windows, the R Commander works best with the “single-document interface” (or SDI). Under the default “multiple-document interface” (MDI), the main R Commander window and its various dialog boxes will not be contained in the master R window and may not stay on top of this window.

In the Startup options screen of the R installer, select Yes (customized startup). Then select the SDI (single-document interface) in preference to the default MDI (multiple-document interface); feel free to make other changes, but you may take all the remaining defaults.

The key steps in the installation process, where I recommend that you make non-default choices, are shown in Figure 2.2.
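The point made above about installing packages without administrator privileges can be checked from within R. The following is a rough sketch, using the base functions .libPaths() and file.access(), that lists the package libraries R knows about and reports whether each appears to be writable. The variable names are my own, and file.access() is known to be an imperfect guide on some Windows file systems, so treat the result as a hint rather than a guarantee.

```r
# Package libraries known to this R session; the first is where
# install.packages() installs by default
lib_paths <- .libPaths()

# file.access() returns 0 when the requested access (mode 2 = write) is allowed
writable <- file.access(lib_paths, mode = 2) == 0

data.frame(library = lib_paths, writable = writable)
```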

Once it is installed, start R in the standard manner—for example, by double-clicking on its desktop icon, or by selecting it from the Windows start menu. If you are running R on a 64-bit Windows computer (and almost all current computers are 64-bit), both 64-bit and 32-bit versions of R will be installed. You can use either, but I suggest that you use the 64-bit version, and feel free to delete the 32-bit R icon from your desktop.

The easiest way to install the Rcmdr package, if you have an active Internet connection, is via the Packages > Install package(s) menu in the R Console (that is, click on the Packages menu, select the menu item Install package(s), and pick the Rcmdr package from the long alphabetized list of R packages available on CRAN), or via the command install.packages("Rcmdr") typed at the > command prompt in the R Console (followed by pressing the Enter key). In either case, R will ask you to select a CRAN mirror for the session; I suggest that you again pick the first “0-Cloud” mirror. R will install the Rcmdr package, along with a number of other R packages that the R Commander requires to get started.

When you first load the Rcmdr package with the command library(Rcmdr), it will offer to download and install additional packages; allow it to do so.
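The install-then-load sequence just described can be wrapped in a few lines of R. This is only a sketch: the function name start_rcmdr is hypothetical (it is not part of R or the Rcmdr package), and the repos value is my assumption about the address of the “0-Cloud” mirror. The call at the end is commented out because it needs an Internet connection and opens the R Commander window.

```r
# Hypothetical helper: install the Rcmdr package only if it is absent, then load it.
start_rcmdr <- function(repos = "https://cloud.r-project.org") {  # assumed mirror address
  if (!"Rcmdr" %in% rownames(installed.packages())) {
    install.packages("Rcmdr", repos = repos)
  }
  library(Rcmdr)  # loading the package starts the R Commander
}

# start_rcmdr()  # uncomment to run
```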

Installing R and the R Commander on Windows systems is generally straightforward. Occasionally, and unpredictably, an R package required by the Rcmdr package fails to be installed—possibly because the package is missing from the CRAN mirror that you used—and the R Commander can’t start. Under these circumstances, there is typically an informative error message about the missing package.

The simple solution is to install the missing package (or packages, if there is more than one) directly. For example, if the car package is missing, you can install it via the R command install.packages("car"), or from the R Console Packages menu, possibly selecting a different CRAN mirror from the one that you used initially.
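If the error message names several packages, it can be convenient to check them all at once. The sketch below defines a hypothetical helper, missing_packages (my own name, not a function supplied by R or the Rcmdr package), that reports which of a set of package names are not installed; the install step is commented out because it requires an Internet connection, and the repos value is again an assumed mirror address.

```r
# Hypothetical helper: which of the named packages are not yet installed?
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

needed <- c("car")  # replace with the package(s) named in the error message
to_install <- missing_packages(needed)

# if (length(to_install) > 0) {
#   install.packages(to_install, repos = "https://cloud.r-project.org")  # assumed mirror
# }
```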

Sometimes when users save the R workspace upon exiting from R, the R Commander will fail to work properly in a subsequent session. As I will explain later in the book, I recommend never saving the R workspace to avoid these kinds of problems, and if you exit from R via the R Commander menus (File > Exit > From Commander and R), the R workspace will not be saved.

FIGURE 2.2: Key steps in installing R for Windows (illustrated with the R for Windows version 3.2.3 installer): installing R into the directory C:\R\R-3.2.3 (top); selecting customized startup options (middle); selecting the SDI (bottom).

If, however, you inadvertently saved the workspace in a previous session, it will reside in a file named .RData. To discover where this file is located, enter the command getwd() (“get working directory”) at the R > command prompt as soon as R starts up.

If Windows is configured to suppress known file types (also called file extensions), as it is by default, then you will not see the name of this file in the Windows File Explorer, because the file name begins with a period (.) and thus, as far as Windows is concerned, consists only of a file type. You can, however, still delete the file by right-clicking the R icon with no file name that appears at the top of the alphabetized Name column in the File Explorer window, selecting Delete from the resulting context menu. Be careful not to delete any other (named) files associated with the R icon.
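As an alternative to hunting for the file in the file manager, a saved workspace can be located, and if you wish removed, from within R itself. The following is a minimal sketch using base R functions; it assumes that you run it as soon as R starts, so that the working directory is still the one from which the workspace was loaded. The line that actually deletes the file is commented out as a precaution.

```r
# Path where a saved workspace would sit in the current working directory
workspace_file <- file.path(getwd(), ".RData")

if (file.exists(workspace_file)) {
  message("Saved workspace found at: ", workspace_file)
  # file.remove(workspace_file)  # uncomment to delete the saved workspace
} else {
  message("No saved workspace in: ", getwd())
}
```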

Before installing R, make sure that your Mac OS X system is up-to-date by running Software Update from the “Apple” menu at the top-left of the screen. This is important, because R assumes that your system is up-to-date and may not function properly if it is not.

From the home page of the CRAN mirror you selected, click on the link Download R for (Mac) OS X, which appears near the top of the page; then click on R-x.y.z.pkg. Once it is downloaded, double-click on the R installer. You may take all of the defaults.

The initial screen of the R for Mac OS X installer is shown at the top of Figure 2.3.

Background: The R Commander uses the tcltk package, which is a standard component of the R distribution. On Mac OS X systems, R also installs a version of the Tcl/Tk GUI builder, used by the tcltk package. This version of Tcl/Tk in turn uses the Unix X11 windowing system (also called X Windows) instead of the standard Mac Quartz windowing system.

Some older versions of Mac OS X had X11 pre-installed, while other older versions came with X11 on the operating system installation disks. X11 is absent from newer versions of Mac OS X, but is readily available on the Internet at the XQuartz web site.

I suggest that you simply install the current version of XQuartz, whether or not an older version of X11 is installed on your computer:

•  Download the disk-image file (XQuartz-x.y.z.dmg) for the current version x.y.z of XQuartz.

•  When you open this file by double-clicking on it, you’ll find XQuartz.pkg; double-click on XQuartz.pkg to run the XQuartz installer, clicking through all the defaults. The initial screen of the XQuartz installer is shown at the bottom of Figure 2.3.

•  After the installer runs, you’ll have to log out of and back into your Mac OS X account—or just reboot your computer. You can remove the XQuartz disk image from your desktop by dragging it and dropping it on the Mac OS X trash can.

FIGURE 2.3: The initial screens of the R for Mac OS X installer (version 3.2.3, top) and the XQuartz installer (version 2.7.8, bottom).

If you subsequently upgrade Mac OS X (e.g., from version 10.10 to version 10.11), you will have to reinstall XQuartz (and possibly R itself), even if you installed it previously.

Once R and X11 are installed, start R in the standard manner—for example, by double-clicking on the R icon in the Applications folder.

The easiest way to install the Rcmdr package, if you have an active Internet connection, is via the Packages & Data menu in the R Console: Click on the Packages & Data menu and select the Package Installer menu item.

•  Type Rcmdr in the Package Search box, and click the Get List button.

•  R will ask you to select a CRAN mirror; as before, I suggest that you pick the first “0-Cloud” mirror, and that, when asked, you opt to set the selected mirror as the default.

•  Click on the Rcmdr package in the resulting packages list; check the Install Dependencies box, and click on the Install Selected button.

•  Once the Rcmdr package and its dependencies are installed, which may take a bit of time, you can close the R Package Installer window.

An alternative to using the R menus is to type install.packages("Rcmdr") at the > command prompt in the R Console (followed by pressing the return or enter key). R will install the Rcmdr package, along with a number of other R packages that the R Commander requires to get started.

When you first load the Rcmdr package with the command library(Rcmdr), it may offer to download and install additional packages; if so, allow it to do so.

On some versions of Mac OS X, you may see an additional message from R the first time that you load the Rcmdr package: “The ‘otool’ command requires the command line developer tools. Would you like to install the tools now?” If you see this message, click the Install button in the message dialog box.

Under Mac OS X 10.9 (“Mavericks”) or later, the R Commander may slow down or occasionally hesitate to display a menu as your session progresses. This behavior is due to Mac OS X saving power by going into “nap” mode (called app nap) when the R.app window is not visible.

I am aware of several solutions (beyond inconveniently ensuring that the top of the R.app window is always visible). The simplest solution is to suppress app nap via the R Commander menus: Tools > Manage Mac OS X app nap for R.app. That is, choose the menu item Manage Mac OS X app nap for R.app from the R Commander Tools menu. In the resulting dialog, click the radio button to set app nap off. This setting is permanent across R.app sessions until you change it.

For alternative solutions, see the web site for the book.

Occasionally, the Rcmdr package will fail to load properly in Mac OS X. When this problem occurs, the cause is almost always the failure of the tcltk package to load. The problem is usually clearly stated in an error message printed in the R console. You can confirm the diagnosis by trying to load the tcltk package directly in a fresh R session, issuing the command library(tcltk) at the R command prompt.

The solution is almost always to install, or reinstall, XQuartz (and possibly R), as described above, remembering to log out of and back into your account before trying to run R and the R Commander again. If this solution fails, then you can consult the more detailed troubleshooting information in the Mac OS X installation notes on the web site for the book.
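The tcltk diagnosis described above can also be made without stopping on an error. The following sketch, with variable names of my own choosing, first asks R whether it was built with Tcl/Tk support and then tries to load the tcltk package quietly. It is meant only as a quick check in a fresh R session, not as a substitute for the error message that library(tcltk) would print.

```r
# TRUE if this build of R includes Tcl/Tk support at all
has_tcltk <- capabilities("tcltk")

# Try to load the tcltk package; FALSE here points to the
# XQuartz (X11) problem discussed in the text
tcltk_loads <- has_tcltk &&
  suppressWarnings(requireNamespace("tcltk", quietly = TRUE))

if (!tcltk_loads) {
  message("tcltk could not be loaded; try installing or reinstalling XQuartz.")
}
```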

Beyond the failure of the tcltk package to load, occasionally, and unpredictably, an R package required by the Rcmdr package fails to be installed—possibly because the package is missing from the CRAN mirror that you used—and the R Commander can’t start. Under these circumstances, there is typically an informative error message about the missing package.

The simple solution is to install the missing package (or packages, if there is more than one) directly. For example, if the car package is missing, you can install it via the R command install.packages("car"), or from the R.app Packages menu, possibly selecting a different CRAN mirror from the one that you used initially.

Sometimes when users save the R workspace upon exiting from R, the R Commander will fail to work properly in a subsequent session. As I will explain later in the book, I recommend never saving the R workspace to avoid these kinds of problems, and if you exit from R via the R Commander menus (File > Exit > From Commander and R), the R workspace will not be saved.

If, however, you inadvertently saved the workspace in a previous session, it will reside in a file named .RData. To discover where this file is located, enter the command getwd() (“get working directory”) at the R > command prompt as soon as R.app starts up; this will typically be your home directory.

Unfortunately, newer versions of Mac OS X don’t make it easy for you to view the contents of your home directory in the Finder. Instead, run the Mac OS X Terminal program; you’ll find Terminal in the Utilities subfolder within the Mac OS X Applications folder. Type the command ls -a at the Terminal $ command prompt (followed by pressing the enter or return key) to list all files in your home directory. Among these files, you should see .RData. Then type the command rm .RData to remove the offending file.
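As an alternative to the Terminal, the file can be located and removed from within R. The following is a minimal sketch using only base R functions; remove_saved_workspace() is a hypothetical helper defined here for illustration.

```r
# Delete a saved workspace file (.RData) from `dir`, if there is one.
# Returns TRUE if a file was removed and FALSE otherwise.
remove_saved_workspace <- function(dir = getwd()) {
  workspace_file <- file.path(dir, ".RData")
  if (!file.exists(workspace_file)) {
    message("No .RData file found in ", dir)
    return(FALSE)
  }
  file.remove(workspace_file)
}
```

Calling remove_saved_workspace() as soon as R starts up acts on the directory reported by getwd(). The objects from the saved workspace remain loaded in the current session, so restart R afterwards.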

R is available from CRAN for several Linux distributions (Debian, RedHat, SUSE, and Ubuntu); select your distribution, and proceed as directed.

If you have a Linux or Unix system that’s not compatible with one of these distributions, then you will have to compile R from source code; the procedure for doing so is described in the R FAQ (“frequently asked questions”) list at  (Question 2.5.1, at the time of writing).

Once R is installed, you will have to install the Rcmdr package and its dependencies. Start R and type the command install.packages("Rcmdr") at the > command prompt (followed by pressing the Return or Enter key). You may be asked to select a CRAN mirror site; as before, I suggest that you pick the first “0-Cloud” mirror. After the Rcmdr package and its direct dependencies are installed, start the R Commander via the command library(Rcmdr). The R Commander will ask to install some additional packages; let it do that.
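The two commands can be combined in a short script. This is a sketch under one assumption: that the “0-Cloud” mirror corresponds to the URL https://cloud.r-project.org, which is passed explicitly so that R does not prompt for a mirror. The variable cran_mirror is only an illustrative name.

```r
# Assumed URL of the "0-Cloud" CRAN mirror; supplying it avoids the mirror prompt.
cran_mirror <- c(CRAN = "https://cloud.r-project.org")

# Run only in an interactive session, because library(Rcmdr) opens a GUI.
if (interactive()) {
  # Install Rcmdr and its direct dependencies if Rcmdr is not already installed.
  if (!("Rcmdr" %in% rownames(installed.packages()))) {
    install.packages("Rcmdr", repos = cran_mirror)
  }
  # Loading the package starts the R Commander.
  library(Rcmdr)
}
```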

Occasionally and unpredictably, an R package required by the Rcmdr package fails to be installed—possibly because the package is missing from the CRAN mirror that you used—and the R Commander can’t start. Under these circumstances, there is typically an informative error message about the missing package.

The simple solution is to install the missing package (or packages, if more than one is missing) directly. For example, if the car package is missing, you can install it via the R command install.packages("car"), possibly selecting a different CRAN mirror from the one that you used initially.

Sometimes when users save the R workspace upon exiting from R, the R Commander will fail to work properly in a subsequent session. As I will explain in , I recommend never saving the R workspace to avoid these kinds of problems, and if you exit from R via the R Commander File > Exit > From Commander and R menu, the R workspace will not be saved.

If, however, you inadvertently saved the workspace in a previous session, it will reside in a file named .RData. To discover where this file is located, enter the command getwd() (“get working directory”) at the R > command prompt as soon as R starts up; this will typically be your home directory.

Type the command ls -a at the command prompt in a fresh Linux terminal to list all files in your home directory. Among these files, you should see .RData. Then type the command rm .RData to remove the offending file; file names are case-sensitive in Linux, so type the name exactly as shown.

You may also find that you are missing a C or Fortran compiler, required by R to build packages, or an installation of Tcl/Tk, required by the tcltk package, which is in turn used by the R Commander. If you experience these or other difficulties, consult the R FAQ (“frequently asked questions”) at , and the R Installation and Administration manual at  (particularly  on essential and useful programs).
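A quick way to see whether compilers are available is to ask R where it finds them. This is a sketch that assumes the compilers are installed under the common names gcc and gfortran; your system may use different compilers (for example, clang), in which case adjust the names accordingly.

```r
# Sys.which() returns the full path of each program, or "" if it is not on the PATH.
# The names "gcc" and "gfortran" are assumptions; substitute your system's compilers.
tools <- Sys.which(c("gcc", "gfortran"))

# Names of the programs that could not be found.
missing_tools <- names(tools)[tools == ""]

if (length(missing_tools) > 0) {
  message("Not found on the PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("Both compilers were found.")
}
```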

Installing R and the Rcmdr package is sufficient for creating HTML (web page) reports (see ), but if you prefer to create editable Word documents or PDF files for reports, you must additionally install Pandoc and LaTeX (the latter, in conjunction with Pandoc, is needed only for PDF reports). The most convenient way to do this is via the R Commander menus: Tools > Install auxiliary software.

The book, of course, was written over a period of time; R 3.2.3 was current when I was finalizing the text.

To clarify, the R Commander works with both the SDI and the MDI, but it is more convenient to use it with the SDI.

Occasionally, on Windows 8 systems, the 64-bit version of R seems incompatible with viewing HTML files for reports (as described in ). In these cases, you can use the 32-bit version of R in preference to the 64-bit version. The principal advantage of the 64-bit version of R is that it permits the analysis of larger data sets, but it is unusual to use the R Commander to analyze bigger data sets than can be accommodated by the 32-bit version of R.

If you have an older version of Mac OS X, you may not be able to use the current version of R, but an older version of R compatible with your operating system may be provided: Read the information on the R for Mac OS X web page before downloading the R installer.

Although it is potentially confusingly named, XQuartz is an implementation of the X11 windowing system for Mac OS X.

Pandoc is a flexible program for converting documents from one format to another, while LaTeX is technical typesetting software; this book, for example, is typeset with LaTeX. Both Pandoc and LaTeX are open-source software.

The Install auxiliary software menu item appears in the Tools menu only if Pandoc or LaTeX is missing.
