Walter Zucchini, Oleg Nenadić

Contents

1 Getting started
1.1 Downloading and Installing R
1.2 Data Preparation and Import in R
1.3 Basic R–commands: Data Manipulation and Visualization

2 Simple Component Analysis
2.1 Linear Filtering of Time Series
2.2 Decomposition of Time Series
2.3 Regression analysis

3 Exponential Smoothing
3.1 Introductory Remarks
3.2 Exponential Smoothing and Prediction of Time Series

4 ARIMA–Models
4.1 Introductory Remarks
4.2 Analysis of Autocorrelations and Partial Autocorrelations
4.3 Parameter–Estimation of ARIMA–Models
4.4 Diagnostic Checking
4.5 Prediction of ARIMA–Models

A Function Reference

Chapter 1

Getting started

1.1 Downloading and Installing R

R is a widely used environment for statistical analysis. The striking difference between R and most other statistical software is that it is free software and that it is maintained by scientists for scientists. Since its introduction in 1996, the R–project has gained many users and contributors, who continuously extend the capabilities of R by releasing add–ons (packages) that offer previously unavailable functions and methods or improve the existing ones.

One disadvantage or advantage, depending on the point of view, is that R is used within a command–line interface, which imposes a slightly steeper learning curve than other software. But once this hurdle has been cleared, R offers almost unlimited possibilities for statistical data analysis.

R is distributed by the “Comprehensive R Archive Network” (CRAN) and is available from the URL http://cran.r-project.org. The current version of R (1.7.0, approx. 20 MB) for Windows can be downloaded by selecting “R binaries” → “windows” → “base” and downloading the file “rw1070.exe” from the CRAN–website. R can then be installed by executing the downloaded file. The installation procedure is straightforward; one usually only has to specify the target directory in which to install R. After the installation, R can be started like any other application for Windows, that is, by double–clicking on the corresponding icon.

1.2 Data Preparation and Import in R

Importing data into R can be carried out in various ways: to name a few, R offers means for importing ASCII and binary data, data from other applications, or even for database–connections. Since a common denominator for “data analysis” (cough) seems to be spreadsheet applications such as Microsoft Excel©, the remainder of this section will focus on importing data from Excel to R.
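As a brief aside on the ASCII and binary routes mentioned above, the following is a minimal sketch of one round trip of each kind. The data frame toy, its column names and its values are invented for illustration, and save( )/load( ) is only one of several ways in which R handles binary data.

```r
# Small made-up data frame used only to demonstrate the round trips.
toy <- data.frame(id = 1:3, value = c(1.5, 2.25, 3))

# ASCII: write a whitespace-separated text file and read it back.
txt_path <- tempfile(fileext = ".txt")
write.table(toy, txt_path, row.names = FALSE)
from_ascii <- read.table(txt_path, header = TRUE)

# Binary: save the object in R's own binary format and load it again.
rda_path <- tempfile(fileext = ".RData")
save(toy, file = rda_path)
rm(toy)
load(rda_path)  # restores the object under its original name, toy

print(from_ascii)
print(toy)
```

Temporary files are used so that the sketch does not depend on any particular directory layout.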

Let’s assume we have the following dataset tui as a spreadsheet in Excel. (The dataset can be downloaded from http://134.76.173.220/tui.zip as a zipped Excel–file.) The spreadsheet contains stock data for TUI AG from January 3, 2000 to May 14, 2002, namely the date (1st column), opening values (2nd column), highest and lowest values (3rd and 4th column), closing values (5th column) and trading volumes (6th column).

A convenient way of preparing the data is to clean up the table within Excel, so that only the data itself and one row containing the column–names remain.

Once the data has this form, it can be exported as a CSV (“comma separated values”) file, e.g. to C:/tui.csv.
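Before importing, it can help to look at the raw text of the exported file, since Excel's choice of field separator and decimal mark depends on the regional settings. The sketch below writes a tiny stand-in file and inspects it. The column names and values are made up and are not taken from the tui dataset, and the semicolon separator and decimal comma are assumptions about how such an export might look.

```r
# Toy stand-in for an exported file; the column names and values are
# made up for illustration and are NOT taken from the tui dataset.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("date;open;close",
             "03.01.2000;10,50;10,75",
             "04.01.2000;10,75;10,20"),
           csv_path)

# Look at the raw text: the first line should hold the column names.
first_lines <- readLines(csv_path, n = 2)
print(first_lines)

# Count the fields per line for a candidate separator; a constant count
# greater than one suggests that the separator guess is right.
n_fields <- count.fields(csv_path, sep = ";")
print(n_fields)
```

If the count were 1 on every line, the file would most likely use a different separator, and the guess passed to sep would need to be changed.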

After conversion to a CSV–file, the data can be loaded into R. In our case, the tui–dataset is imported by typing

tui <- read.csv("C:/tui.csv", header=T, dec=",", sep=";")

into the R–console. The right–hand side,