This is an RStudio Markdown document to accompany Eric Marland's presentation on the Tree Harvest data at the MAA Prep Workshop July 2012.

We start with some setup

```
require(mosaic)
trellis.par.set(theme = col.mosaic())
```

```
trees = fetchData("TreeHarvest.csv")
```

```
## Retrieving data from http://www.mosaic-web.org/go/datasets/TreeHarvest.csv
```

```
names(trees)
```

```
## [1] "time" "rate" "species"
```

We can calculate some summary statistics by species.

```
mean(rate ~ species, data = trees)
```

```
## black walnut loblolly pine mountain pond
## 0.965 1.030 1.829
```

```
favstats(rate ~ species, data = trees)
```

```
## min Q1 median Q3 max mean sd n missing
## black walnut 0.40 0.7475 1.000 1.212 1.35 0.965 0.3071 16 0
## loblolly pine 0.10 0.2700 0.510 1.562 4.30 1.030 1.1351 16 0
## mountain pond 0.65 1.1750 1.875 2.475 2.97 1.829 0.7720 16 0
```

Next we plot the relationship between rate and time, separately for each species.

```
plotPoints(rate ~ time | species, data = trees, layout = c(1, 3))
```

Let's focus on the mountain ponderosa pine species.

```
plotPoints(rate ~ time, data = subset(trees, species == "mountain pond"))
```

We can fit two models (one spline, one smoother) to these data, and compare the fit to the observed values. We also add the average value (in purple).

```
mp.spline = spliner(rate ~ time, data = subset(trees, species ==
"mountain pond"))
mp.smooth = smoother(rate ~ time, data = subset(trees, species ==
"mountain pond"))
plotFun(mp.spline(time) ~ time, time.lim = c(5, 155))
plotFun(mp.smooth(time) ~ time, add = TRUE, col = "red")
plotPoints(rate ~ time, add = TRUE, data = subset(trees, species ==
"mountain pond"))
# calculate average value
F <- antiD(mp.smooth(x) ~ x)
average <- makeFun(1/x * F(x) ~ x)
plotFun(average(x) ~ x, add = TRUE, xlim = c(0, 155), col = "purple")
```