Tree Harvests in RStudio using the mosaic package

This is an RStudio Markdown document to accompany Eric Marland's presentation on the Tree Harvest data at the MAA Prep Workshop July 2012.

We start with some setup

require(mosaic)
trellis.par.set(theme = col.mosaic())
trees = fetchData("TreeHarvest.csv")
## Retrieving data from http://www.mosaic-web.org/go/datasets/TreeHarvest.csv
names(trees)
## [1] "time"    "rate"    "species"

We can calculate some summary statistics by species.

mean(rate ~ species, data = trees)
##  black walnut loblolly pine mountain pond 
##         0.965         1.030         1.829 
favstats(rate ~ species, data = trees)
##                min     Q1 median    Q3  max  mean     sd  n missing
## black walnut  0.40 0.7475  1.000 1.212 1.35 0.965 0.3071 16       0
## loblolly pine 0.10 0.2700  0.510 1.562 4.30 1.030 1.1351 16       0
## mountain pond 0.65 1.1750  1.875 2.475 2.97 1.829 0.7720 16       0

Next we plot the relationship between rate and time, separately for each species.

plotPoints(rate ~ time | species, data = trees, layout = c(1, 3))

plot of chunk unnamed-chunk-4

Let's focus on the mountain ponderosa pine species.

plotPoints(rate ~ time, data = subset(trees, species == "mountain pond"))

plot of chunk unnamed-chunk-5

We can fit two models (one spline, one smoother) to these data, and compare the fit to the observed values. We also add the average value (in purple).

mp.spline = spliner(rate ~ time, data = subset(trees, species == 
    "mountain pond"))
mp.smooth = smoother(rate ~ time, data = subset(trees, species == 
    "mountain pond"))
plotFun(mp.spline(time) ~ time, time.lim = c(5, 155))
plotFun(mp.smooth(time) ~ time, add = TRUE, col = "red")
plotPoints(rate ~ time, add = TRUE, data = subset(trees, species == 
    "mountain pond"))
# calculate average value
F <- antiD(mp.smooth(x) ~ x)
average <- makeFun(1/x * F(x) ~ x)
plotFun(average(x) ~ x, add = TRUE, xlim = c(0, 155), col = "purple")

plot of chunk unnamed-chunk-6