This is an RStudio Markdown document to accompany Eric Marland's presentation on the Tree Harvest data at the MAA Prep Workshop July 2012.
We start with some setup
require(mosaic)
trellis.par.set(theme = col.mosaic())
trees = fetchData("TreeHarvest.csv")
## Retrieving data from http://www.mosaic-web.org/go/datasets/TreeHarvest.csv
names(trees)
## [1] "time" "rate" "species"
We can calculate some summary statistics by species.
mean(rate ~ species, data = trees)
## black walnut loblolly pine mountain pond
## 0.965 1.030 1.829
favstats(rate ~ species, data = trees)
## min Q1 median Q3 max mean sd n missing
## black walnut 0.40 0.7475 1.000 1.212 1.35 0.965 0.3071 16 0
## loblolly pine 0.10 0.2700 0.510 1.562 4.30 1.030 1.1351 16 0
## mountain pond 0.65 1.1750 1.875 2.475 2.97 1.829 0.7720 16 0
Next we plot the relationship between rate and time, separately for each species.
plotPoints(rate ~ time | species, data = trees, layout = c(1, 3))
Let's focus on the mountain ponderosa pine species.
plotPoints(rate ~ time, data = subset(trees, species == "mountain pond"))
We can fit two models (one spline, one smoother) to these data, and compare the fit to the observed values. We also add the average value (in purple).
mp.spline = spliner(rate ~ time, data = subset(trees, species ==
"mountain pond"))
mp.smooth = smoother(rate ~ time, data = subset(trees, species ==
"mountain pond"))
plotFun(mp.spline(time) ~ time, time.lim = c(5, 155))
plotFun(mp.smooth(time) ~ time, add = TRUE, col = "red")
plotPoints(rate ~ time, add = TRUE, data = subset(trees, species ==
"mountain pond"))
# calculate average value
F <- antiD(mp.smooth(x) ~ x)
average <- makeFun(1/x * F(x) ~ x)
plotFun(average(x) ~ x, add = TRUE, xlim = c(0, 155), col = "purple")